Type: | Package |
Title: | Interpretable Neural Network Based on Generalized Additive Models |
Version: | 2.0.0 |
Maintainer: | Ines Ortega-Fernandez <iortega@gradiant.org> |
Description: | Neural Additive Model framework based on Generalized Additive Models from Hastie & Tibshirani (1990, ISBN:9780412343902), which trains a separate neural network to estimate the contribution of each feature to the response variable. The networks are trained independently, leveraging the local scoring and backfitting algorithms to ensure that the Generalized Additive Model converges and remains additive. The resulting neural network is a highly accurate and interpretable deep learning model, suitable for high-risk AI applications where decision-making should be based on accountable and interpretable algorithms. |
License: | MPL-2.0 |
BugReports: | https://github.com/inesortega/neuralGAM/issues |
Encoding: | UTF-8 |
Imports: | tensorflow, keras, ggplot2, magrittr, reticulate, formula.tools, matrixStats, patchwork, rlang |
SystemRequirements: | python (>= 3.10), keras (== 2.15), tensorflow (== 2.15) |
RoxygenNote: | 7.3.2 |
Suggests: | covr, testthat (>= 3.0.0), fs, withr |
Config/testthat/edition: | 3 |
URL: | https://inesortega.github.io/neuralGAM/, https://github.com/inesortega/neuralGAM |
NeedsCompilation: | no |
Packaged: | 2025-10-08 15:30:20 UTC; iortega |
Author: | Ines Ortega-Fernandez [aut, cre, cph], Marta Sestelo [aut, cph] |
Repository: | CRAN |
Date/Publication: | 2025-10-08 15:50:02 UTC |
neuralGAM: Interpretable Neural Network Based on Generalized Additive Models
Description
Neural Additive Model framework based on Generalized Additive Models from Hastie & Tibshirani (1990, ISBN:9780412343902), which trains a separate neural network to estimate the contribution of each feature to the response variable. The networks are trained independently, leveraging the local scoring and backfitting algorithms to ensure that the Generalized Additive Model converges and remains additive. The resulting neural network is a highly accurate and interpretable deep learning model, suitable for high-risk AI applications where decision-making should be based on accountable and interpretable algorithms.
Author(s)
Maintainer: Ines Ortega-Fernandez iortega@gradiant.org (ORCID) [copyright holder]
Authors:
Marta Sestelo sestelo@uvigo.es (ORCID) [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/inesortega/neuralGAM/issues
Internal helper: combine epistemic and aleatoric uncertainties via mixture sampling
Description
Combine uncertainty estimates from multiple MC Dropout passes where each pass produces quantile bounds and a mean. For each observation, samples are drawn from Normal approximations of aleatoric noise across passes, yielding a predictive mixture distribution.
Usage
.combine_uncertainties_sampling(
lwr_mat,
upr_mat,
mean_mat,
alpha = 0.05,
inner_samples = 50,
centerline = NULL
)
Arguments
lwr_mat |
Matrix of shape [passes, n_obs] with per-pass lower quantile bounds. |
upr_mat |
Matrix of shape [passes, n_obs] with per-pass upper quantile bounds. |
mean_mat |
Matrix of shape [passes, n_obs] with per-pass mean predictions. |
alpha |
Coverage level (default 0.05). |
inner_samples |
Number of Normal samples per pass/observation. |
centerline |
Optional vector of deterministic mean predictions (overrides pass-mean). |
Value
A data.frame with columns:
- lwr, upr: lower/upper predictive interval.
- var_epistemic: epistemic variance (across passes).
- var_aleatoric: average aleatoric variance.
- var_total: sum of epistemic and aleatoric variances.
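A minimal sketch of the mixture-sampling idea for a single observation (a hypothetical standalone re-implementation for illustration, not the package internals; the helper name combine_one_obs is made up):

# Sketch: combine per-pass quantile bounds and means for one observation.
combine_one_obs <- function(lwr, upr, mu, alpha = 0.05, inner_samples = 50) {
  z <- qnorm(1 - alpha / 2)
  sigma <- (upr - lwr) / (2 * z)   # per-pass aleatoric SD from quantile width
  # Draw inner_samples Normal samples per pass -> predictive mixture
  draws <- unlist(mapply(function(m, s) rnorm(inner_samples, m, s),
                         mu, sigma, SIMPLIFY = FALSE))
  data.frame(
    lwr = unname(quantile(draws, alpha / 2)),
    upr = unname(quantile(draws, 1 - alpha / 2)),
    var_epistemic = var(mu),          # spread of the pass means
    var_aleatoric = mean(sigma^2),    # average per-pass noise variance
    var_total = var(mu) + mean(sigma^2)
  )
}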
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: combine epistemic and aleatoric via variance decomposition
Description
Classical combination of uncertainties without sampling. Assumes the same input shapes as .combine_uncertainties_sampling: each argument is a matrix of shape [passes, n_obs], where rows index MC-Dropout passes and columns index observations.
For each observation (column):
- Epistemic variance = variance across passes of the mean head.
- Aleatoric variance = average (across passes) of the per-pass variance estimated from the quantile width via a Normal approximation.
- Total variance = epistemic + aleatoric.
- Predictive interval = Normal-theory interval around the chosen centerline.
Usage
.combine_uncertainties_variance(
lwr_mat,
upr_mat,
mean_mat,
alpha = 0.05,
centerline = NULL
)
Arguments
lwr_mat |
Matrix [passes, n_obs] of lower quantile bounds. |
upr_mat |
Matrix [passes, n_obs] of upper quantile bounds. |
mean_mat |
Matrix [passes, n_obs] of mean predictions. |
alpha |
Coverage level (default 0.05). |
centerline |
Optional numeric vector (length n_obs) of deterministic mean predictions to use as the PI center. If NULL, uses the across-pass mean. |
Value
A data.frame with columns:
- lwr, upr: lower/upper predictive interval (Normal-theory).
- var_epistemic: variance across passes of mean predictions.
- var_aleatoric: average per-pass aleatoric variance (from quantile width).
- var_total: sum of epistemic and aleatoric variances.
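The decomposition itself is a few lines of matrix arithmetic. A hedged sketch under the shapes stated above (standalone illustration, not the package source):

# Sketch: variance-based combination over [passes, n_obs] matrices.
combine_variance <- function(lwr_mat, upr_mat, mean_mat, alpha = 0.05,
                             centerline = NULL) {
  z <- qnorm(1 - alpha / 2)
  sigma2 <- ((upr_mat - lwr_mat) / (2 * z))^2  # per-pass aleatoric variance
  var_epistemic <- apply(mean_mat, 2, var)     # across-pass variance of the mean head
  var_aleatoric <- colMeans(sigma2)            # average aleatoric variance
  var_total <- var_epistemic + var_aleatoric
  center <- if (is.null(centerline)) colMeans(mean_mat) else centerline
  data.frame(lwr = center - z * sqrt(var_total),
             upr = center + z * sqrt(var_total),
             var_epistemic = var_epistemic,
             var_aleatoric = var_aleatoric,
             var_total = var_total)
}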
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: compute uncertainty decomposition (epistemic / aleatoric / both)
Description
Given a fitted Keras submodel and covariate input x, compute uncertainty estimates according to uncertainty_method:
- "epistemic": estimates only epistemic variance (via MC Dropout passes).
- "aleatoric": uses deterministic quantile heads to estimate aleatoric variance.
- "both": combines aleatoric and epistemic uncertainty using variance decomposition.
- Otherwise: returns NA placeholders.
Usage
.compute_uncertainty(model, x, uncertainty_method, alpha, forward_passes)
Arguments
model |
Fitted Keras model for a single smooth term. |
x |
Input covariate matrix (or vector; will be reshaped as needed). |
uncertainty_method |
Character; one of "epistemic", "aleatoric", or "both". |
alpha |
Coverage level (e.g. 0.05 for 95% bands). |
forward_passes |
Integer; number of MC Dropout passes. |
Value
A data.frame with columns:
- lwr, upr: lower/upper bounds of interval estimates.
- var_epistemic: epistemic variance.
- var_aleatoric: aleatoric variance.
- var_total: total variance (epistemic + aleatoric).
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: joint predictive interval (both) via variance combiner
Description
Internal helper: joint predictive interval (both) via variance combiner
Usage
.joint_pi_both_variance(
ngam,
x,
level = 0.95,
forward_passes = 50,
verbose = 0
)
Internal helper: joint epistemic SE on link scale
Description
Computes joint epistemic standard errors on the link scale by aggregating across all smooth terms via MC Dropout, capturing cross-term covariance. Parametric model uncertainty (from the linear submodel) is added assuming independence from NN-based epistemic uncertainty.
Usage
.joint_se_eta_mcdropout(ngam, x, forward_passes = 300, verbose = 0)
Arguments
ngam |
Fitted neuralGAM object. |
x |
New data frame of covariates. |
forward_passes |
Number of MC Dropout passes (default 300). |
verbose |
Verbosity (0/1). |
Details
Steps:
1. Parametric part: mean + variance from the linear model.
2. Nonparametric part: pass-level sums across all smooths.
3. Joint across-pass variance captures covariance between smooths.
4. Combined with parametric variance (assumed independent).
Value
A numeric vector of length nrow(x) giving epistemic SEs on the link scale.
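The aggregation step can be sketched in a few lines (hedged, standalone R; pass_preds and var_param are illustrative inputs rather than the package's actual interfaces):

# Sketch: joint epistemic SE on the link scale from MC Dropout passes.
# pass_preds: list (one per smooth term) of [passes, n_obs] prediction matrices.
joint_se_eta <- function(pass_preds, var_param = 0) {
  eta_np <- Reduce(`+`, pass_preds)  # pass-level sums across all smooths
  var_np <- apply(eta_np, 2, var)    # joint across-pass variance (keeps cross-term covariance)
  sqrt(var_np + var_param)           # add parametric variance (assumed independent)
}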
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: MC Dropout forward sampling
Description
Runs the requested number of stochastic forward passes (passes) with Dropout active at prediction time. Each pass samples a new dropout mask and produces predictions, simulating epistemic uncertainty.
Usage
.mc_dropout_forward(model, x, passes, output_dim)
Arguments
model |
Fitted Keras model for one smooth term. |
x |
Input matrix (converted to TensorFlow tensor internally). |
passes |
Number of stochastic passes (>=2). |
output_dim |
Expected number of outputs per observation (e.g., 1 = mean only, 3 = quantile heads (lwr, upr, mean)). |
Value
A numeric array of shape [passes, n_obs, output_dim].
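With the R keras API, keeping dropout stochastic at prediction time amounts to calling the model with training = TRUE. A hedged sketch of the loop (assuming a fitted Keras model and a numeric matrix x; mc_dropout_sketch is illustrative):

# Sketch: MC Dropout forward sampling.
mc_dropout_sketch <- function(model, x, passes = 50, output_dim = 1) {
  out <- array(NA_real_, dim = c(passes, nrow(x), output_dim))
  xt <- tensorflow::tf$constant(as.matrix(x), dtype = "float32")
  for (p in seq_len(passes)) {
    # training = TRUE keeps Dropout layers active, sampling a fresh mask
    out[p, , ] <- as.array(model(xt, training = TRUE))
  }
  out
}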
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Autoplot method for neuralGAM objects (epistemic-only)
Description
Produce effect/diagnostic plots from a fitted neuralGAM model. Supported panels:
- which = "response": fitted response vs. index, with optional epistemic confidence intervals (CI).
- which = "link": linear predictor (link scale) vs. index, with optional CI.
- which = "terms": single per-term contribution g_j(x_j) on the link scale, with an optional epistemic CI band for the smooth.
Usage
## S3 method for class 'neuralGAM'
autoplot(
object,
newdata = NULL,
which = c("response", "link", "terms"),
interval = c("none", "confidence"),
level = 0.95,
forward_passes = 150,
term = NULL,
rug = TRUE,
...
)
Arguments
object |
A fitted neuralGAM object. |
newdata |
Optional data.frame of covariates; if omitted, the training data are used. |
which |
One of "response", "link", "terms". |
interval |
One of "none", "confidence". |
level |
Coverage level for confidence intervals (e.g., 0.95). |
forward_passes |
Integer. Number of MC-dropout forward passes used when interval = "confidence". |
term |
Single term name to plot when which = "terms". |
rug |
Logical; if TRUE, adds a rug of observed covariate values. |
... |
Additional arguments passed to the underlying ggplot2 layers. |
Details
Uncertainty semantics (epistemic only)
- CI: uncertainty about the fitted mean.
- For the response, SEs are mapped via the delta method.
- For terms, bands are obtained as \hat g_j \pm z \cdot SE(\hat g_j) on the link scale.
Value
A single ggplot object.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train, family = "gaussian", num_units = 128,
uncertainty_method = "epistemic", forward_passes = 10
)
## --- Autoplot (epistemic-only) ---
# Per-term effect with CI band
autoplot(ngam, which = "terms", term = "x1", interval = "confidence") +
ggplot2::xlab("x1") + ggplot2::ylab("Partial effect")
# Request a different number of forward passes or CI level:
autoplot(ngam, which = "terms", term = "x1", interval = "confidence",
forward_passes = 15, level = 0.7)
# Response panel
autoplot(ngam, which = "response")
# Link panel with custom title
autoplot(ngam, which = "link") +
ggplot2::ggtitle("Main Title")
## End(Not run)
Build and compile a neural network feature model
Description
Builds and compiles a keras neural network for a single smooth term in a neuralGAM model.
The network can optionally be configured to output symmetric prediction intervals (lower bound, upper bound, and mean prediction) using a custom quantile loss (make_quantile_loss()), or a standard single-output point prediction using any user-specified loss function.
When uncertainty_method is "aleatoric" or "both", the model outputs three units corresponding to the lower bound, upper bound, and mean prediction, and is compiled with make_quantile_loss(alpha, mean_loss, ...). In any other case, the model outputs a single unit (point prediction) and uses the loss function provided in loss.
Usage
build_feature_NN(
num_units,
learning_rate = 0.001,
activation = "relu",
kernel_initializer = "glorot_normal",
kernel_regularizer = NULL,
bias_regularizer = NULL,
bias_initializer = "zeros",
activity_regularizer = NULL,
loss = "mse",
name = NULL,
alpha = 0.05,
w_mean = 0.1,
order_penalty_lambda = 0,
uncertainty_method = "none",
dropout_rate = 0.1,
seed = NULL,
...
)
Arguments
num_units |
Integer or vector of integers. Number of units in the hidden layer(s). If a vector is provided, multiple dense layers are added sequentially. |
learning_rate |
Numeric. Learning rate for the Adam optimizer. |
activation |
Character string or function. Activation function to use in hidden layers. If character, it must be a valid tf.keras activation identifier (e.g., "relu", "gelu"). |
kernel_initializer |
Keras initializer object or string. Kernel initializer for dense layers. |
kernel_regularizer |
Optional Keras regularizer for kernel weights. |
bias_regularizer |
Optional Keras regularizer for bias terms. |
bias_initializer |
Keras initializer object or string. Initializer for bias terms. |
activity_regularizer |
Optional Keras regularizer for layer activations. |
loss |
Loss function to use. Can be any Keras built-in (e.g., "mse") or a custom loss function. |
name |
Optional character string. Name assigned to the model. |
alpha |
Numeric. Desired significance level for symmetric prediction intervals. Defaults to 0.05 (i.e., 95% PI using quantiles alpha/2 and 1-alpha/2). |
w_mean |
Non-negative numeric. Weight for the mean-head loss within the composite PI loss. |
order_penalty_lambda |
Non-negative numeric. Strength of a soft monotonicity (non-crossing) penalty applied to the quantile heads. |
uncertainty_method |
Character string indicating the type of uncertainty to estimate in prediction intervals. Must be one of "none", "aleatoric", "epistemic", or "both". |
dropout_rate |
Numeric in (0,1). Dropout rate used when uncertainty_method requires MC Dropout. |
seed |
Random seed. |
... |
Additional arguments passed on to make_quantile_loss() when prediction intervals are enabled. |
Details
Prediction interval mode (uncertainty_method %in% c("aleatoric", "both")):
- Output layer has 3 units:
  - lwr: lower bound, \tau = \alpha/2
  - upr: upper bound, \tau = 1 - \alpha/2
  - y_hat: mean prediction
- Loss function is make_quantile_loss(), which combines two pinball losses (for the lower and upper quantiles) with the chosen mean prediction loss and an optional non-crossing penalty.
Point prediction mode (uncertainty_method %in% c("none", "epistemic")):
- Output layer has 1 unit: point prediction only.
- Loss function is the one passed in loss.
Value
A compiled keras_model object ready for training.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
References
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.
Koenker, R., & Bassett Jr, G. (1978). Regression quantiles. Econometrica, 46(1), 33-50.
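The composite PI loss combines two pinball losses with a weighted mean-head loss and an optional non-crossing penalty. A hedged sketch of those quantities in plain R (illustrative only; the package's make_quantile_loss() operates on Keras tensors):

# Sketch: composite prediction-interval loss for one batch, in plain R.
pinball <- function(y, q, tau) mean(pmax(tau * (y - q), (tau - 1) * (y - q)))
pi_loss <- function(y, lwr, upr, y_hat, alpha = 0.05, w_mean = 0.1, lambda = 0) {
  pinball(y, lwr, alpha / 2) +         # lower quantile head (tau = alpha/2)
    pinball(y, upr, 1 - alpha / 2) +   # upper quantile head (tau = 1 - alpha/2)
    w_mean * mean((y - y_hat)^2) +     # mean head (MSE here)
    lambda * mean(pmax(lwr - upr, 0))  # soft non-crossing penalty
}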
Deviance of the model
Description
Computes the deviance of the model according to the distribution
family specified in the "family"
parameter.
Usage
dev(muhat, y, family, w = NULL)
Arguments
muhat |
current estimate of the response variable |
y |
response variable |
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
w |
weight assigned to each observation. Defaults to 1. |
Value
the deviance of the model
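For the supported families the deviance follows the usual GLM definitions, consistent with the deviance residuals listed under diagnose(). A hedged standalone sketch (weights defaulting to 1; not the package source):

# Sketch: GLM deviance for the supported families.
deviance_sketch <- function(muhat, y, family, w = 1) {
  switch(family,
    gaussian = sum(w * (y - muhat)^2),
    binomial = {
      # Convention: y * log(y/mu) = 0 when y = 0 (likewise for the 1 - y term)
      t1 <- ifelse(y > 0, y * log(y / muhat), 0)
      t2 <- ifelse(y < 1, (1 - y) * log((1 - y) / (1 - muhat)), 0)
      2 * sum(w * (t1 + t2))
    },
    poisson = {
      t1 <- ifelse(y > 0, y * log(y / muhat), 0)
      2 * sum(w * (t1 - (y - muhat)))
    })
}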
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Diagnostic plots to evaluate a fitted neuralGAM model
Description
Produce a 2x2 diagnostic panel for a fitted neuralGAM
model, mirroring
the layout of gratia's appraise()
for mgcv GAMs:
(top-left) a QQ plot of residuals with optional simulation envelope,
(top-right) a histogram of residuals,
(bottom-left) residuals vs linear predictor \eta
, and
(bottom-right) observed vs fitted values on the response scale.
Usage
diagnose(
object,
data = NULL,
response = NULL,
qq_method = c("uniform", "simulate", "normal"),
n_uniform = 1000,
n_simulate = 200,
residual_type = c("deviance", "pearson", "quantile"),
level = 0.95,
point_col = "steelblue",
point_alpha = 0.5,
hist_bins = 30
)
Arguments
object |
A fitted neuralGAM object. |
data |
Optional data.frame used to compute predictions and residuals; if omitted, the training data are used. |
response |
Character scalar giving the response variable name in data. |
qq_method |
Character; one of "uniform", "simulate", "normal". |
n_uniform |
Integer; number of U(0,1) draws for qq_method = "uniform". |
n_simulate |
Integer; number of simulated datasets for qq_method = "simulate". |
residual_type |
One of "deviance", "pearson", "quantile". |
level |
Numeric in (0,1); coverage level for the QQ bands when qq_method = "simulate". |
point_col |
Character; colour for points in scatter/histogram panels. |
point_alpha |
Numeric in (0,1); point transparency. |
hist_bins |
Integer; number of bins in the histogram. |
Details
The function uses predict.neuralGAM() to obtain the linear predictor (type = "link") and the fitted mean on the response scale (type = "response"). Residuals are computed internally for supported families; by default deviance residuals are used:
- Gaussian: r_i = y_i - \hat{\mu}_i.
- Binomial: r_i = \mathrm{sign}(y_i-\hat{\mu}_i)\, \sqrt{2 w_i \{ y_i \log(y_i/\hat{\mu}_i) + (1-y_i)\log[(1-y_i)/(1-\hat{\mu}_i)] \}}, with optional per-observation weights w_i (e.g., trials for proportions).
- Poisson: r_i = \mathrm{sign}(y_i-\hat{\mu}_i)\, \sqrt{2 w_i \{ y_i \log(y_i/\hat{\mu}_i) - (y_i-\hat{\mu}_i) \}}, adopting the convention y_i \log(y_i/\hat{\mu}_i)=0 when y_i=0.
For Gaussian models, these plots diagnose symmetry, tail behaviour, and mean/variance misfit similar to standard GLM/GAM diagnostics. For non-Gaussian families (Binomial, Poisson), interpret shapes on the deviance scale, which is approximately normal under a well-specified model. For discrete data, randomized quantile (Dunn-Smyth) residuals are also available and often yield smoother QQ behaviour.
QQ reference methods. qq_method controls how theoretical quantiles are generated (as in gratia):
- "uniform" (default): draw U(0,1) and map through the inverse CDF of the fitted response distribution at each observation; convert to residuals and average the sorted curves over n_uniform draws. Fast, and respects the mean-variance relationship.
- "simulate": simulate n_simulate datasets from the fitted model at the observed covariates, compute residuals, and average the sorted curves; also provides pointwise level bands on the QQ plot.
- "normal": use standard normal quantiles; a fallback when a suitable RNG or inverse CDF is unavailable.
For Poisson models, include offsets for exposure in the linear predictor (e.g., log(E)). The QQ methods use \hat{\mu}_i with qpois/rpois for "uniform"/"simulate", respectively.
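To illustrate the randomized quantile (Dunn-Smyth) residuals recommended above for discrete data, a hedged standalone sketch for a Poisson fit (not the package internals):

# Sketch: randomized quantile (Dunn-Smyth) residuals for Poisson data.
rq_residuals_poisson <- function(y, mu) {
  # The Poisson CDF is a step function: randomize uniformly between F(y-1) and F(y)
  u <- runif(length(y), ppois(y - 1, mu), ppois(y, mu))
  qnorm(u)  # approximately N(0,1) under a well-specified model
}
set.seed(1)
mu <- exp(rnorm(200)); y <- rpois(200, mu)
r <- rq_residuals_poisson(y, mu)
qqnorm(r); qqline(r)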
Value
A patchwork object combining four ggplot2 plots. You can print it, add titles/themes, or extract individual panels if needed.
Dependencies
Requires ggplot2 and patchwork.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
References
Augustin, N.H., Sauleau, E.A., Wood, S.N. (2012). On quantile-quantile plots for generalized linear models. Computational Statistics & Data Analysis, 56, 2404-2409. https://doi.org/10.1016/j.csda.2012.01.026
Dunn, P.K., Smyth, G.K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5(3), 236-244.
Derivative of the link function
Description
Computes the derivative of the link function according to
the distribution family specified in the "family"
parameter.
Usage
diriv(family, muhat)
Arguments
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
muhat |
fitted values |
Value
derivative of the link function for the fitted values
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Extract structured elements from a model formula
Description
Parses a GAM-style formula and separates the response, all terms, smooth (non-parametric) terms declared via s(...), and parametric terms. In addition, it extracts per-term neural network specifications that can be written inline inside each s(...) call (e.g., num_units, activation, kernel_initializer, bias_initializer, kernel_regularizer, bias_regularizer, activity_regularizer).
This function uses an abstract syntax tree (AST) walker (no regex) to read the arguments of each s(var, ...). Arguments are evaluated in the caller's environment. For *_regularizer and *_initializer arguments, proper Keras objects are required (e.g., keras::regularizer_l2(1e-4), keras::initializer_glorot_uniform()).
Usage
get_formula_elements(formula)
Arguments
formula |
A model formula. Smooth terms must be written as s(...). |
Details
Inline per-term configuration in s(...).
You can specify neural network hyperparameters per smooth term, e.g.:
y ~ s(x1, num_units = c(1024, 512), activation = "tanh", kernel_regularizer = keras::regularizer_l2(1e-4)) + x2 + s(x3, num_units = 1024, bias_initializer = keras::initializer_zeros())
Values are evaluated in the caller’s environment. For regularizers and initializers you must pass actual Keras objects (not character strings).
Supported keys. Only the keys listed under np_architecture in the Value section are recognized. Unknown keys are ignored with a warning.
Typical usage.
The returned np_terms and np_architecture are consumed by model-building code to construct one neural network per smooth term, applying any per-term overrides while falling back to global defaults for unspecified keys.
Value
A list with the following elements:
- y: Character scalar. The response variable name.
- terms: Character vector with all variable names on the RHS (both smooth and parametric).
- np_terms: Character vector with smooth (non-parametric) variable names extracted from s(...).
- p_terms: Character vector with parametric terms (i.e., terms \ np_terms).
- np_formula: A formula containing only the s(...) terms (or NULL if none).
- p_formula: A formula containing only the parametric terms (or NULL if none).
- np_architecture: Named list keyed by smooth term (e.g., "x1", "x3"). Each entry is a list of per-term settings parsed from s(term, ...). Supported keys include: num_units (numeric or numeric vector), activation (character), learning_rate (numeric), kernel_initializer (keras initializer object), bias_initializer (keras initializer object), kernel_regularizer (keras regularizer object), bias_regularizer (keras regularizer object), activity_regularizer (keras regularizer object).
- formula: The original formula object.
Errors and validation
- The first argument of each s(...) must be a symbol naming the variable.
- All additional arguments to s(...) must be named.
- *_regularizer must be a Keras regularizer object.
- *_initializer must be a Keras initializer object.
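A brief usage sketch (the formula and the commented output are illustrative):

fe <- neuralGAM:::get_formula_elements(
  y ~ s(x1, num_units = c(128, 64), activation = "tanh") + x2 + s(x3)
)
fe$np_terms                      # expected: "x1" "x3"
fe$p_terms                       # expected: "x2"
fe$np_architecture$x1$num_units  # expected: 128 64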
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Install neuralGAM python requirements
Description
Creates a conda environment (installing Miniconda if required) and sets up the Python requirements to run neuralGAM (TensorFlow and Keras).
Miniconda and related environments are generated in the user's cache directory given by:
tools::R_user_dir('neuralGAM', 'cache')
Usage
install_neuralGAM()
Inverse of the link functions
Description
Computes the inverse of the link function according to the
distribution family specified in the "family"
parameter.
Usage
inv_link(family, muhat)
Arguments
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
muhat |
fitted values |
Value
the inverse link function specified by the "family" distribution for the given fitted values
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Link function
Description
Applies the link function according to the distribution family
specified in the "family"
parameter.
Usage
link(family, muhat)
Arguments
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
muhat |
fitted values |
Value
the link function specified by the "family" distribution for the given fitted values
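For reference, a hedged sketch of the three links and their inverses implied by the supported families (identity, logit, log; standalone illustration):

# Sketch: link functions and inverses for the supported families.
link_sketch <- function(family, mu) {
  switch(family,
    gaussian = mu,                  # identity
    binomial = log(mu / (1 - mu)),  # logit
    poisson  = log(mu))             # log
}
inv_link_sketch <- function(family, eta) {
  switch(family,
    gaussian = eta,
    binomial = 1 / (1 + exp(-eta)),
    poisson  = exp(eta))
}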
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Derivative of the Inverse Link Function
Description
Computes the derivative of the inverse link function d\mu/d\eta for the distribution families supported by neuralGAM ("gaussian", "binomial", "poisson"). This quantity is required when applying the delta method to obtain standard errors on the response scale in predict().
Usage
mu_eta(family, eta)
Arguments
family |
A character string specifying the distribution family: one of "gaussian", "binomial", or "poisson". |
eta |
Numeric vector of linear predictor values. |
Details
For a neuralGAM with linear predictor \eta and mean response \mu = g^{-1}(\eta), the derivative d\mu/d\eta depends on the family:
- Gaussian (identity link): d\mu/d\eta = 1.
- Binomial (logit link): d\mu/d\eta = \mu (1-\mu).
- Poisson (log link): d\mu/d\eta = \mu.
Internally, values of \eta are clamped to avoid numerical overflow/underflow in exp(), and \mu is constrained away from 0 and 1 for stability.
Value
A numeric vector of the same length as eta, containing the derivative d\mu/d\eta.
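A hedged sketch of the computation, including the kind of clamping described above (the clamp bounds are illustrative assumptions, not the package's exact constants):

# Sketch: derivative of the inverse link, with illustrative numerical guards.
mu_eta_sketch <- function(family, eta) {
  eta <- pmin(pmax(eta, -30), 30)  # clamp to avoid overflow in exp()
  switch(family,
    gaussian = rep(1, length(eta)),
    binomial = {
      mu <- 1 / (1 + exp(-eta))
      mu <- pmin(pmax(mu, 1e-8), 1 - 1e-8)  # keep mu away from 0 and 1
      mu * (1 - mu)
    },
    poisson = exp(eta))
}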
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Fit a neuralGAM model
Description
Fits a Generalized Additive Model where smooth terms are modeled by keras
neural networks.
In addition to point predictions, the model can optionally estimate uncertainty bands via Monte Carlo Dropout across forward passes.
Usage
neuralGAM(
formula,
data,
family = "gaussian",
num_units = 64,
learning_rate = 0.001,
activation = "relu",
kernel_initializer = "glorot_normal",
kernel_regularizer = NULL,
bias_regularizer = NULL,
bias_initializer = "zeros",
activity_regularizer = NULL,
loss = "mse",
uncertainty_method = c("none", "epistemic"),
alpha = 0.05,
forward_passes = 100,
dropout_rate = 0.1,
validation_split = NULL,
w_train = NULL,
bf_threshold = 0.001,
ls_threshold = 0.1,
max_iter_backfitting = 10,
max_iter_ls = 10,
seed = NULL,
verbose = 1,
...
)
Arguments
formula |
Model formula. Smooth terms must be wrapped in s(...). |
data |
Data frame containing the variables. |
family |
Response distribution: "gaussian", "binomial", or "poisson". |
num_units |
Default hidden layer sizes for smooth terms (integer or vector). Mandatory unless every s(...) term supplies its own num_units. |
learning_rate |
Learning rate for the Adam optimizer. |
activation |
Activation function for hidden layers. Either a string understood by tf.keras or an R function. |
kernel_initializer, bias_initializer |
Initializers for weights and biases. |
kernel_regularizer, bias_regularizer, activity_regularizer |
Optional Keras regularizers. |
loss |
Loss function to use. Can be any Keras built-in (e.g., "mse") or a custom loss function. |
uncertainty_method |
Character string indicating the type of uncertainty to estimate. One of "none" or "epistemic". |
alpha |
Significance level for prediction intervals, e.g. 0.05 for 95% intervals. |
forward_passes |
Integer. Number of MC-dropout forward passes used when uncertainty_method = "epistemic". |
dropout_rate |
Dropout probability in smooth-term NNs, in (0,1); used for MC Dropout when uncertainty_method = "epistemic". |
validation_split |
Optional fraction of training data used for validation. |
w_train |
Optional training weights. |
bf_threshold |
Convergence criterion of the backfitting algorithm. Defaults to 0.001. |
ls_threshold |
Convergence criterion of the local scoring algorithm. Defaults to 0.1. |
max_iter_backfitting |
An integer with the maximum number of iterations of the backfitting algorithm. Defaults to 10. |
max_iter_ls |
An integer with the maximum number of iterations of the local scoring algorithm. Defaults to 10. |
seed |
Random seed. |
verbose |
Verbosity: 0 (silent) or 1 (progress messages). Defaults to 1. |
... |
Additional arguments passed on to the per-term network builder (see build_feature_NN()). |
Value
An object of class "neuralGAM", a list with elements including:
- muhat: numeric vector of fitted mean predictions (training data).
- partial: data frame of partial contributions g_j(x_j) per smooth term.
- y: observed response values.
- eta: linear predictor \eta = \eta_0 + \sum_j g_j(x_j).
- lwr, upr: lower/upper confidence interval bounds (response scale).
- x: training covariates (inputs).
- model: list of fitted Keras models, one per smooth term (+ "linear" if present).
- eta0: intercept estimate \eta_0.
- family: model family.
- stats: data frame of training/validation losses per backfitting iteration.
- mse: training mean squared error.
- formula: parsed model formula (via get_formula_elements()).
- history: list of Keras training histories per term.
- globals: global hyperparameter defaults.
- alpha: PI significance level (if trained with uncertainty).
- build_pi: logical; whether the model was trained with uncertainty estimation enabled.
- uncertainty_method: type of predictive uncertainty used ("none", "epistemic").
- var_epistemic: matrix of per-term epistemic variances (if computed).
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
# Per-term architecture and confidence intervals
ngam <- neuralGAM(
y ~ s(x1, num_units = c(128,64), activation = "tanh") +
s(x2, num_units = 256),
data = train,
uncertainty_method = "epistemic",
forward_passes = 10,
alpha = 0.05
)
ngam
## End(Not run)
Visualization of a neuralGAM object with base graphics
Description
Visualization of a fitted neuralGAM model. Plots learned partial effects, either as scatter/line plots for continuous covariates or as boxplots for factor covariates. Confidence and/or prediction intervals can be added if available.
Usage
## S3 method for class 'neuralGAM'
plot(
x,
select = NULL,
xlab = NULL,
ylab = NULL,
interval = c("none", "confidence", "prediction", "both"),
level = 0.95,
...
)
Arguments
x |
A fitted neuralGAM object. |
select |
Character vector of terms to plot. If NULL, all terms are plotted. |
xlab |
Optional custom x-axis label(s). |
ylab |
Optional custom y-axis label(s). |
interval |
One of "none", "confidence", "prediction", "both". |
level |
Coverage level for intervals (e.g. 0.95). |
... |
Additional graphical arguments passed to the base plotting functions. |
Value
Produces plots on the current graphics device.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Plot training loss history for a neuralGAM model
Description
This function visualizes the training and/or validation loss at the end of each backfitting iteration
for each term-specific model in a fitted neuralGAM
object. It is designed to work with the
history
component of a trained neuralGAM
model.
Usage
plot_history(model, select = NULL, metric = c("loss", "val_loss"))
Arguments
model |
A fitted neuralGAM object. |
select |
Optional character vector of term names (e.g. "x1") to plot. If NULL, all terms are plotted. |
metric |
Character vector indicating which loss metric(s) to plot. Options are "loss", "val_loss", or both. |
Value
A ggplot
object showing the loss curves by backfitting iteration, with facets per term.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
set.seed(123)
n <- 200
x1 <- runif(n, -2, 2)
x2 <- runif(n, -2, 2)
y <- 2 + x1^2 + sin(x2) + rnorm(n, 0, 0.1)
df <- data.frame(x1 = x1, x2 = x2, y = y)
model <- neuralGAM::neuralGAM(
y ~ s(x1) + s(x2),
data = df,
num_units = 8,
family = "gaussian",
max_iter_backfitting = 2,
max_iter_ls = 1,
learning_rate = 0.01,
seed = 42,
validation_split = 0.2,
verbose = 0
)
plot_history(model) # Plot all terms
plot_history(model, select = "x1") # Plot just x1
plot_history(model, metric = "val_loss") # Plot only validation loss
## End(Not run)
Produces predictions from a fitted neuralGAM object
Description
Generate predictions from a fitted neuralGAM model. Supported types:
- type = "link" (default): linear predictor on the link scale.
- type = "response": predictions on the response scale.
- type = "terms": per-term contributions to the linear predictor (no intercept).
Uncertainty estimation via MC Dropout (epistemic only):
- If se.fit = TRUE, standard errors (SE) of the fitted mean are returned (mgcv-style, via Monte Carlo Dropout).
- For type = "response", SEs are mapped to the response scale by the delta method: se_\mu = |d\mu/d\eta| \cdot se_\eta.
- interval = "confidence" returns CI bands derived from SEs; prediction intervals are not supported.
- For type = "terms", interval = "confidence" returns per-term CI matrices (and se.fit when requested).
Details
- Epistemic SEs (CIs) are obtained via Monte Carlo Dropout. When type != "terms" and SEs/CIs are requested in the presence of smooth terms, uncertainty is aggregated jointly to capture cross-term covariance in a single MC pass set. Otherwise, per-term variances are used (parametric variances are obtained from stats::predict(..., se.fit = TRUE)).
- For type = "terms", epistemic SEs and CI matrices are returned when requested.
- PIs are not defined on the link scale and are not supported.
Usage
## S3 method for class 'neuralGAM'
predict(
object,
newdata = NULL,
type = c("link", "response", "terms"),
terms = NULL,
se.fit = FALSE,
interval = c("none", "confidence"),
level = 0.95,
forward_passes = 150,
verbose = 1,
...
)
Arguments
object |
A fitted neuralGAM object. |
newdata |
Optional data.frame of covariates at which to predict; if omitted, the training data are used. |
type |
One of "link", "response", "terms". |
terms |
If type = "terms", an optional character vector selecting a subset of terms. |
se.fit |
Logical; if TRUE, standard errors of the fitted mean are returned. |
interval |
One of "none", "confidence". |
level |
Coverage level for confidence intervals (e.g., 0.95). |
forward_passes |
Integer; number of MC-dropout forward passes when computing epistemic uncertainty. |
verbose |
Integer (0/1). Defaults to 1. |
... |
Other options (passed on to internal predictors). |
Value
- type = "terms":
  - interval = "none": matrix of per-term contributions; if se.fit = TRUE, a list with $fit, $se.fit.
  - interval = "confidence": a list with matrices $fit, $se.fit, $lwr, $upr.
- type = "link" or type = "response":
  - interval = "none": vector (or a list with $fit, $se.fit if se.fit = TRUE).
  - interval = "confidence": data.frame with fit, lwr, upr.
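The delta-method mapping used for type = "response" is a one-liner given d\mu/d\eta. A hedged standalone sketch (self-contained; not the package's predictor):

# Sketch: map link-scale SEs to the response scale and form a CI (delta method).
delta_ci <- function(eta, se_eta, family, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  mu <- switch(family,
    gaussian = eta,
    binomial = 1 / (1 + exp(-eta)),
    poisson  = exp(eta))
  dmu_deta <- switch(family,
    gaussian = rep(1, length(eta)),
    binomial = mu * (1 - mu),
    poisson  = mu)
  se_mu <- abs(dmu_deta) * se_eta
  data.frame(fit = mu, lwr = mu - z * se_mu, upr = mu + z * se_mu)
}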
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam0 <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train, family = "gaussian",
num_units = 128, uncertainty_method = "epistemic"
)
link_ci <- predict(ngam0, type = "link", interval = "confidence",
level = 0.95, forward_passes = 10)
resp_ci <- predict(ngam0, type = "response", interval = "confidence",
level = 0.95, forward_passes = 10)
trm_se <- predict(ngam0, type = "terms",
se.fit = TRUE, forward_passes = 10)
## End(Not run)
Short neuralGAM summary
Description
Default print method for a neuralGAM
object.
Usage
## S3 method for class 'neuralGAM'
print(x, ...)
Arguments
x |
A neuralGAM object. |
... |
Additional arguments (currently unused). |
Value
Prints a brief summary of the fitted model including:
- Distribution family: the family used ("gaussian", "binomial", or "poisson").
- Formula: the model formula.
- Intercept value: the fitted intercept (\eta_0).
- Mean Squared Error (MSE): the training MSE of the model.
- Training sample size: the number of observations used to train the model.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train,
num_units = 128,
family = "gaussian",
activation = "relu",
learning_rate = 0.001,
bf_threshold = 0.001,
max_iter_backfitting = 10,
max_iter_ls = 10,
seed = 1234
)
print(ngam)
## End(Not run)
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- ggplot2: autoplot
Simulate Example Data for NeuralGAM
Description
Generate a synthetic dataset for demonstrating and testing
neuralGAM
. The response is constructed from three covariates:
a quadratic effect, a linear effect, and a sinusoidal effect, plus Gaussian noise.
Usage
sim_neuralGAM_data(n = 2000, seed = 42, test_prop = 0.3)
Arguments
n |
Integer. Number of observations to generate. Defaults to 2000. |
seed |
Integer. Random seed for reproducibility. Defaults to 42. |
test_prop |
Numeric in (0,1). Proportion of observations assigned to the test set. Defaults to 0.3. |
Details
The data generating process is:
y = 2 + x1^2 + 2 x2 + \sin(x3) + \varepsilon,
where \varepsilon \sim N(0, 0.25^2). Covariates x1, x2, x3 are drawn independently from U(-2.5, 2.5).
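A hedged sketch of the generator implied by this DGP (standalone; the package function additionally splits the result into train/test):

# Sketch: simulate data following the documented DGP.
sim_sketch <- function(n = 2000, seed = 42) {
  set.seed(seed)
  x1 <- runif(n, -2.5, 2.5)
  x2 <- runif(n, -2.5, 2.5)
  x3 <- runif(n, -2.5, 2.5)
  y <- 2 + x1^2 + 2 * x2 + sin(x3) + rnorm(n, 0, 0.25)
  data.frame(x1, x2, x3, y)
}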
Value
A list with two elements:
- train: data.frame with training data.
- test: data.frame with test data.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Examples
## Not run:
set.seed(123)
dat <- sim_neuralGAM_data(n = 500, test_prop = 0.2)
train <- dat$train
test <- dat$test
## End(Not run)
Summary of a neuralGAM model
Description
Summarizes a fitted neuralGAM
object: family, formula, sample size,
intercept, training MSE, per-term neural net settings, per-term NN layer
configuration, and training history. If a linear component is present, its
coefficients are also reported.
Usage
## S3 method for class 'neuralGAM'
summary(object, ...)
Arguments
object |
A neuralGAM object. |
... |
Additional arguments (currently unused). |
Value
Invisibly returns object. Prints a human-readable summary.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train,
num_units = 128,
family = "gaussian",
activation = "relu",
learning_rate = 0.001,
bf_threshold = 0.001,
max_iter_backfitting = 10,
max_iter_ls = 10,
seed = 1234
)
summary(ngam)
## End(Not run)
Validate/resolve a Keras activation
Description
Validate/resolve a Keras activation
Usage
validate_activation(activation)
Arguments
activation |
character or function. If character, must be a valid tf.keras activation identifier (e.g., "relu", "gelu", "swish", ...). |
Value
a callable activation (Python callable) or the original R function.
Examples
## Not run:
library(neuralGAM)
act <- neuralGAM:::validate_activation("relu") # ok
act <- neuralGAM:::validate_activation(function(x) x) # custom
## End(Not run)
Validate/resolve a Keras loss
Description
Validate/resolve a Keras loss
Usage
validate_loss(loss)
Arguments
loss |
character or function. If character, must be a valid tf.keras loss identifier (e.g., "mse", "mae", "huber", "logcosh", ...). |
Value
a callable loss (Python callable) or the original R function.
Examples
## Not run:
library(neuralGAM)
L <- neuralGAM:::validate_loss("huber") # ok (Huber with default delta)
L <- neuralGAM:::validate_loss(function(y,t) tensorflow::tf$reduce_mean((y-t)^2)) # custom
## End(Not run)
Weights
Description
Computes the weights for the Local Scoring Algorithm.
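In classical local scoring (Hastie & Tibshirani, 1990), the working weights take the textbook form w / (g'(\mu)^2 V(\mu)). A hedged sketch of that standard formula for the supported families (offered as an illustration, not necessarily the package's exact code; see Usage below for the actual signature):

# Sketch: classical local-scoring weights w / (g'(mu)^2 * V(mu)).
weight_sketch <- function(w, muhat, family) {
  switch(family,
    gaussian = w,  # identity link, constant variance
    binomial = {
      mu <- pmin(pmax(muhat, 1e-8), 1 - 1e-8)
      w * mu * (1 - mu)  # g'(mu) = 1/(mu(1-mu)), V(mu) = mu(1-mu)
    },
    poisson = {
      mu <- pmax(muhat, 1e-8)
      w * mu             # g'(mu) = 1/mu, V(mu) = mu
    })
}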
Usage
weight(w, muhat, family)
Arguments
w |
weights |
muhat |
fitted values |
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
Value
computed weights for the Local Scoring algorithm
according to the "family"
distribution
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.