Type: | Package |
Title: | Interpretable Neural Network Based on Generalized Additive Models |
Version: | 2.0.0 |
Maintainer: | Ines Ortega-Fernandez <iortega@gradiant.org> |
Description: | Neural Additive Model framework based on Generalized Additive Models from Hastie & Tibshirani (1990, ISBN:9780412343902), which trains a separate neural network to estimate the contribution of each feature to the response variable. The networks are trained independently, leveraging the local scoring and backfitting algorithms to ensure that the Generalized Additive Model converges and remains additive. The resulting neural network is a highly accurate and interpretable deep learning model, suitable for high-risk AI applications where decision-making should be based on accountable and interpretable algorithms. |
License: | MPL-2.0 |
BugReports: | https://github.com/inesortega/neuralGAM/issues |
Encoding: | UTF-8 |
Imports: | tensorflow, keras, ggplot2, magrittr, reticulate, formula.tools, matrixStats, patchwork, rlang |
SystemRequirements: | python (>= 3.10), keras (== 2.15), tensorflow (== 2.15) |
RoxygenNote: | 7.3.2 |
Suggests: | covr, testthat (>= 3.0.0), fs, withr |
Config/testthat/edition: | 3 |
URL: | https://inesortega.github.io/neuralGAM/, https://github.com/inesortega/neuralGAM |
NeedsCompilation: | no |
Packaged: | 2025-10-08 15:30:20 UTC; iortega |
Author: | Ines Ortega-Fernandez [aut, cre, cph], Marta Sestelo [aut, cph] |
Repository: | CRAN |
Date/Publication: | 2025-10-08 15:50:02 UTC |
neuralGAM: Interpretable Neural Network Based on Generalized Additive Models
Description
Neural Additive Model framework based on Generalized Additive Models from Hastie & Tibshirani (1990, ISBN:9780412343902), which trains a separate neural network to estimate the contribution of each feature to the response variable. The networks are trained independently, leveraging the local scoring and backfitting algorithms to ensure that the Generalized Additive Model converges and remains additive. The resulting neural network is a highly accurate and interpretable deep learning model, suitable for high-risk AI applications where decision-making should be based on accountable and interpretable algorithms.
Author(s)
Maintainer: Ines Ortega-Fernandez iortega@gradiant.org (ORCID) [copyright holder]
Authors:
Marta Sestelo sestelo@uvigo.es (ORCID) [copyright holder]
See Also
Useful links:
Report bugs at https://github.com/inesortega/neuralGAM/issues
Internal helper: combine epistemic and aleatoric uncertainties via mixture sampling
Description
Combine uncertainty estimates from multiple MC Dropout passes where each pass produces quantile bounds and a mean. For each observation, samples are drawn from Normal approximations of aleatoric noise across passes, yielding a predictive mixture distribution.
Usage
.combine_uncertainties_sampling(
lwr_mat,
upr_mat,
mean_mat,
alpha = 0.05,
inner_samples = 50,
centerline = NULL
)
Arguments
lwr_mat |
Matrix of shape [passes, n_obs] with per-pass lower quantile bounds. |
upr_mat |
Matrix of shape [passes, n_obs] with per-pass upper quantile bounds. |
mean_mat |
Matrix of shape [passes, n_obs] with per-pass mean predictions. |
alpha |
Coverage level (default 0.05). |
inner_samples |
Number of Normal samples per pass/observation. |
centerline |
Optional vector of deterministic mean predictions (overrides pass-mean). |
Value
A data.frame with columns:
- lwr, upr: lower/upper predictive interval.
- var_epistemic: epistemic variance (across passes).
- var_aleatoric: average aleatoric variance.
- var_total: sum of epistemic and aleatoric variances.
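A minimal sketch of the mixture-sampling idea for a single observation (a hypothetical standalone re-implementation for illustration, not the package internals; the helper name combine_one_obs is made up):

# Sketch: combine per-pass quantile bounds and means for one observation.
combine_one_obs <- function(lwr, upr, mu, alpha = 0.05, inner_samples = 50) {
  z <- qnorm(1 - alpha / 2)
  sigma <- (upr - lwr) / (2 * z)   # per-pass aleatoric SD from quantile width
  # Draw inner_samples Normal samples per pass -> predictive mixture
  draws <- unlist(mapply(function(m, s) rnorm(inner_samples, m, s),
                         mu, sigma, SIMPLIFY = FALSE))
  data.frame(
    lwr = unname(quantile(draws, alpha / 2)),
    upr = unname(quantile(draws, 1 - alpha / 2)),
    var_epistemic = var(mu),          # spread of the pass means
    var_aleatoric = mean(sigma^2),    # average per-pass noise variance
    var_total = var(mu) + mean(sigma^2)
  )
}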
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: combine epistemic and aleatoric via variance decomposition
Description
Classical combination of uncertainties without sampling. Assumes the same input shapes as .combine_uncertainties_sampling: each argument is a matrix of shape [passes, n_obs], where rows index MC-Dropout passes and columns index observations.
For each observation (column):
- Epistemic variance = variance across passes of the mean head.
- Aleatoric variance = average (across passes) of the per-pass variance estimated from the quantile width via a Normal approximation.
- Total variance = epistemic + aleatoric.
- Predictive interval = Normal-theory interval around the chosen centerline.
Usage
.combine_uncertainties_variance(
lwr_mat,
upr_mat,
mean_mat,
alpha = 0.05,
centerline = NULL
)
Arguments
lwr_mat |
Matrix [passes, n_obs] of lower quantile bounds. |
upr_mat |
Matrix [passes, n_obs] of upper quantile bounds. |
mean_mat |
Matrix [passes, n_obs] of mean predictions. |
alpha |
Coverage level (default 0.05). |
centerline |
Optional numeric vector (length n_obs) of deterministic mean predictions to use as the PI center. If NULL, uses the across-pass mean. |
Value
A data.frame with columns:
- lwr, upr: lower/upper predictive interval (Normal-theory).
- var_epistemic: variance across passes of mean predictions.
- var_aleatoric: average per-pass aleatoric variance (from quantile width).
- var_total: sum of epistemic and aleatoric variances.
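The decomposition itself is a few lines of matrix arithmetic. A hedged sketch under the shapes stated above (standalone illustration, not the package source):

# Sketch: variance-based combination over [passes, n_obs] matrices.
combine_variance <- function(lwr_mat, upr_mat, mean_mat, alpha = 0.05,
                             centerline = NULL) {
  z <- qnorm(1 - alpha / 2)
  sigma2 <- ((upr_mat - lwr_mat) / (2 * z))^2  # per-pass aleatoric variance
  var_epistemic <- apply(mean_mat, 2, var)     # across-pass variance of the mean head
  var_aleatoric <- colMeans(sigma2)            # average aleatoric variance
  var_total <- var_epistemic + var_aleatoric
  center <- if (is.null(centerline)) colMeans(mean_mat) else centerline
  data.frame(lwr = center - z * sqrt(var_total),
             upr = center + z * sqrt(var_total),
             var_epistemic = var_epistemic,
             var_aleatoric = var_aleatoric,
             var_total = var_total)
}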
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: compute uncertainty decomposition (epistemic / aleatoric / both)
Description
Given a fitted Keras submodel and covariate input x, compute uncertainty estimates according to uncertainty_method:
- "epistemic": estimates only epistemic variance (via MC Dropout passes).
- "aleatoric": uses deterministic quantile heads to estimate aleatoric variance.
- "both": combines aleatoric and epistemic uncertainty using variance decomposition.
- Otherwise: returns NA placeholders.
Usage
.compute_uncertainty(model, x, uncertainty_method, alpha, forward_passes)
Arguments
model |
Fitted Keras model for a single smooth term. |
x |
Input covariate matrix (or vector; will be reshaped as needed). |
uncertainty_method |
Character; one of "epistemic", "aleatoric", or "both". |
alpha |
Coverage level (e.g. 0.05 for 95% bands). |
forward_passes |
Integer; number of MC Dropout passes. |
Value
A data.frame with columns:
- lwr, upr: lower/upper bounds of interval estimates.
- var_epistemic: epistemic variance.
- var_aleatoric: aleatoric variance.
- var_total: total variance (epistemic + aleatoric).
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: joint predictive interval (both) via variance combiner
Description
Internal helper: joint predictive interval (both) via variance combiner
Usage
.joint_pi_both_variance(
ngam,
x,
level = 0.95,
forward_passes = 50,
verbose = 0
)
Internal helper: joint epistemic SE on link scale
Description
Computes joint epistemic standard errors on the link scale by aggregating across all smooth terms via MC Dropout, capturing cross-term covariance. Parametric model uncertainty (from the linear submodel) is added assuming independence from NN-based epistemic uncertainty.
Usage
.joint_se_eta_mcdropout(ngam, x, forward_passes = 300, verbose = 0)
Arguments
ngam |
Fitted neuralGAM object. |
x |
New data frame of covariates. |
forward_passes |
Number of MC Dropout passes (default 300). |
verbose |
Verbosity (0/1). |
Details
Steps:
1. Parametric part: mean + variance from the linear model.
2. Nonparametric part: pass-level sums across all smooths.
3. Joint across-pass variance captures covariance between smooths.
4. Combined with parametric variance (assumed independent).
Value
A numeric vector of length nrow(x) giving epistemic SEs on the link scale.
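The aggregation step can be sketched in a few lines (hedged, standalone R; pass_preds and var_param are illustrative inputs rather than the package's actual interfaces):

# Sketch: joint epistemic SE on the link scale from MC Dropout passes.
# pass_preds: list (one per smooth term) of [passes, n_obs] prediction matrices.
joint_se_eta <- function(pass_preds, var_param = 0) {
  eta_np <- Reduce(`+`, pass_preds)  # pass-level sums across all smooths
  var_np <- apply(eta_np, 2, var)    # joint across-pass variance (keeps cross-term covariance)
  sqrt(var_np + var_param)           # add parametric variance (assumed independent)
}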
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Internal helper: MC Dropout forward sampling
Description
Runs the requested number of stochastic forward passes (passes) with Dropout active at prediction time. Each pass samples a new dropout mask and produces predictions, simulating epistemic uncertainty.
Usage
.mc_dropout_forward(model, x, passes, output_dim)
Arguments
model |
Fitted Keras model for one smooth term. |
x |
Input matrix (converted to TensorFlow tensor internally). |
passes |
Number of stochastic passes (>=2). |
output_dim |
Expected number of outputs per observation (e.g., 1 = mean only, 3 = quantile heads (lwr, upr, mean)). |
Value
A numeric array of shape [passes, n_obs, output_dim].
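With the R keras API, keeping dropout stochastic at prediction time amounts to calling the model with training = TRUE. A hedged sketch of the loop (assuming a fitted Keras model and a numeric matrix x; mc_dropout_sketch is illustrative):

# Sketch: MC Dropout forward sampling.
mc_dropout_sketch <- function(model, x, passes = 50, output_dim = 1) {
  out <- array(NA_real_, dim = c(passes, nrow(x), output_dim))
  xt <- tensorflow::tf$constant(as.matrix(x), dtype = "float32")
  for (p in seq_len(passes)) {
    # training = TRUE keeps Dropout layers active, sampling a fresh mask
    out[p, , ] <- as.array(model(xt, training = TRUE))
  }
  out
}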
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Autoplot method for neuralGAM objects (epistemic-only)
Description
Produce effect/diagnostic plots from a fitted neuralGAM model. Supported panels:
- which = "response": fitted response vs. index, with optional epistemic confidence intervals (CI).
- which = "link": linear predictor (link scale) vs. index, with optional CI.
- which = "terms": single per-term contribution g_j(x_j) on the link scale, with an optional epistemic CI band for the smooth.
Usage
## S3 method for class 'neuralGAM'
autoplot(
object,
newdata = NULL,
which = c("response", "link", "terms"),
interval = c("none", "confidence"),
level = 0.95,
forward_passes = 150,
term = NULL,
rug = TRUE,
...
)
Arguments
object |
A fitted neuralGAM object. |
newdata |
Optional data.frame of covariates; if omitted, the training data are used. |
which |
One of "response", "link", "terms". |
interval |
One of "none", "confidence". |
level |
Coverage level for confidence intervals (e.g., 0.95). |
forward_passes |
Integer. Number of MC-dropout forward passes used when interval = "confidence". |
term |
Single term name to plot when which = "terms". |
rug |
Logical; if TRUE, adds a rug of observed covariate values. |
... |
Additional arguments passed to the underlying ggplot2 layers. |
Details
Uncertainty semantics (epistemic only)
- CI: uncertainty about the fitted mean.
- For the response, SEs are mapped via the delta method.
- For terms, bands are obtained as \hat g_j \pm z \cdot SE(\hat g_j) on the link scale.
Value
A single ggplot object.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train, family = "gaussian", num_units = 128,
uncertainty_method = "epistemic", forward_passes = 10
)
## --- Autoplot (epistemic-only) ---
# Per-term effect with CI band
autoplot(ngam, which = "terms", term = "x1", interval = "confidence") +
ggplot2::xlab("x1") + ggplot2::ylab("Partial effect")
# Request a different number of forward passes or CI level:
autoplot(ngam, which = "terms", term = "x1", interval = "confidence",
forward_passes = 15, level = 0.7)
# Response panel
autoplot(ngam, which = "response")
# Link panel with custom title
autoplot(ngam, which = "link") +
ggplot2::ggtitle("Main Title")
## End(Not run)
Build and compile a neural network feature model
Description
Builds and compiles a keras neural network for a single smooth term in a neuralGAM model.
The network can optionally be configured to output symmetric prediction intervals (lower bound, upper bound, and mean prediction) using a custom quantile loss (make_quantile_loss()), or a standard single-output point prediction using any user-specified loss function.
When uncertainty_method is "aleatoric" or "both", the model outputs three units corresponding to the lower bound, upper bound, and mean prediction, and is compiled with make_quantile_loss(alpha, mean_loss, ...). In any other case, the model outputs a single unit (point prediction) and uses the loss function provided in loss.
Usage
build_feature_NN(
num_units,
learning_rate = 0.001,
activation = "relu",
kernel_initializer = "glorot_normal",
kernel_regularizer = NULL,
bias_regularizer = NULL,
bias_initializer = "zeros",
activity_regularizer = NULL,
loss = "mse",
name = NULL,
alpha = 0.05,
w_mean = 0.1,
order_penalty_lambda = 0,
uncertainty_method = "none",
dropout_rate = 0.1,
seed = NULL,
...
)
Arguments
num_units |
Integer or vector of integers. Number of units in the hidden layer(s). If a vector is provided, multiple dense layers are added sequentially. |
learning_rate |
Numeric. Learning rate for the Adam optimizer. |
activation |
Character string or function. Activation function to use in hidden layers. If character, it must be a valid tf.keras activation identifier (e.g., "relu", "gelu"). |
kernel_initializer |
Keras initializer object or string. Kernel initializer for dense layers. |
kernel_regularizer |
Optional Keras regularizer for kernel weights. |
bias_regularizer |
Optional Keras regularizer for bias terms. |
bias_initializer |
Keras initializer object or string. Initializer for bias terms. |
activity_regularizer |
Optional Keras regularizer for layer activations. |
loss |
Loss function to use. Can be any Keras built-in (e.g., "mse") or a custom loss function. |
name |
Optional character string. Name assigned to the model. |
alpha |
Numeric. Desired significance level for symmetric prediction intervals. Defaults to 0.05 (i.e., 95% PI using quantiles alpha/2 and 1-alpha/2). |
w_mean |
Non-negative numeric. Weight for the mean-head loss within the composite PI loss. |
order_penalty_lambda |
Non-negative numeric. Strength of a soft monotonicity (non-crossing) penalty applied to the quantile heads. |
uncertainty_method |
Character string indicating the type of uncertainty to estimate in prediction intervals. Must be one of "none", "aleatoric", "epistemic", or "both". |
dropout_rate |
Numeric in (0,1). Dropout rate used when uncertainty_method requires MC Dropout. |
seed |
Random seed. |
... |
Additional arguments passed on to make_quantile_loss() when prediction intervals are enabled. |
Details
Prediction interval mode (uncertainty_method %in% c("aleatoric", "both")):
- Output layer has 3 units:
  - lwr: lower bound, \tau = \alpha/2
  - upr: upper bound, \tau = 1 - \alpha/2
  - y_hat: mean prediction
- Loss function is make_quantile_loss(), which combines two pinball losses (for the lower and upper quantiles) with the chosen mean prediction loss and an optional non-crossing penalty.
Point prediction mode (uncertainty_method %in% c("none", "epistemic")):
- Output layer has 1 unit: point prediction only.
- Loss function is the one passed in loss.
Value
A compiled keras_model object ready for training.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
References
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv:1412.6980.
Koenker, R., & Bassett Jr, G. (1978). Regression quantiles. Econometrica, 46(1), 33-50.
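The composite PI loss combines two pinball losses with a weighted mean-head loss and an optional non-crossing penalty. A hedged sketch of those quantities in plain R (illustrative only; the package's make_quantile_loss() operates on Keras tensors):

# Sketch: composite prediction-interval loss for one batch, in plain R.
pinball <- function(y, q, tau) mean(pmax(tau * (y - q), (tau - 1) * (y - q)))
pi_loss <- function(y, lwr, upr, y_hat, alpha = 0.05, w_mean = 0.1, lambda = 0) {
  pinball(y, lwr, alpha / 2) +         # lower quantile head (tau = alpha/2)
    pinball(y, upr, 1 - alpha / 2) +   # upper quantile head (tau = 1 - alpha/2)
    w_mean * mean((y - y_hat)^2) +     # mean head (MSE here)
    lambda * mean(pmax(lwr - upr, 0))  # soft non-crossing penalty
}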
Deviance of the model
Description
Computes the deviance of the model according to the distribution
family specified in the "family"
parameter.
Usage
dev(muhat, y, family, w = NULL)
Arguments
muhat |
current estimate of the response variable |
y |
response variable |
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
w |
weight assigned to each observation. Defaults to 1. |
Value
the deviance of the model
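For the supported families the deviance follows the usual GLM definitions, consistent with the deviance residuals listed under diagnose(). A hedged standalone sketch (weights defaulting to 1; not the package source):

# Sketch: GLM deviance for the supported families.
deviance_sketch <- function(muhat, y, family, w = 1) {
  switch(family,
    gaussian = sum(w * (y - muhat)^2),
    binomial = {
      # Convention: y * log(y/mu) = 0 when y = 0 (likewise for the 1 - y term)
      t1 <- ifelse(y > 0, y * log(y / muhat), 0)
      t2 <- ifelse(y < 1, (1 - y) * log((1 - y) / (1 - muhat)), 0)
      2 * sum(w * (t1 + t2))
    },
    poisson = {
      t1 <- ifelse(y > 0, y * log(y / muhat), 0)
      2 * sum(w * (t1 - (y - muhat)))
    })
}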
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Diagnostic plots to evaluate a fitted neuralGAM model
Description
Produce a 2x2 diagnostic panel for a fitted neuralGAM
model, mirroring
the layout of gratia's appraise()
for mgcv GAMs:
(top-left) a QQ plot of residuals with optional simulation envelope,
(top-right) a histogram of residuals,
(bottom-left) residuals vs linear predictor \eta
, and
(bottom-right) observed vs fitted values on the response scale.
Usage
diagnose(
object,
data = NULL,
response = NULL,
qq_method = c("uniform", "simulate", "normal"),
n_uniform = 1000,
n_simulate = 200,
residual_type = c("deviance", "pearson", "quantile"),
level = 0.95,
point_col = "steelblue",
point_alpha = 0.5,
hist_bins = 30
)
Arguments
object |
A fitted neuralGAM object. |
data |
Optional data.frame used to compute predictions and residuals; if omitted, the training data are used. |
response |
Character scalar giving the response variable name in data. |
qq_method |
Character; one of "uniform", "simulate", "normal". |
n_uniform |
Integer; number of U(0,1) draws for qq_method = "uniform". |
n_simulate |
Integer; number of simulated datasets for qq_method = "simulate". |
residual_type |
One of "deviance", "pearson", "quantile". |
level |
Numeric in (0,1); coverage level for the QQ bands when qq_method = "simulate". |
point_col |
Character; colour for points in scatter/histogram panels. |
point_alpha |
Numeric in (0,1); point transparency. |
hist_bins |
Integer; number of bins in the histogram. |
Details
The function uses predict.neuralGAM() to obtain the linear predictor (type = "link") and the fitted mean on the response scale (type = "response"). Residuals are computed internally for supported families; by default deviance residuals are used:
- Gaussian: r_i = y_i - \hat{\mu}_i.
- Binomial: r_i = \mathrm{sign}(y_i-\hat{\mu}_i)\, \sqrt{2 w_i \{ y_i \log(y_i/\hat{\mu}_i) + (1-y_i)\log[(1-y_i)/(1-\hat{\mu}_i)] \}}, with optional per-observation weights w_i (e.g., trials for proportions).
- Poisson: r_i = \mathrm{sign}(y_i-\hat{\mu}_i)\, \sqrt{2 w_i \{ y_i \log(y_i/\hat{\mu}_i) - (y_i-\hat{\mu}_i) \}}, adopting the convention y_i \log(y_i/\hat{\mu}_i)=0 when y_i=0.
For Gaussian models, these plots diagnose symmetry, tail behaviour, and mean/variance misfit similar to standard GLM/GAM diagnostics. For non-Gaussian families (Binomial, Poisson), interpret shapes on the deviance scale, which is approximately normal under a well-specified model. For discrete data, randomized quantile (Dunn-Smyth) residuals are also available and often yield smoother QQ behaviour.
QQ reference methods. qq_method controls how theoretical quantiles are generated (as in gratia):
- "uniform" (default): draw U(0,1) and map through the inverse CDF of the fitted response distribution at each observation; convert to residuals and average the sorted curves over n_uniform draws. Fast, and respects the mean-variance relationship.
- "simulate": simulate n_simulate datasets from the fitted model at the observed covariates, compute residuals, and average the sorted curves; also provides pointwise level bands on the QQ plot.
- "normal": use standard normal quantiles; a fallback when a suitable RNG or inverse CDF is unavailable.
For Poisson models, include offsets for exposure in the linear predictor (e.g., log(E)). The QQ methods use \hat{\mu}_i with qpois/rpois for "uniform"/"simulate", respectively.
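To illustrate the randomized quantile (Dunn-Smyth) residuals recommended above for discrete data, a hedged standalone sketch for a Poisson fit (not the package internals):

# Sketch: randomized quantile (Dunn-Smyth) residuals for Poisson data.
rq_residuals_poisson <- function(y, mu) {
  # The Poisson CDF is a step function: randomize uniformly between F(y-1) and F(y)
  u <- runif(length(y), ppois(y - 1, mu), ppois(y, mu))
  qnorm(u)  # approximately N(0,1) under a well-specified model
}
set.seed(1)
mu <- exp(rnorm(200)); y <- rpois(200, mu)
r <- rq_residuals_poisson(y, mu)
qqnorm(r); qqline(r)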
Value
A patchwork object combining four ggplot2 plots. You can print it, add titles/themes, or extract individual panels if needed.
Dependencies
Requires ggplot2 and patchwork.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
References
Augustin, N.H., Sauleau, E.A., Wood, S.N. (2012). On quantile-quantile plots for generalized linear models. Computational Statistics & Data Analysis, 56, 2404-2409. https://doi.org/10.1016/j.csda.2012.01.026
Dunn, P.K., Smyth, G.K. (1996). Randomized quantile residuals. Journal of Computational and Graphical Statistics, 5(3), 236-244.
Derivative of the link function
Description
Computes the derivative of the link function according to
the distribution family specified in the "family"
parameter.
Usage
diriv(family, muhat)
Arguments
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
muhat |
fitted values |
Value
derivative of the link function for the fitted values
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Extract structured elements from a model formula
Description
Parses a GAM-style formula and separates the response, all terms, smooth (non-parametric) terms declared via s(...), and parametric terms. In addition, it extracts per-term neural network specifications that can be written inline inside each s(...) call (e.g., num_units, activation, kernel_initializer, bias_initializer, kernel_regularizer, bias_regularizer, activity_regularizer).
This function uses an abstract syntax tree (AST) walker (no regex) to read the arguments of each s(var, ...). Arguments are evaluated in the caller's environment. For *_regularizer and *_initializer arguments, proper Keras objects are required (e.g., keras::regularizer_l2(1e-4), keras::initializer_glorot_uniform()).
Usage
get_formula_elements(formula)
Arguments
formula |
A model formula. Smooth terms must be written as s(...). |
Details
Inline per-term configuration in s(...).
You can specify neural network hyperparameters per smooth term, e.g.:
y ~ s(x1, num_units = c(1024, 512), activation = "tanh", kernel_regularizer = keras::regularizer_l2(1e-4)) + x2 + s(x3, num_units = 1024, bias_initializer = keras::initializer_zeros())
Values are evaluated in the caller’s environment. For regularizers and initializers you must pass actual Keras objects (not character strings).
Supported keys. Only the keys listed under np_architecture in the Value section are recognized. Unknown keys are ignored with a warning.
Typical usage.
The returned np_terms and np_architecture are consumed by model-building code to construct one neural network per smooth term, applying any per-term overrides while falling back to global defaults for unspecified keys.
Value
A list with the following elements:
- y: Character scalar. The response variable name.
- terms: Character vector with all variable names on the RHS (both smooth and parametric).
- np_terms: Character vector with smooth (non-parametric) variable names extracted from s(...).
- p_terms: Character vector with parametric terms (i.e., terms \ np_terms).
- np_formula: A formula containing only the s(...) terms (or NULL if none).
- p_formula: A formula containing only the parametric terms (or NULL if none).
- np_architecture: Named list keyed by smooth term (e.g., "x1", "x3"). Each entry is a list of per-term settings parsed from s(term, ...). Supported keys include: num_units (numeric or numeric vector), activation (character), learning_rate (numeric), kernel_initializer (keras initializer object), bias_initializer (keras initializer object), kernel_regularizer (keras regularizer object), bias_regularizer (keras regularizer object), activity_regularizer (keras regularizer object).
- formula: The original formula object.
Errors and validation
- The first argument of each s(...) must be a symbol naming the variable.
- All additional arguments to s(...) must be named.
- *_regularizer must be a Keras regularizer object.
- *_initializer must be a Keras initializer object.
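A brief usage sketch (the formula and the commented output are illustrative):

fe <- neuralGAM:::get_formula_elements(
  y ~ s(x1, num_units = c(128, 64), activation = "tanh") + x2 + s(x3)
)
fe$np_terms                      # expected: "x1" "x3"
fe$p_terms                       # expected: "x2"
fe$np_architecture$x1$num_units  # expected: 128 64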
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Install neuralGAM python requirements
Description
Creates a conda environment (installing Miniconda if required) and sets up the Python requirements to run neuralGAM (TensorFlow and Keras).
Miniconda and related environments are generated in the user's cache directory given by:
tools::R_user_dir('neuralGAM', 'cache')
Usage
install_neuralGAM()
Inverse of the link functions
Description
Computes the inverse of the link function according to the
distribution family specified in the "family"
parameter.
Usage
inv_link(family, muhat)
Arguments
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
muhat |
fitted values |
Value
the inverse link function specified by the "family" distribution for the given fitted values
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Link function
Description
Applies the link function according to the distribution family
specified in the "family"
parameter.
Usage
link(family, muhat)
Arguments
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
muhat |
fitted values |
Value
the link function specified by the "family" distribution for the given fitted values
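For reference, a hedged sketch of the three links and their inverses implied by the supported families (identity, logit, log; standalone illustration):

# Sketch: link functions and inverses for the supported families.
link_sketch <- function(family, mu) {
  switch(family,
    gaussian = mu,                  # identity
    binomial = log(mu / (1 - mu)),  # logit
    poisson  = log(mu))             # log
}
inv_link_sketch <- function(family, eta) {
  switch(family,
    gaussian = eta,
    binomial = 1 / (1 + exp(-eta)),
    poisson  = exp(eta))
}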
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Derivative of the Inverse Link Function
Description
Computes the derivative of the inverse link function d\mu/d\eta for the distribution families supported by neuralGAM ("gaussian", "binomial", "poisson"). This quantity is required when applying the delta method to obtain standard errors on the response scale in predict().
Usage
mu_eta(family, eta)
Arguments
family |
A character string specifying the distribution family: one of "gaussian", "binomial", or "poisson". |
eta |
Numeric vector of linear predictor values. |
Details
For a neuralGAM with linear predictor \eta and mean response \mu = g^{-1}(\eta), the derivative d\mu/d\eta depends on the family:
- Gaussian (identity link): d\mu/d\eta = 1.
- Binomial (logit link): d\mu/d\eta = \mu (1-\mu).
- Poisson (log link): d\mu/d\eta = \mu.
Internally, values of \eta are clamped to avoid numerical overflow/underflow in exp(), and \mu is constrained away from 0 and 1 for stability.
Value
A numeric vector of the same length as eta, containing the derivative d\mu/d\eta.
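A hedged sketch of the computation, including the kind of clamping described above (the clamp bounds are illustrative assumptions, not the package's exact constants):

# Sketch: derivative of the inverse link, with illustrative numerical guards.
mu_eta_sketch <- function(family, eta) {
  eta <- pmin(pmax(eta, -30), 30)  # clamp to avoid overflow in exp()
  switch(family,
    gaussian = rep(1, length(eta)),
    binomial = {
      mu <- 1 / (1 + exp(-eta))
      mu <- pmin(pmax(mu, 1e-8), 1 - 1e-8)  # keep mu away from 0 and 1
      mu * (1 - mu)
    },
    poisson = exp(eta))
}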
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Fit a neuralGAM model
Description
Fits a Generalized Additive Model where smooth terms are modeled by keras
neural networks.
In addition to point predictions, the model can optionally estimate uncertainty bands via Monte Carlo Dropout across forward passes.
Usage
neuralGAM(
formula,
data,
family = "gaussian",
num_units = 64,
learning_rate = 0.001,
activation = "relu",
kernel_initializer = "glorot_normal",
kernel_regularizer = NULL,
bias_regularizer = NULL,
bias_initializer = "zeros",
activity_regularizer = NULL,
loss = "mse",
uncertainty_method = c("none", "epistemic"),
alpha = 0.05,
forward_passes = 100,
dropout_rate = 0.1,
validation_split = NULL,
w_train = NULL,
bf_threshold = 0.001,
ls_threshold = 0.1,
max_iter_backfitting = 10,
max_iter_ls = 10,
seed = NULL,
verbose = 1,
...
)
Arguments
formula |
Model formula. Smooth terms must be wrapped in s(...). |
data |
Data frame containing the variables. |
family |
Response distribution: "gaussian", "binomial", or "poisson". |
num_units |
Default hidden layer sizes for smooth terms (integer or vector). Mandatory unless every s(...) term supplies its own num_units. |
learning_rate |
Learning rate for the Adam optimizer. |
activation |
Activation function for hidden layers. Either a string understood by tf.keras or an R function. |
kernel_initializer, bias_initializer |
Initializers for weights and biases. |
kernel_regularizer, bias_regularizer, activity_regularizer |
Optional Keras regularizers. |
loss |
Loss function to use. Can be any Keras built-in (e.g., "mse") or a custom loss function. |
uncertainty_method |
Character string indicating the type of uncertainty to estimate. One of "none" or "epistemic". |
alpha |
Significance level for prediction intervals, e.g. 0.05 for 95% intervals. |
forward_passes |
Integer. Number of MC-dropout forward passes used when uncertainty_method = "epistemic". |
dropout_rate |
Dropout probability in smooth-term NNs, in (0,1); used for MC Dropout when uncertainty_method = "epistemic". |
validation_split |
Optional fraction of training data used for validation. |
w_train |
Optional training weights. |
bf_threshold |
Convergence criterion of the backfitting algorithm. Defaults to 0.001. |
ls_threshold |
Convergence criterion of the local scoring algorithm. Defaults to 0.1. |
max_iter_backfitting |
An integer with the maximum number of iterations of the backfitting algorithm. Defaults to 10. |
max_iter_ls |
An integer with the maximum number of iterations of the local scoring algorithm. Defaults to 10. |
seed |
Random seed. |
verbose |
Verbosity: 0 (silent) or 1 (progress messages). Defaults to 1. |
... |
Additional arguments passed on to the per-term network builder (see build_feature_NN()). |
Value
An object of class "neuralGAM", a list with elements including:
- muhat: numeric vector of fitted mean predictions (training data).
- partial: data frame of partial contributions g_j(x_j) per smooth term.
- y: observed response values.
- eta: linear predictor \eta = \eta_0 + \sum_j g_j(x_j).
- lwr, upr: lower/upper confidence interval bounds (response scale).
- x: training covariates (inputs).
- model: list of fitted Keras models, one per smooth term (+ "linear" if present).
- eta0: intercept estimate \eta_0.
- family: model family.
- stats: data frame of training/validation losses per backfitting iteration.
- mse: training mean squared error.
- formula: parsed model formula (via get_formula_elements()).
- history: list of Keras training histories per term.
- globals: global hyperparameter defaults.
- alpha: PI significance level (if trained with uncertainty).
- build_pi: logical; whether the model was trained with uncertainty estimation enabled.
- uncertainty_method: type of predictive uncertainty used ("none", "epistemic").
- var_epistemic: matrix of per-term epistemic variances (if computed).
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
# Per-term architecture and confidence intervals
ngam <- neuralGAM(
y ~ s(x1, num_units = c(128,64), activation = "tanh") +
s(x2, num_units = 256),
data = train,
uncertainty_method = "epistemic",
forward_passes = 10,
alpha = 0.05
)
ngam
## End(Not run)
Visualization of a neuralGAM object with base graphics
Description
Visualization of a fitted neuralGAM model. Plots learned partial effects, either as scatter/line plots for continuous covariates or as boxplots for factor covariates. Confidence and/or prediction intervals can be added if available.
Usage
## S3 method for class 'neuralGAM'
plot(
x,
select = NULL,
xlab = NULL,
ylab = NULL,
interval = c("none", "confidence", "prediction", "both"),
level = 0.95,
...
)
Arguments
x |
A fitted neuralGAM object. |
select |
Character vector of terms to plot. If NULL, all terms are plotted. |
xlab |
Optional custom x-axis label(s). |
ylab |
Optional custom y-axis label(s). |
interval |
One of "none", "confidence", "prediction", "both". |
level |
Coverage level for intervals (e.g. 0.95). |
... |
Additional graphical arguments passed to the base plotting functions. |
Value
Produces plots on the current graphics device.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Plot training loss history for a neuralGAM model
Description
This function visualizes the training and/or validation loss at the end of each backfitting iteration
for each term-specific model in a fitted neuralGAM
object. It is designed to work with the
history
component of a trained neuralGAM
model.
Usage
plot_history(model, select = NULL, metric = c("loss", "val_loss"))
Arguments
model |
A fitted neuralGAM object. |
select |
Optional character vector of term names (e.g. "x1") to plot. If NULL, all terms are plotted. |
metric |
Character vector indicating which loss metric(s) to plot. Options are "loss", "val_loss", or both. |
Value
A ggplot
object showing the loss curves by backfitting iteration, with facets per term.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
set.seed(123)
n <- 200
x1 <- runif(n, -2, 2)
x2 <- runif(n, -2, 2)
y <- 2 + x1^2 + sin(x2) + rnorm(n, 0, 0.1)
df <- data.frame(x1 = x1, x2 = x2, y = y)
model <- neuralGAM::neuralGAM(
y ~ s(x1) + s(x2),
data = df,
num_units = 8,
family = "gaussian",
max_iter_backfitting = 2,
max_iter_ls = 1,
learning_rate = 0.01,
seed = 42,
validation_split = 0.2,
verbose = 0
)
plot_history(model) # Plot all terms
plot_history(model, select = "x1") # Plot just x1
plot_history(model, metric = "val_loss") # Plot only validation loss
## End(Not run)
Produces predictions from a fitted neuralGAM object
Description
Generate predictions from a fitted neuralGAM model. Supported types:
- type = "link" (default): linear predictor on the link scale.
- type = "response": predictions on the response scale.
- type = "terms": per-term contributions to the linear predictor (no intercept).
Uncertainty estimation via MC Dropout (epistemic only):
- If se.fit = TRUE, standard errors (SE) of the fitted mean are returned (mgcv-style, via Monte Carlo Dropout).
- For type = "response", SEs are mapped to the response scale by the delta method: se_\mu = |d\mu/d\eta| \cdot se_\eta.
- interval = "confidence" returns CI bands derived from SEs; prediction intervals are not supported.
- For type = "terms", interval = "confidence" returns per-term CI matrices (and se.fit when requested).
Details
- Epistemic SEs (CIs) are obtained via Monte Carlo Dropout. When type != "terms" and SEs/CIs are requested in the presence of smooth terms, uncertainty is aggregated jointly to capture cross-term covariance in a single MC pass set. Otherwise, per-term variances are used (parametric variances are obtained from stats::predict(..., se.fit = TRUE)).
- For type = "terms", epistemic SEs and CI matrices are returned when requested.
- PIs are not defined on the link scale and are not supported.
Usage
## S3 method for class 'neuralGAM'
predict(
object,
newdata = NULL,
type = c("link", "response", "terms"),
terms = NULL,
se.fit = FALSE,
interval = c("none", "confidence"),
level = 0.95,
forward_passes = 150,
verbose = 1,
...
)
Arguments
object |
A fitted neuralGAM object. |
newdata |
Optional data.frame of covariates at which to predict; if omitted, the training data are used. |
type |
One of "link", "response", "terms". |
terms |
If type = "terms", an optional character vector selecting a subset of terms. |
se.fit |
Logical; if TRUE, standard errors of the fitted mean are returned. |
interval |
One of "none", "confidence". |
level |
Coverage level for confidence intervals (e.g., 0.95). |
forward_passes |
Integer; number of MC-dropout forward passes when computing epistemic uncertainty. |
verbose |
Integer (0/1). Defaults to 1. |
... |
Other options (passed on to internal predictors). |
Value
- type = "terms":
  - interval = "none": matrix of per-term contributions; if se.fit = TRUE, a list with $fit, $se.fit.
  - interval = "confidence": a list with matrices $fit, $se.fit, $lwr, $upr.
- type = "link" or type = "response":
  - interval = "none": vector (or a list with $fit, $se.fit if se.fit = TRUE).
  - interval = "confidence": data.frame with fit, lwr, upr.
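The delta-method mapping used for type = "response" is a one-liner given d\mu/d\eta. A hedged standalone sketch (self-contained; not the package's predictor):

# Sketch: map link-scale SEs to the response scale and form a CI (delta method).
delta_ci <- function(eta, se_eta, family, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)
  mu <- switch(family,
    gaussian = eta,
    binomial = 1 / (1 + exp(-eta)),
    poisson  = exp(eta))
  dmu_deta <- switch(family,
    gaussian = rep(1, length(eta)),
    binomial = mu * (1 - mu),
    poisson  = mu)
  se_mu <- abs(dmu_deta) * se_eta
  data.frame(fit = mu, lwr = mu - z * se_mu, upr = mu + z * se_mu)
}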
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam0 <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train, family = "gaussian",
num_units = 128, uncertainty_method = "epistemic"
)
link_ci <- predict(ngam0, type = "link", interval = "confidence",
level = 0.95, forward_passes = 10)
resp_ci <- predict(ngam0, type = "response", interval = "confidence",
level = 0.95, forward_passes = 10)
trm_se <- predict(ngam0, type = "terms",
se.fit = TRUE, forward_passes = 10)
## End(Not run)
Short neuralGAM summary
Description
Default print method for a neuralGAM
object.
Usage
## S3 method for class 'neuralGAM'
print(x, ...)
Arguments
x |
A neuralGAM object. |
... |
Additional arguments (currently unused). |
Value
Prints a brief summary of the fitted model including:
- Distribution family: the family used ("gaussian", "binomial", or "poisson").
- Formula: the model formula.
- Intercept value: the fitted intercept (\eta_0).
- Mean Squared Error (MSE): the training MSE of the model.
- Training sample size: the number of observations used to train the model.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train,
num_units = 128,
family = "gaussian",
activation = "relu",
learning_rate = 0.001,
bf_threshold = 0.001,
max_iter_backfitting = 10,
max_iter_ls = 10,
seed = 1234
)
print(ngam)
## End(Not run)
Objects exported from other packages
Description
These objects are imported from other packages. Follow the links below to see their documentation.
- ggplot2: autoplot
Simulate Example Data for NeuralGAM
Description
Generate a synthetic dataset for demonstrating and testing
neuralGAM
. The response is constructed from three covariates:
a quadratic effect, a linear effect, and a sinusoidal effect, plus Gaussian noise.
Usage
sim_neuralGAM_data(n = 2000, seed = 42, test_prop = 0.3)
Arguments
n |
Integer. Number of observations to generate. Defaults to 2000. |
seed |
Integer. Random seed for reproducibility. Defaults to 42. |
test_prop |
Numeric in (0,1). Proportion of observations assigned to the test set. Defaults to 0.3. |
Details
The data generating process is:
y = 2 + x1^2 + 2 x2 + \sin(x3) + \varepsilon,
where \varepsilon \sim N(0, 0.25^2). Covariates x1, x2, x3 are drawn independently from U(-2.5, 2.5).
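A hedged sketch of the generator implied by this DGP (standalone; the package function additionally splits the result into train/test):

# Sketch: simulate data following the documented DGP.
sim_sketch <- function(n = 2000, seed = 42) {
  set.seed(seed)
  x1 <- runif(n, -2.5, 2.5)
  x2 <- runif(n, -2.5, 2.5)
  x3 <- runif(n, -2.5, 2.5)
  y <- 2 + x1^2 + 2 * x2 + sin(x3) + rnorm(n, 0, 0.25)
  data.frame(x1, x2, x3, y)
}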
Value
A list with two elements:
- train: data.frame with training data.
- test: data.frame with test data.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.
Examples
## Not run:
set.seed(123)
dat <- sim_neuralGAM_data(n = 500, test_prop = 0.2)
train <- dat$train
test <- dat$test
## End(Not run)
Summary of a neuralGAM model
Description
Summarizes a fitted neuralGAM
object: family, formula, sample size,
intercept, training MSE, per-term neural net settings, per-term NN layer
configuration, and training history. If a linear component is present, its
coefficients are also reported.
Usage
## S3 method for class 'neuralGAM'
summary(object, ...)
Arguments
object |
A neuralGAM object. |
... |
Additional arguments (currently unused). |
Value
Invisibly returns object. Prints a human-readable summary.
Author(s)
Ines Ortega-Fernandez, Marta Sestelo
Examples
## Not run:
library(neuralGAM)
dat <- sim_neuralGAM_data()
train <- dat$train
test <- dat$test
ngam <- neuralGAM(
y ~ s(x1) + x2 + s(x3),
data = train,
num_units = 128,
family = "gaussian",
activation = "relu",
learning_rate = 0.001,
bf_threshold = 0.001,
max_iter_backfitting = 10,
max_iter_ls = 10,
seed = 1234
)
summary(ngam)
## End(Not run)
Validate/resolve a Keras activation
Description
Validate/resolve a Keras activation
Usage
validate_activation(activation)
Arguments
activation |
character or function. If character, must be a valid tf.keras activation identifier (e.g., "relu", "gelu", "swish", ...). |
Value
a callable activation (Python callable) or the original R function.
Examples
## Not run:
library(neuralGAM)
act <- neuralGAM:::validate_activation("relu") # ok
act <- neuralGAM:::validate_activation(function(x) x) # custom
## End(Not run)
Validate/resolve a Keras loss
Description
Validate/resolve a Keras loss
Usage
validate_loss(loss)
Arguments
loss |
character or function. If character, must be a valid tf.keras loss identifier (e.g., "mse", "mae", "huber", "logcosh", ...). |
Value
a callable loss (Python callable) or the original R function.
Examples
## Not run:
library(neuralGAM)
L <- neuralGAM:::validate_loss("huber") # ok (Huber with default delta)
L <- neuralGAM:::validate_loss(function(y,t) tensorflow::tf$reduce_mean((y-t)^2)) # custom
## End(Not run)
Weights
Description
Computes the weights for the Local Scoring Algorithm.
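In classical local scoring (Hastie & Tibshirani, 1990), the working weights take the textbook form w / (g'(\mu)^2 V(\mu)). A hedged sketch of that standard formula for the supported families (offered as an illustration, not necessarily the package's exact code; see Usage below for the actual signature):

# Sketch: classical local-scoring weights w / (g'(mu)^2 * V(mu)).
weight_sketch <- function(w, muhat, family) {
  switch(family,
    gaussian = w,  # identity link, constant variance
    binomial = {
      mu <- pmin(pmax(muhat, 1e-8), 1 - 1e-8)
      w * mu * (1 - mu)  # g'(mu) = 1/(mu(1-mu)), V(mu) = mu(1-mu)
    },
    poisson = {
      mu <- pmax(muhat, 1e-8)
      w * mu             # g'(mu) = 1/mu, V(mu) = mu
    })
}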
Usage
weight(w, muhat, family)
Arguments
w |
weights |
muhat |
fitted values |
family |
A description of the link function used in the model: "gaussian", "binomial", or "poisson". |
Value
computed weights for the Local Scoring algorithm
according to the "family"
distribution
Author(s)
Ines Ortega-Fernandez, Marta Sestelo.