Fit Multiple INLA Models
Description
This function fits a set of INLA model formulas, provided in a GHRformulas object, to a specified dataset. For each fitted model, it extracts goodness-of-fit (GoF) metrics together with other model outputs (fitted values, fixed effects, random effects), and stores the results in a GHRmodels object.
fit_models(
formulas,
data,
family,
name,
offset = NULL,
control_compute = list(config = FALSE, vcov = FALSE),
nthreads = 8,
pb = FALSE
)
Arguments
formulas: A GHRformulas object containing multiple INLA model formulas.
data: A data frame containing the variables used in the model formulas.
family: A character string specifying the likelihood family (e.g., "poisson", "nbinomial").
name: A character string used to label each fitted model (e.g., "mod").
offset: A character string specifying the name of the offset variable in data. If NULL, no offset is applied. Default is NULL. Internally, log(offset_values) is applied.
control_compute: A named list controlling additional computation options:
config: Logical; if TRUE, stores the Gaussian Markov Random Field (GMRF) and enables computation of the posterior predictive distribution (1,000 draws). Defaults to FALSE.
vcov: Logical; if TRUE, returns the variance-covariance (correlation) matrix of fixed effects. Defaults to FALSE.
nthreads: An integer specifying the number of threads for parallel computation. Default is 8.
pb: Logical; if TRUE, displays a progress bar while fitting models. Default is FALSE.
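As an illustration, a call that enables both optional computations could look as follows (a sketch assuming a GHRformulas object ghr_formulas and the dengue_MS data used in the Examples section below):

fit <- fit_models(
  formulas = ghr_formulas,
  data = dengue_MS,
  family = "nbinomial",
  name = "mod",
  offset = "population",                          # used internally as log(population)
  control_compute = list(config = TRUE, vcov = TRUE),
  nthreads = 4,
  pb = TRUE
)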
Details
This function iterates over each formula in the GHRformulas object and fits the corresponding INLA model using the internal function .fit_single_model(). For each fitted model, it extracts the fitted values, fixed effects, and random effects summaries. Then, it calculates a series of model evaluation metrics using the .gof_single_model() internal function.
The goodness-of-fit (GoF) metrics are organized into two categories:
- Model-Specific Goodness-of-Fit Metrics
These are computed separately for each model:
Deviance Information Criterion (DIC)
DIC = \bar{D} + p_{D}
where \bar{D} is the posterior mean deviance and p_{D} is the effective number of parameters. Lower DIC values indicate a better model fit, balancing goodness-of-fit and model complexity.
Watanabe-Akaike Information Criterion (WAIC)
WAIC = -2\left(\mathrm{lppd} - p_{\mathrm{WAIC}}\right)
WAIC evaluates predictive accuracy and penalizes model complexity through the log pointwise predictive density (\mathrm{lppd}). Lower values imply better generalization.
Log Mean Score (LMS)
LMS = \frac{1}{n} \sum_{i=1}^n \left( -\log(\mathrm{CPO}_i) \right)
LMS assesses the average negative log-predictive density using Conditional Predictive Ordinates (CPO). Lower LMS values indicate stronger predictive performance by penalizing models that assign low probability to observed outcomes.
Mean Absolute Error (MAE)
MAE = \frac{1}{n} \sum_{i=1}^n \left| y_i - \hat{y}_i \right|
Measures the average absolute deviation between observed values y_i and predicted values \hat{y}_i. Lower MAE values indicate improved fit. If config = TRUE, MAE is computed using the full posterior predictive distribution (PPD); otherwise, it uses point estimates from INLA's summary.fitted.values.
Root Mean Squared Error (RMSE)
RMSE = \sqrt{ \frac{1}{n} \sum_{i=1}^n (y_i - \hat{y}_i)^2 }
Captures the average squared deviation between observed and predicted values. RMSE penalizes larger errors more heavily. Lower values reflect better model fit. If config = TRUE, RMSE uses the PPD; otherwise, it uses point estimates.
Continuous Ranked Probability Score (CRPS)
\mathrm{CRPS}(F, y) = \int_{-\infty}^{\infty} \left[ F(t) - \mathbf{1}\{y \leq t\} \right]^2 dt
CRPS assesses how well the predictive cumulative distribution aligns with the observed outcome. Lower scores suggest better calibrated predictive distributions. Only available when config = TRUE.
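A minimal sketch of how the point-estimate versions of these metrics can be obtained for a single model fitted directly with INLA::inla(), using the dengue_MS data from the Examples below; fit_models() computes these quantities internally, and the PPD-based MAE/RMSE/CRPS variants additionally require config = TRUE:

library(INLA)
fit <- inla(
  dengue_cases ~ tmin + f(year, model = "rw1"),
  family = "nbinomial",
  data = dengue_MS,
  control.compute   = list(dic = TRUE, waic = TRUE, cpo = TRUE),
  control.predictor = list(compute = TRUE)
)
y    <- dengue_MS$dengue_cases
yhat <- fit$summary.fitted.values$mean      # point estimates (no PPD)
dic  <- fit$dic$dic                         # Deviance Information Criterion
waic <- fit$waic$waic                       # Watanabe-Akaike Information Criterion
lms  <- mean(-log(fit$cpo$cpo))             # Log Mean Score from CPO values
mae  <- mean(abs(y - yhat))                 # Mean Absolute Error
rmse <- sqrt(mean((y - yhat)^2))            # Root Mean Squared Error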
- Model Comparison Metrics (relative to the first model)
The first model in the list is treated as the baseline for model comparisons. All other models are evaluated against it using the following metrics:
Difference in DIC and WAIC
Stored as dic_vs_first and waic_vs_first. These represent how much higher (or lower) each model's DIC/WAIC is compared to the first model. Additionally, 95% credible intervals for these differences are stored as *_vs_first_lci and *_vs_first_uci.
Difference in MAE and RMSE
Stored as mae_vs_first and rmse_vs_first. These reflect the absolute difference in prediction error compared to the first model. No credible intervals are computed for these metrics.
Continuous Ranked Probability Score Skill Score (CRPSS)
\mathrm{CRPSS} = 1 - \frac{\mathrm{CRPS}_{\text{model}}}{\mathrm{CRPS}_{\text{baseline}}}
Indicates how much better the predictive distribution of the current model is relative to the baseline model. Values closer to 1 indicate improvement; negative values imply worse performance. Available only when config = TRUE.
Pseudo R-squared based on deviance
R^2 = 1 - \exp\left( \frac{-2}{n} \left( \frac{\mathrm{dev}_{\text{model}}}{-2} - \frac{\mathrm{dev}_{\text{base}}}{-2} \right) \right)
Captures the relative deviance reduction compared to the baseline model. Values range from 0 (no improvement) to 1 (strong improvement).
Random Effect Variance
\mathrm{Var}_{re} = \frac{1}{\mathrm{precision}}
Quantifies residual variance due to group- or cluster-level effects. Computed only when random effects are defined in the model formula.
Proportional Change in Random Effect Variance
\frac{\mathrm{Var}_{re}}{\mathrm{Var}_{re}^{(1)}} - 1
Represents the relative change in group-level variance compared to the baseline model. Helps assess how much variance is explained by added covariates.
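As a short numerical sketch, suppose the baseline model and a candidate model yield per-model quantities dic1/dic2, waic1/waic2, mae1/mae2, rmse1/rmse2, crps1/crps2, deviances dev1/dev2, and random-effect precisions prec_re1/prec_re2 (all hypothetical objects). The comparison metrics then follow directly from the formulas above; the 95% credible intervals stored as *_vs_first_lci/_uci are derived internally from posterior draws and are not reproduced here:

dic_vs_first  <- dic2  - dic1                 # > 0 means higher DIC than the baseline
waic_vs_first <- waic2 - waic1
mae_vs_first  <- mae2  - mae1                 # absolute change in prediction error
rmse_vs_first <- rmse2 - rmse1
crpss <- 1 - crps2 / crps1                    # CRPS skill score (requires config = TRUE)
r2    <- 1 - exp((-2 / n) * (dev2 / -2 - dev1 / -2))  # deviance-based pseudo R-squared
var_re1 <- 1 / prec_re1                       # random-effect variance from precision
var_re2 <- 1 / prec_re2
prop_change <- var_re2 / var_re1 - 1          # proportional change vs. the baseline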
Returns
An object of class GHRmodels containing:
$mod_gof: A data frame of model-specific goodness-of-fit metrics.
$fitted: A list of fitted values (one element per model). If config = TRUE, these are derived from the posterior predictive distribution (PPD); otherwise, they are extracted from INLA's summary.fitted.values.
$fixed: A list of summary tables for fixed effects (one element per model).
$random: A list of summary tables for random effects (one element per model).
$formulas: A character vector of the original model formulas used.
$re: A character vector specifying any random effects defined in formulas.
$outcome: A character string indicating the outcome variable used.
$data: The original data frame passed to the function.
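Assuming an object results returned by fit_models() (as in the Examples below), the individual components can be inspected directly:

results$mod_gof        # goodness-of-fit table, one row per model
results$fitted[[1]]    # fitted values of the first model
results$fixed[[1]]     # fixed-effect summaries of the first model
results$random[[1]]    # random-effect summaries of the first model
results$formulas       # the model formulas as character strings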
See Also
as_GHRformulas converts a set of R-INLA-compatible formulas into a GHRformulas object.
Examples
## Not run:
# Load example dataset
data("dengue_MS")
# Declare formulas
formulas <- c(
"dengue_cases ~ tmin + f(year, model='rw1')",
"dengue_cases ~ pdsi + f(year, model='rw1')"
)
# Transform formulas into a 'GHRformulas' object
ghr_formulas <- as_GHRformulas(formulas)
# Fit multiple models
results <- fit_models(
formulas = ghr_formulas,
data = dengue_MS,
family = "nbinomial",
name = "TestModel",
offset = "population",
nthreads = 2,
control_compute = list(config = FALSE),
pb = TRUE
)
# Inspect goodness-of-fit metrics
head(results$mod_gof)
## End(Not run)