Generate lagged variables for one or more lags

Description

This function creates lagged versions of one or more numeric or categorical variables in an equally spaced time-series data set. A single call can create multiple lags for each selected variable and, optionally, for each spatial/grouping unit.

Usage

lag_cov(data, name, time, lag, group = NULL, add = TRUE)

Arguments

data: A data.frame containing equally spaced observations.
name: A character vector naming the variable(s) to lag.
time: A single character string: name of the time-index variable (e.g., "date").
lag: A numeric vector of one or more positive integers. Each value k is a lag: the series is shifted back by k observations, so the lagged value at time t is the value observed at time t - k.
group: Optional character vector naming column(s) that define independent time-series (e.g. regions). If NULL, the whole data set is treated as one series.
add: Logical. If TRUE (default) the lagged columns are appended to data; if FALSE the function returns only the lagged columns as a matrix.
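
To make the meaning of lag and group concrete, here is a minimal base R sketch of the shift itself. It is not the implementation of lag_cov; the helper lag_vec is hypothetical, and padding the first k positions with NA is an assumption about how missing history is represented. The rows are assumed to be already sorted by time within each group.

```r
# Hypothetical helper (not part of the documented function):
# shift a vector back by k observations, padding the start with NA.
lag_vec <- function(x, k) {
  n <- length(x)
  # Indexing with NA yields NA of the same type as x,
  # so this also works for character vectors and factors.
  idx <- c(rep(NA_integer_, min(k, n)), seq_len(max(n - k, 0)))
  x[idx]
}

# One series: the value at position t becomes the value from position t - 1.
one_series <- lag_vec(c(10, 20, 30, 40), 1)

# Two independent series: the shift restarts inside each group,
# so the first observation of group "B" does not inherit from group "A".
g <- c("A", "A", "A", "B", "B", "B")
v <- c(1, 2, 3, 4, 5, 6)
by_group <- ave(v, g, FUN = function(s) lag_vec(s, 1))
```

In this sketch, leaving out the grouping step would carry the last value of group "A" into the first row of group "B", which is the situation the group argument is meant to avoid.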

Returns

Either a data frame (when add = TRUE) containing the original data plus the new lagged columns, or a numeric matrix of lagged values (when add = FALSE).
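
The two return shapes can be pictured with a small base R sketch. This is only an illustration under stated assumptions: the column names (x_lag1, ...) and the NA padding are hypothetical and may differ from what lag_cov actually produces.

```r
x    <- c(5, 6, 7, 8, 9)
lags <- 1:3
n    <- length(x)

# One column per lag: row t holds x[t - k], or NA when t - k < 1 (assumed padding).
lag_mat <- sapply(lags, function(k) {
  x[c(rep(NA_integer_, min(k, n)), seq_len(max(n - k, 0)))]
})
colnames(lag_mat) <- paste0("x_lag", lags)  # hypothetical naming scheme

# Analogue of add = FALSE: the lagged columns alone, as a matrix.
lag_only <- lag_mat

# Analogue of add = TRUE: the original data with the lagged columns appended.
with_lags <- cbind(data.frame(x = x), lag_mat)
```

Here lag_only is a 5 x 3 numeric matrix and with_lags is a data frame with four columns, mirroring the two cases described above.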

Examples

## Daily series for two micro-regions
set.seed(1)  # make the simulated values reproducible
d <- data.frame(
  date       = as.Date("2023-01-01") + 0:9,
  micro_code = rep(c("A", "B"), each = 5),
  tmin       = rnorm(10, 10, 2),
  pdsi       = rnorm(10)
)

## Create lags 1 to 3 for tmin and pdsi within each micro-region
lagged <- lag_cov(
  data  = d,
  name  = c("tmin", "pdsi"),
  time  = "date",
  group = "micro_code",
  lag   = 1:3
)

## Return only the lagged columns (as a matrix); group is omitted,
## so the whole data set is treated as one series
lag_only <- lag_cov(
  data = d, name = "tmin", time = "date",
  lag  = 1:3, add = FALSE
)