`fastglm_fit()` is a fitting method for [glm()]. It is used in the same way as `glm.fit()`: supply it to the `method` argument of `glm()`.

fastglm_fit(
  x,
  y,
  weights = rep(1, NROW(y)),
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  offset = rep(0, NROW(y)),
  family = gaussian(),
  control = list(),
  intercept = TRUE,
  singular.ok = TRUE,
  firth = FALSE
)

fastglm_control(fastmethod = 0L, tol = 1e-07, maxit = 100L, ...)

# S3 method for class 'fastglmFit'
vcov(object, ...)

# S3 method for class 'fastglmFit'
summary(object, ...)

Arguments

x

a design matrix of dimension `n * p`. Can also be a `big.matrix` object from bigmemory.

y

a vector of observations of length `n`.

weights

an optional vector of 'prior weights' to be used in the fitting process. Should be `NULL` or a numeric vector.

start

optional starting values for the parameters in the linear predictor.

etastart

optional starting values for the linear predictor.

mustart

optional starting values for the vector of means.

offset

this can be used to specify an *a priori* known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases.

family

a description of the error distribution and link function to be used in the model. This must be a family function or the result of a call to a family function. (See [`family`] for details of family functions.)

control

a list of parameters for controlling the fitting process. This is passed to `fastglm_control()`.

singular.ok, intercept

See [glm.fit()].

firth

`logical`; if `TRUE` apply Firth's (1993) bias-reducing penalty to the score function. Currently supported only for `family = binomial(link = "logit")` on dense `x`. See `logistf::logistf()` for the canonical reference implementation.
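For intuition, Firth's correction for logistic regression replaces the usual score `X'(y - p)` with `X'(y - p + h * (1/2 - p))`, where `h` holds the diagonal of the weighted hat matrix. The sketch below illustrates that modified score in base R with plain Newton-type steps. `firth_logit_sketch()` is a hypothetical helper written for this page, not the routine that `fastglm_fit()` runs, and it omits safeguards such as step-halving.

```r
# Hypothetical helper for illustration only; not part of fastglm.
firth_logit_sketch <- function(X, y, tol = 1e-8, maxit = 100L) {
  beta <- rep(0, ncol(X))
  for (iter in seq_len(maxit)) {
    p <- plogis(drop(X %*% beta))           # fitted probabilities
    w <- p * (1 - p)                        # IRLS working weights
    info_inv <- solve(crossprod(X, w * X))  # inverse Fisher information
    # Leverages: diagonal of W^(1/2) X (X'WX)^(-1) X' W^(1/2)
    h <- w * rowSums((X %*% info_inv) * X)
    # Firth-modified score: X'(y - p + h * (1/2 - p))
    score <- crossprod(X, y - p + h * (0.5 - p))
    step <- drop(info_inv %*% score)
    beta <- beta + step
    if (max(abs(step)) < tol) break
  }
  beta
}

# Completely separated data: the ordinary maximum likelihood estimate
# does not exist here, but the penalized estimate stays finite.
x <- c(-2, -1, 1, 2)
y <- c(0, 0, 1, 1)
X <- cbind(1, x)
beta_firth <- firth_logit_sketch(X, y)
beta_firth
```

The example data are deliberately tiny and symmetric so that the effect of the penalty is easy to see; for real analyses, use `firth = TRUE` or `logistf::logistf()` rather than this sketch.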

fastmethod

`integer`; the method used for fitting. Allowable values include 0 for the column-pivoted QR decomposition, 1 for the unpivoted QR decomposition, 2 for the LLT Cholesky, 3 for the LDLT Cholesky, 4 for the full pivoted QR decomposition, and 5 for the Bidiagonal Divide and Conquer SVD. Default is 0. When passed inside `control` rather than directly as an argument to `glm()`, it can also be named `method` (see Examples).

tol

`numeric`; threshold tolerance for convergence.

maxit

`integer`; the maximum number of IRLS iterations.
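To make `fastmethod`, `tol`, and `maxit` concrete, the sketch below runs iteratively reweighted least squares (IRLS) for a logistic model and solves each weighted least-squares step with either a QR or a Cholesky (LLT) factorization. It uses base R only. `irls_logit_sketch()` and its `solver` argument are hypothetical names, and the stopping rule (relative change in deviance, as in `glm.fit()`) is an assumption, not a description of the criterion `fastglm_fit()` applies.

```r
# Hypothetical helper for illustration only; not part of fastglm.
irls_logit_sketch <- function(X, y, solver = c("qr", "chol"),
                              tol = 1e-7, maxit = 100L) {
  solver <- match.arg(solver)
  beta <- rep(0, ncol(X))
  dev_old <- Inf
  converged <- FALSE
  for (iter in seq_len(maxit)) {
    eta <- drop(X %*% beta)
    mu <- plogis(eta)
    w <- mu * (1 - mu)            # working weights
    z <- eta + (y - mu) / w       # working response
    Xw <- sqrt(w) * X
    zw <- sqrt(w) * z
    beta <- if (solver == "qr") {
      # Least squares via a QR factorization of the weighted design
      qr.coef(qr(Xw), zw)
    } else {
      # Normal equations via the Cholesky factor R, with R'R = X'WX
      R <- chol(crossprod(Xw))
      drop(backsolve(R, forwardsolve(t(R), crossprod(Xw, zw))))
    }
    mu <- plogis(drop(X %*% beta))
    dev <- -2 * sum(y * log(mu) + (1 - y) * log(1 - mu))
    # Assumed stopping rule: relative change in deviance below `tol`
    if (abs(dev - dev_old) / (abs(dev) + 0.1) < tol) {
      converged <- TRUE
      break
    }
    dev_old <- dev
  }
  list(coefficients = unname(beta), iterations = iter, converged = converged)
}

set.seed(42)
X <- cbind(1, rnorm(500))
y <- rbinom(500, 1, plogis(0.5 + X[, 2]))

fit_qr   <- irls_logit_sketch(X, y, solver = "qr")
fit_chol <- irls_logit_sketch(X, y, solver = "chol")
fit_ref  <- glm.fit(X, y, family = binomial())

# Both solvers should agree with each other and with glm.fit()
all.equal(fit_qr$coefficients, fit_chol$coefficients, tolerance = 1e-6)
all.equal(fit_qr$coefficients, unname(fit_ref$coefficients), tolerance = 1e-4)
```

On a well-conditioned design the choice of factorization changes speed, not the answer. The Cholesky route works on `X'WX` and is therefore more sensitive to near-collinear columns than the QR route.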

...

for `vcov()` and `summary()`, other arguments passed downstream.

object

a `fastglmFit` object; the output of a call to `glm()` with `method = fastglm_fit`.

Details

The purpose of the functions documented on this page is to facilitate integration with existing [glm()] utilities in base R. `fastglm_fit()` is just a wrapper for [fastglmPure()] with some additional quality-of-life features. The `vcov()` and `summary()` methods use the unscaled coefficient covariance matrix returned directly from the C++ solver, so no refit is required.
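The relationship behind the `vcov()` and `summary()` methods is the usual one for GLMs: the coefficient covariance matrix equals the dispersion estimate times the unscaled covariance matrix `(X'WX)^(-1)` evaluated at convergence. The base-R sketch below checks that identity on an ordinary `glm()` fit to simulated data. It illustrates the idea only and makes no claim about how the `fastglmFit` methods are organized internally.

```r
set.seed(7)
n <- 300
x <- rnorm(n)
mu <- exp(0.3 + 0.5 * x)
y <- rgamma(n, shape = 2, rate = 2 / mu)  # mean mu, true dispersion 1/2

fit <- glm(y ~ x, family = Gamma(link = "log"))
s <- summary(fit)

# Scaled covariance = dispersion * unscaled covariance
all.equal(vcov(fit), s$dispersion * s$cov.unscaled)

# The unscaled covariance itself is (X'WX)^(-1) with the final working weights
X <- model.matrix(fit)
unscaled_by_hand <- solve(crossprod(X, fit$weights * X))
all.equal(unscaled_by_hand, s$cov.unscaled)
```

Because only the dispersion estimate has to be computed after fitting, a solver that already returns `(X'WX)^(-1)` gives standard errors without another pass over the data.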

Examples

library(fastglm)

set.seed(1234)
n <- 1e4
x <- matrix(rnorm(n * 25), ncol = 25)
eta <- 0.1 + 0.25 * x[, 1] - 0.25 * x[, 3] + 0.75 * x[, 5] - 0.35 * x[, 6]
dat <- as.data.frame(x)

# binomial
dat$y <- rbinom(n, 1, pnorm(eta))

system.time({
    gl <- glm(y ~ ., data = dat,
              family = binomial)
})
#>    user  system elapsed 
#>   0.025   0.001   0.027 

system.time({
    gf0 <- glm(y ~ ., data = dat,
               family = binomial,
               method = fastglm_fit)
})
#>    user  system elapsed 
#>   0.009   0.000   0.010 

system.time({
    gf1 <- glm(y ~ ., data = dat,
               family = binomial,
               method = fastglm_fit,
               fastmethod = 1)
})
#>    user  system elapsed 
#>   0.009   0.001   0.009 

# poisson
dat$y <- rpois(n, eta^2)

system.time({
    gl <- glm(y ~ ., data = dat,
              family = poisson)
})
#>    user  system elapsed 
#>   0.035   0.002   0.037 

system.time({
    gf0 <- glm(y ~ ., data = dat,
               family = poisson,
               method = fastglm_fit)
})
#>    user  system elapsed 
#>   0.011   0.001   0.012 

system.time({
    gf1 <- glm(y ~ ., data = dat,
               family = poisson,
               method = fastglm_fit,
               fastmethod = 1)
})
#>    user  system elapsed 
#>   0.011   0.002   0.013 

# gamma
dat$y <- rgamma(n, exp(eta) * 1.75, 1.75)

system.time({
    gl <- glm(y ~ ., data = dat,
              family = Gamma(link = "log"))
})
#>    user  system elapsed 
#>   0.043   0.005   0.048 

system.time({
    gf0 <- glm(y ~ ., data = dat,
               family = Gamma(link = "log"),
               method = fastglm_fit)
})
#>    user  system elapsed 
#>   0.013   0.000   0.014 

system.time({
    gf1 <- glm(y ~ ., data = dat,
               family = Gamma(link = "log"),
               method = fastglm_fit,
               fastmethod = 1)
})
#>    user  system elapsed 
#>   0.013   0.001   0.014 

# Different (equivalent) ways of supplying
# control arguments:
gf1 <- glm(y ~ ., data = dat,
           family = Gamma(link = "log"),
           method = fastglm_fit,
           fastmethod = 1)

gf1 <- glm(y ~ ., data = dat,
           family = Gamma(link = "log"),
           method = fastglm_fit,
           control = list(fastmethod = 1))

gf1 <- glm(y ~ ., data = dat,
           family = Gamma(link = "log"),
           method = fastglm_fit,
           control = list(method = 1))
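The first two forms are interchangeable because of how `glm()` treats extra arguments: its `control` argument defaults to `list(...)`, so anything passed through `...` is collected into `control` and handed to the fitting function. The sketch below shows this in base R with a hypothetical fitting function, `recording_fit()`, which records the names of the control entries it receives and delegates the fit to `glm.fit()`. `my_option` is a made-up setting used only for illustration.

```r
log_env <- new.env()

# Hypothetical fitting function: records the control list it is given, then
# delegates to stats::glm.fit() with only the settings glm.control() knows.
recording_fit <- function(x, y, ..., control = list()) {
  log_env$received <- names(control)
  known <- control[names(control) %in% c("epsilon", "maxit", "trace")]
  stats::glm.fit(x = x, y = y, ..., control = known)
}

set.seed(99)
d <- data.frame(x = rnorm(100))
d$y <- rpois(100, exp(0.2 + 0.4 * d$x))

# Setting passed directly to glm(), i.e. through `...`
fit_a <- glm(y ~ x, data = d, family = poisson,
             method = recording_fit, my_option = 1)
received_a <- log_env$received

# Same setting passed explicitly inside `control`
fit_b <- glm(y ~ x, data = d, family = poisson,
             method = recording_fit, control = list(my_option = 1))
received_b <- log_env$received

identical(received_a, received_b)
```

One consequence of this mechanism is that a setting passed through `...` is ignored when `control` is also given explicitly, since the default `list(...)` is then never evaluated.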