Fast generalized linear model fitting

Fits a generalized linear model with fastglm(). This page documents the generic and its default S3 method.

fastglm(x, ...)

# Default S3 method
fastglm(
  x,
  y,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  start = NULL,
  etastart = NULL,
  mustart = NULL,
  method = 0L,
  tol = 1e-08,
  maxit = 100L,
  ...
)

Arguments

x: input model matrix. Must be a matrix object.
...: not used.
y: numeric response vector of length nobs.
family: a description of the error distribution and link function to be used in the model. For fastglm this can be a character string naming a family function, a family function or the result of a call to a family function. For fastglmPure only the third option is supported. (See family for details of family functions.)
weights: an optional vector of 'prior weights' to be used in the fitting process. Should be a numeric vector.
offset: an a priori known component to be included in the linear predictor during fitting. Should be a numeric vector of length equal to the number of cases.
start: starting values for the parameters in the linear predictor.
etastart: starting values for the linear predictor.
mustart: starting values for the vector of means.
method: an integer scalar selecting the decomposition: 0 for the column-pivoted QR decomposition (the default), 1 for the unpivoted QR decomposition, 2 for the LLT Cholesky decomposition, or 3 for the LDLT Cholesky decomposition.
tol: threshold tolerance for convergence. Should be a positive real number.
maxit: maximum number of IRLS iterations. Should be a positive integer.
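
The method argument chooses how each weighted least-squares step of IRLS is solved. As a rough illustration in base R (not the package's compiled implementation), the sketch below fits a logistic model by IRLS and solves each step either with a QR decomposition of the weighted model matrix or with an LLT Cholesky factorization of X'WX. The function name irls_logit, the zero starting values, and the stopping rule (relative change in deviance, in the style of glm.fit) are assumptions made for this example.

```r
# Illustrative IRLS for logistic regression; a sketch, not fastglm's own code.
irls_logit <- function(x, y, solver = c("qr", "chol"), tol = 1e-8, maxit = 100L) {
  solver <- match.arg(solver)
  beta <- rep(0, ncol(x))                 # assumed starting values
  dev_old <- Inf
  for (iter in seq_len(maxit)) {
    eta <- drop(x %*% beta)               # linear predictor
    mu  <- plogis(eta)                    # fitted means under the logit link
    w   <- mu * (1 - mu)                  # IRLS working weights
    z   <- eta + (y - mu) / w             # working response
    sw  <- sqrt(w)
    if (solver == "qr") {
      # Least squares on the row-scaled system sqrt(W) X beta = sqrt(W) z
      beta <- qr.coef(qr(sw * x), sw * z)
    } else {
      # Normal equations X'WX beta = X'Wz via Cholesky (X'WX = R'R)
      R    <- chol(crossprod(x, w * x))
      rhs  <- crossprod(x, w * z)
      beta <- drop(backsolve(R, forwardsolve(t(R), rhs)))
    }
    # Binomial deviance at the updated coefficients
    dev <- -2 * sum(dbinom(y, 1, plogis(drop(x %*% beta)), log = TRUE))
    if (abs(dev - dev_old) / (abs(dev) + 0.1) < tol) break
    dev_old <- dev
  }
  beta
}

set.seed(1)
n <- 500
x <- cbind(1, matrix(rnorm(n * 3), ncol = 3))
y <- rbinom(n, 1, plogis(drop(x %*% c(-0.5, 1, 0, -1))))

b_qr   <- irls_logit(x, y, solver = "qr")
b_chol <- irls_logit(x, y, solver = "chol")
b_ref  <- unname(glm.fit(x, y, family = binomial())$coefficients)

# Both solvers should agree with glm.fit up to convergence tolerance
max(abs(b_qr - b_ref))
max(abs(b_chol - b_ref))
```

The Cholesky route forms X'WX explicitly, which is usually cheaper but squares the condition number of the problem. The QR route works on the weighted model matrix directly and tends to be more robust when columns are nearly collinear.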

Value

A list with the elements:

coefficients: a vector of coefficients
se: a vector of the standard errors of the coefficient estimates
rank: a scalar denoting the computed rank of the model matrix
df.residual: a scalar denoting the residual degrees of freedom of the model
residuals: the vector of residuals
s: a numeric scalar giving the root mean square of the residuals
fitted.values: the vector of fitted values
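
The coefficients and se elements can be combined into Wald statistics by hand. The sketch below is one way to do this. The helper name wald_table is hypothetical, and the normal reference distribution is an assumption that suits families with a fixed dispersion such as binomial or Poisson (for a Gaussian fit, a t distribution on df.residual degrees of freedom would be the usual choice). The call to fastglm is guarded so the helper can be tried without the package installed.

```r
# Hypothetical helper: Wald z statistics and two-sided p-values
# from a vector of estimates and a vector of standard errors.
wald_table <- function(coefficients, se) {
  z <- coefficients / se
  data.frame(
    estimate = coefficients,
    se       = se,
    z        = z,
    p        = 2 * pnorm(-abs(z))   # normal approximation (assumed)
  )
}

# Usage with a fitted model, if fastglm is available
if (requireNamespace("fastglm", quietly = TRUE)) {
  set.seed(1)
  x <- cbind(1, matrix(rnorm(200 * 2), ncol = 2))
  y <- rbinom(200, 1, plogis(drop(x %*% c(0, 1, -1))))
  fit <- fastglm::fastglm(x, y, family = binomial())
  print(wald_table(fit$coefficients, fit$se))
}
```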

Examples


x <- matrix(rnorm(10000 * 100), ncol = 100)
y <- 1 * (0.25 * x[,1] - 0.25 * x[,3] > rnorm(10000))

system.time(gl1 <- glm.fit(x, y, family = binomial()))
#>    user  system elapsed 
#>   0.204   0.004   0.208 

system.time(gf1 <- fastglm(x, y, family = binomial()))
#>    user  system elapsed 
#>   0.061   0.001   0.063 

system.time(gf2 <- fastglm(x, y, family = binomial(), method = 1))
#>    user  system elapsed 
#>   0.054   0.001   0.055 

system.time(gf3 <- fastglm(x, y, family = binomial(), method = 2))
#>    user  system elapsed 
#>   0.015   0.001   0.017 

system.time(gf4 <- fastglm(x, y, family = binomial(), method = 3))
#>    user  system elapsed 
#>   0.016   0.001   0.017 

max(abs(coef(gl1) - gf1$coefficients))
#> [1] 1.165734e-15
max(abs(coef(gl1) - gf2$coefficients))
#> [1] 1.498801e-15
max(abs(coef(gl1) - gf3$coefficients))
#> [1] 1.165734e-15
max(abs(coef(gl1) - gf4$coefficients))
#> [1] 1.110223e-15


if (FALSE) { # \dontrun{
library(bigmemory)  # provides filebacked.big.matrix()
nrows <- 50000
ncols <- 50
bkFile <- "bigmat2.bk"
descFile <- "bigmatk2.desc"
bigmat <- filebacked.big.matrix(nrow=nrows, ncol=ncols, type="double",
                                backingfile=bkFile, backingpath=".",
                                descriptorfile=descFile,
                                dimnames=c(NULL,NULL))
for (i in 1:ncols) bigmat[,i] = rnorm(nrows)*i
y <- 1*(rnorm(nrows) + bigmat[,1] > 0)

system.time(gfb1 <- fastglm(bigmat, y, family = binomial(), method = 3))
} # }
