R/oem.R
Orthogonalizing EM
Usage

oem(
  x,
  y,
  family = c("gaussian", "binomial"),
  penalty = c("elastic.net", "lasso", "ols", "mcp", "scad", "mcp.net",
    "scad.net", "grp.lasso", "grp.lasso.net", "grp.mcp", "grp.scad",
    "grp.mcp.net", "grp.scad.net", "sparse.grp.lasso"),
  weights = numeric(0),
  lambda = numeric(0),
  nlambda = 100L,
  lambda.min.ratio = NULL,
  alpha = 1,
  gamma = 3,
  tau = 0.5,
  groups = numeric(0),
  penalty.factor = NULL,
  group.weights = NULL,
  standardize = TRUE,
  intercept = TRUE,
  maxit = 500L,
  tol = 1e-07,
  irls.maxit = 100L,
  irls.tol = 0.001,
  accelerate = FALSE,
  ncores = -1,
  compute.loss = FALSE,
  hessian.type = c("upper.bound", "full")
)
Arguments

x  input matrix of dimension n x p (each row is an observation, each column a predictor). Sparse matrices of class "CsparseMatrix" from the Matrix package are also accepted; see the examples below.

y  numeric response vector of length n
family  "gaussian" for least squares problems, "binomial" for binary response problems

penalty  Specification of penalty type. Choices include "elastic.net", "lasso", "ols" (ordinary least squares, i.e. no penalty), "mcp", "scad", "mcp.net", "scad.net", "grp.lasso", "grp.lasso.net", "grp.mcp", "grp.scad", "grp.mcp.net", "grp.scad.net", and "sparse.grp.lasso". Multiple penalties can be supplied at once, in which case a model is fit for each.
Careful consideration is required for the group lasso, group MCP, and group SCAD penalties. Groups as specified by the groups argument should be chosen in a sensible manner.
weights  observation weights. Not implemented yet. Defaults to 1 for each observation (setting weight vector to length 0 will default all weights to 1) 
lambda  A user-supplied lambda sequence. By default, the program computes
its own lambda sequence based on nlambda and lambda.min.ratio; supplying lambda overrides this.
nlambda  The number of lambda values. The default is 100. 
lambda.min.ratio  Smallest value for lambda, as a fraction of lambda.max, the (data-derived) largest lambda value (i.e. the smallest value for which all penalized coefficients are zero).
alpha  mixing value for the elastic.net, mcp.net, scad.net, grp.mcp.net, and grp.scad.net penalties. The penalty applied is (1 - alpha) * (ridge penalty) + alpha * (lasso/MCP/SCAD/group lasso penalty).
gamma  tuning parameter for the SCAD and MCP penalties. Must be >= 1.
tau  mixing value for the sparse.grp.lasso penalty. The penalty applied is (1 - tau) * (group lasso penalty) + tau * (lasso penalty).
groups  A vector describing the grouping of the coefficients; coefficients that share a group label are penalized together. See the example below. All unpenalized variables should be put in group 0
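For illustration, a groups vector for a hypothetical design with 50 predictors, where the first two columns are left unpenalized (group 0) and the remaining 48 are split into 12 consecutive groups of 4, could be built like this (the sizes here are invented, not from the package documentation):

```r
# Hypothetical layout: 50 predictors, first 2 unpenalized (group 0),
# remaining 48 in 12 groups of 4 consecutive coefficients
groups <- c(rep(0, 2), rep(1:12, each = 4))

length(groups)      # one entry per column of x: 50
sum(groups == 0)    # number of unpenalized variables: 2
```

The length of groups must match the number of columns of x, with one group label per coefficient.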
penalty.factor  Separate penalty factors can be applied to each coefficient. This is a number that multiplies lambda to allow differential shrinkage. Can be 0 for some variables, which implies no shrinkage, and that variable is always included in the model. Default is 1 for all variables. 
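A minimal sketch of constructing such a vector (the choice of which variables to leave unpenalized is purely illustrative):

```r
p <- 50                           # hypothetical number of predictors
penalty.factor <- rep(1, p)       # default: every coefficient shrunk equally
penalty.factor[c(1, 2)] <- 0      # factors of 0 mean variables 1 and 2 are
                                  # never shrunk and always stay in the model
```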
group.weights  penalty factors applied to each group for the group lasso. Similar to penalty.factor, these multiply the penalty for each group; set to NULL (the default) to use weights based on group size.
standardize  Logical flag for x variable standardization prior to fitting the models.
The coefficients are always returned on the original scale. Default is standardize = TRUE.
intercept  Should intercept(s) be fitted (TRUE, the default) or set to zero (FALSE)?
maxit  integer. Maximum number of OEM iterations 
tol  convergence tolerance for OEM iterations 
irls.maxit  integer. Maximum number of IRLS iterations 
irls.tol  convergence tolerance for IRLS iterations. Only used if family = "binomial"
accelerate  boolean argument. Whether or not to use Nesterov acceleration with adaptive restarting 
ncores  Integer scalar that specifies the number of threads to be used 
compute.loss  should the loss be computed for each estimated tuning parameter? Defaults to FALSE
hessian.type  only for logistic regression. If "full", the full Hessian is used; if "upper.bound" (the default), an upper bound of the Hessian is used instead, which can be substantially faster.
Value

An object with S3 class "oem"
References

Shifeng Xiong, Bin Dai, Jared Huling, and Peter Z. G. Qian. Orthogonalizing EM: A design-based least squares algorithm. Technometrics, 58(3):285-293, 2016. http://amstat.tandfonline.com/doi/abs/10.1080/00401706.2015.1054436
Examples

set.seed(123)
n.obs <- 1e4
n.vars <- 50

true.beta <- c(runif(15, -0.25, 0.25), rep(0, n.vars - 15))

x <- matrix(rnorm(n.obs * n.vars), n.obs, n.vars)
y <- rnorm(n.obs, sd = 3) + x %*% true.beta

fit <- oem(x = x, y = y,
           penalty = c("lasso", "grp.lasso", "sparse.grp.lasso"),
           groups = rep(1:10, each = 5))

layout(matrix(1:3, ncol = 3))
plot(fit)
plot(fit, which.model = 2)
plot(fit, which.model = "sparse.grp.lasso")

# the oem package has support for
# sparse design matrices
library(Matrix)
xs <- rsparsematrix(n.obs * 25, n.vars * 2, density = 0.01)
ys <- rnorm(n.obs * 25, sd = 3) + as.vector(xs %*% c(true.beta, rep(0, n.vars)))
x.dense <- as.matrix(xs)

system.time(fit <- oem(x = x.dense, y = ys,
                       penalty = c("lasso", "grp.lasso"),
                       groups = rep(1:20, each = 5),
                       intercept = FALSE, standardize = FALSE))
#>    user  system elapsed
#>    0.50    0.08    0.60

system.time(fits <- oem(x = xs, y = ys,
                        penalty = c("lasso", "grp.lasso"),
                        groups = rep(1:20, each = 5),
                        intercept = FALSE, standardize = FALSE,
                        lambda = fit$lambda))
#>    user  system elapsed
#>    0.06    0.00    0.06

max(abs(fit$beta[[1]] - fits$beta[[1]]))
#> [1] 1.332268e-15
max(abs(fit$beta[[2]] - fits$beta[[2]]))
#> [1] 1.332268e-15

# logistic
y <- rbinom(n.obs, 1, prob = 1 / (1 + exp(-x %*% true.beta)))

system.time(res <- oem(x, y, intercept = FALSE,
                       penalty = c("lasso", "sparse.grp.lasso", "mcp"),
                       family = "binomial",
                       groups = rep(1:10, each = 5),
                       nlambda = 10,
                       irls.tol = 1e-3, tol = 1e-8))
#>    user  system elapsed
#>    0.16    0.00    0.17

layout(matrix(1:3, ncol = 3))
plot(res)
plot(res, which.model = 2)
plot(res, which.model = "mcp")

# sparse design matrix
xs <- rsparsematrix(n.obs * 2, n.vars, density = 0.01)
x.dense <- as.matrix(xs)
ys <- rbinom(n.obs * 2, 1, prob = 1 / (1 + exp(-x %*% true.beta)))

system.time(res.gr <- oem(x.dense, ys, intercept = FALSE,
                          penalty = "grp.lasso",
                          family = "binomial",
                          nlambda = 10,
                          groups = rep(1:5, each = 10),
                          irls.tol = 1e-3, tol = 1e-8))
#>    user  system elapsed
#>    0.19    0.02    0.21

system.time(res.gr.s <- oem(xs, ys, intercept = FALSE,
                            penalty = "grp.lasso",
                            family = "binomial",
                            nlambda = 10,
                            groups = rep(1:5, each = 10),
                            irls.tol = 1e-3, tol = 1e-8))
#>    user  system elapsed
#>    0.04    0.00    0.07

max(abs(res.gr$beta[[1]] - res.gr.s$beta[[1]]))
#> [1] 1.134368e-05
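The timing gap between the dense and sparse fits above is in large part a storage effect: at 1% density, the dense copy of the design matrix stores every zero explicitly. A quick check with only the Matrix package (independent of oem; the dimensions here are illustrative) makes the difference concrete:

```r
library(Matrix)
set.seed(1)

# A 10,000 x 100 design with roughly 1% nonzero entries
xs <- rsparsematrix(1e4, 100, density = 0.01)
xd <- as.matrix(xs)   # same values, dense storage

# The dense version stores all zeros explicitly and is many times larger,
# so passing the sparse matrix to oem avoids both memory and compute waste
as.numeric(object.size(xd)) / as.numeric(object.size(xs))
```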