This function fits penalized two-part models, with a logistic regression model for the zero part and a gamma regression model for the positive part. Each covariate's pair of effects across the two constituent models is penalized with either a group lasso or a cooperative lasso penalty.

hd2part(
  x,
  z,
  x_s,
  s,
  weights = rep(1, NROW(x)),
  weights_s = rep(1, NROW(x_s)),
  offset = NULL,
  offset_s = NULL,
  penalty = c("grp.lasso", "coop.lasso"),
  penalty_factor = NULL,
  nlambda = 100L,
  lambda_min_ratio = ifelse(n_s < p, 0.05, 0.005),
  lambda = NULL,
  tau = 0,
  opposite_signs = FALSE,
  flip_beta_zero = FALSE,
  intercept_z = FALSE,
  intercept_s = FALSE,
  strongrule = TRUE,
  maxit_irls = 50,
  tol_irls = 1e-05,
  maxit_mm = 500,
  tol_mm = 1e-05,
  balance_likelihoods = TRUE
)

Arguments

x

an n x p matrix of covariates for the zero part data, where each row is an observation and each column is a predictor

z

a length n vector of responses taking values 1 and 0, where 1 indicates the response is positive and 0 indicates the response is zero

x_s

an n_s x p matrix of covariates (which is a submatrix of x) for the positive part data, where each row is an observation and each column is a predictor

s

a length n_s vector of responses taking strictly positive values
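The relationship between x, z, x_s and s can be sketched in base R as follows. This is a minimal illustration that assumes a single semicontinuous outcome `y` observed for all n rows of x; the name `y` and the simulated values are hypothetical and not part of the function's interface.

```r
set.seed(1)
n <- 50
p <- 5
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Hypothetical semicontinuous outcome: exact zeros mixed with positive values
y <- rbinom(n, 1, 0.6) * rgamma(n, shape = 2, rate = 1)

is_pos <- y > 0
z   <- as.integer(is_pos)          # zero part response: 1 if positive, 0 otherwise
x_s <- x[is_pos, , drop = FALSE]   # rows of x whose response is positive
s   <- y[is_pos]                   # strictly positive responses, length n_s

# The positive part data must line up row for row
stopifnot(nrow(x_s) == length(s), sum(z) == length(s), all(s > 0))
```

Using `drop = FALSE` keeps x_s a matrix even if only one observation is positive.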

weights

a length n vector of observation weights for the zero part data

weights_s

a length n_s vector of observation weights for the positive part data

offset

a length n vector of offset terms for the zero part data

offset_s

a length n_s vector of offset terms for the positive part data

penalty

either "grp.lasso" for the group lasso penalty or "coop.lasso" for the cooperative lasso penalty

penalty_factor

a length p vector of penalty adjustment factors, one per covariate. A value of 0 in the jth position means the jth variable is not penalized, and any positive value acts as a multiplicative factor on top of the common penalization amount. The default, NULL, applies a factor of 1 to all variables

nlambda

the number of lambda values. The default is 100.

lambda_min_ratio

the smallest value for lambda, as a fraction of lambda.max, the data-derived largest lambda value. The default depends on the positive part sample size relative to the number of variables: 0.05 if n_s < p and 0.005 otherwise.

lambda

a user-supplied sequence of penalization tuning parameters. By default (NULL), the program automatically chooses a sequence of lambda values based on nlambda and lambda_min_ratio
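A common convention for such a path is a log-spaced grid running from lambda.max down to lambda_min_ratio times lambda.max. The sketch below illustrates that convention in base R. It is an assumption that hd2part builds its grid this way, and `lambda_max` is a made-up placeholder, since the function derives the real value from the data.

```r
n_s <- 40      # illustrative positive part sample size
p   <- 100     # illustrative number of covariates
nlambda <- 100L

# Default rule taken from the usage signature
lambda_min_ratio <- ifelse(n_s < p, 0.05, 0.005)

# Placeholder only: the real lambda.max is computed from the data
lambda_max <- 2.5

# Decreasing, log-spaced sequence from lambda_max to lambda_max * lambda_min_ratio
lambda_seq <- exp(seq(log(lambda_max),
                      log(lambda_max * lambda_min_ratio),
                      length.out = nlambda))

range(lambda_seq)
```

A sequence built this way could be passed through the lambda argument in place of the automatic choice.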

tau

a value between 0 and 1 controlling the sparse group mixing penalty. A value of 0 gives the pure group lasso or cooperative lasso penalty, and a value of 1 gives the lasso

opposite_signs

a logical value indicating whether a covariate's coefficients in the two models should be encouraged to have opposite signs rather than the same sign. Default is FALSE. This argument has no effect for the group lasso penalty.
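To make the difference between the penalties concrete, the following base R sketch evaluates textbook forms of the group lasso and cooperative lasso penalties on one covariate's coefficient pair (zero part, positive part). The helper names `grp_pen`, `coop_pen` and `mix_pen` are hypothetical, the tau mixing formula follows the usual sparse group lasso convention, and any internal scaling used by hd2part is not reproduced here.

```r
# Group lasso: Euclidean norm of the coefficient pair
grp_pen <- function(b) sqrt(sum(b^2))

# Cooperative lasso: norm of the positive parts plus norm of the negative parts
coop_pen <- function(b) sqrt(sum(pmax(b, 0)^2)) + sqrt(sum(pmax(-b, 0)^2))

# Assumed sparse group mixing: tau = 0 is the pure group penalty, tau = 1 the lasso
mix_pen <- function(b, tau, pen = grp_pen) (1 - tau) * pen(b) + tau * sum(abs(b))

same_sign <- c(0.5, 0.5)    # effects agree in sign across the two models
opp_sign  <- c(0.5, -0.5)   # effects disagree in sign

grp_pen(same_sign) == grp_pen(opp_sign)    # group lasso ignores sign agreement
coop_pen(same_sign) < coop_pen(opp_sign)   # coop lasso penalizes disagreement more
mix_pen(same_sign, tau = 1)                # reduces to the L1 norm of the pair
```

One way to picture opposite_signs = TRUE is to apply `coop_pen` after negating one element of the pair, so that disagreeing signs become the cheaper configuration; this is an illustration, not a statement about the implementation.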

flip_beta_zero

should the signs of the parameters for the zero part model be flipped? Defaults to FALSE. Should only be used for good reason

intercept_z

whether or not to include an intercept in the zero part model. Default is FALSE.

intercept_s

whether or not to include an intercept in the positive part model. Default is FALSE.

strongrule

should a strong rule be used? Defaults to TRUE

maxit_irls

maximum number of IRLS iterations

tol_irls

convergence tolerance for IRLS iterations

maxit_mm

maximum number of MM iterations. Note that MM is used within each IRLS iteration, so maxit_mm applies to the convergence of the inner iterations.

tol_mm

convergence tolerance for MM iterations. Note that MM is used within each IRLS iteration, so tol_mm applies to the convergence of the inner iterations.

balance_likelihoods

should the likelihoods be balanced so variables would enter both models at the same value of lambda if the penalty were a lasso penalty? Recommended to keep at the default, TRUE

Examples
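
The following is a minimal sketch of a call on simulated data, not an example taken from the package. The data-generating model, the coefficient values and the object name `fit` are assumptions made for illustration. The call is guarded so the data preparation runs even when hd2part is not available in the session, and the structure of the returned object is inspected rather than assumed.

```r
set.seed(42)
n <- 200
p <- 10
x <- matrix(rnorm(n * p), nrow = n, ncol = p)

# Simulated two-part outcome: the first covariate drives both parts
prob_pos <- plogis(0.5 * x[, 1])                   # P(response > 0)
z  <- rbinom(n, 1, prob_pos)                       # zero part indicator
mu <- exp(0.5 * x[, 1])                            # mean of the positive part
y  <- z * rgamma(n, shape = 2, rate = 2 / mu)      # gamma draws with mean mu

x_s <- x[z == 1, , drop = FALSE]                   # positive part covariates
s   <- y[z == 1]                                   # strictly positive responses

# Leave the first covariate unpenalized, penalize the rest equally
pf <- c(0, rep(1, p - 1))

if (exists("hd2part", mode = "function")) {
  fit <- hd2part(x = x, z = z, x_s = x_s, s = s,
                 penalty = "coop.lasso",
                 penalty_factor = pf,
                 nlambda = 25L)
  str(fit, max.level = 1)   # inspect what is returned instead of assuming it
}
```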