R/fit_subgroup_two_part.R
fit_subgroup_2part.Rd
Fits subgroup identification models
fit_subgroup_2part( x, y, trt, propensity.func = NULL, propensity.func.positive = NULL, match.id = NULL, augment.func.zero = NULL, augment.func.positive = NULL, cutpoint = 1, larger.outcome.better = TRUE, penalize.ate = TRUE, y_eps = 1e-06, ... )
x | The design matrix (not including intercept term) |
---|---|
y | The nonnegative response vector |
trt | treatment vector with each element equal to a 0 or a 1, with 1 indicating treatment status is active. |
propensity.func | function that inputs the design matrix x and the treatment vector trt and outputs
the propensity score, ie Pr(trt = 1 | X = x). Function should take two arguments 1) x and 2) trt. See example below.
For a randomized controlled trial this can simply be a function that returns a constant equal to the proportion
of patients assigned to the treatment group, i.e.:
|
propensity.func.positive | function that inputs the design matrix x and the treatment vector trt and outputs
the propensity score for units with positive outcome values, ie Pr(trt = 1 | X = x, Z = 1). Function should take
two arguments 1) x and 2) trt. See example below.
For a randomized controlled trial this can simply be a function that returns a constant equal to the proportion
of patients assigned to the treatment group, i.e.:
|
match.id | a (character, factor, or integer) vector with length equal to the number of observations in Example 1: Example 2: augment.func <- function(x, y, trt) { data <- data.frame(x, y, trt) lmod <- glm(y ~ x * trt, family = binomial()) ## get predictions when trt = 1 data$trt <- 1 preds_1 <- predict(lmod, data, type = "response") ## get predictions when trt = -1 data$trt <- -1 preds_n1 <- predict(lmod, data, type = "response") ## return predictions averaged over trt return(0.5 * (preds_1 + preds_n1)) } |
augment.func.zero | (similar to augment.func.positive) function which inputs the
indicators of whether each response is positive ( |
augment.func.positive | (similar to augment.func.zero) function which inputs the positive part response
(ie all observations in |
cutpoint | numeric value for patients with benefit scores above which
(or below which if |
larger.outcome.better | boolean value of whether a larger outcome is better/preferable. Set to |
penalize.ate | should the treatment main effect (ATE) be penalized too? |
y_eps | positive value above which observations in |
... | options to be passed to |
set.seed(42) dat <- sim_semicontinuous_data(250, n.vars = 15) x <- dat$x y <- dat$y trt <- dat$trt prop_func <- function(x, trt) { propensmod <- glm(trt ~ x, family = binomial()) propens <- unname(fitted(propensmod)) propens } fitted_model <- fit_subgroup_2part(x, y, trt, prop_func, prop_func) fitted_model#> family: 2part #> loss: 2part #> method: #> cutpoint: 1 #> propensity #> function: propensity.func #> #> benefit score: f(x), #> Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint' #> #> Average Outcomes: #> Recommended 0 Recommended 1 #> Received 0 1.5903 (n = 62) 0.1273 (n = 66) #> Received 1 0.1498 (n = 25) 4.4873 (n = 97) #> #> Treatment effects conditional on subgroups: #> Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0] #> 1.4406 (n = 87) #> Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1] #> 4.36 (n = 163) #> #> NOTE: The above average outcomes are biased estimates of #> the expected outcomes conditional on subgroups. #> Use 'validate.subgroup()' to obtain unbiased estimates. #> #> --------------------------------------------------- #> #> Benefit score quantiles (f(X) for 1 vs 0): #> 0% 25% 50% 75% 100% #> 1.706e-04 4.662e-01 5.409e+00 5.855e+01 1.649e+05 #> #> --------------------------------------------------- #> #> Summary of individual treatment effects: #> E[Y|T=1, X] / E[Y|T=0, X] #> #> Note: for survival outcomes, the above ratio is #> E[g(Y)|T=1, X] / E[g(Y)|T=0, X], #> where g() is a monotone increasing function of Y, #> the survival time #> #> Min. 1st Qu. Median Mean 3rd Qu. Max. #> 0.00 0.47 5.41 1057.68 58.55 164909.90## correlation of estimated covariate-conditional risk ratio and truth cor(fitted_model$benefit.scores, dat$treatment_risk_ratio, method = "spearman")#> [,1] #> [1,] 0.07631469