R/augmentation_utils.R
create.propensity.function.Rd
Creates an propensity function that optionally utilizes cross-fitting
create.propensity.function(
crossfit = TRUE,
nfolds.crossfit = 10,
cv.glmnet.args = NULL
)
A logical value indicating whether to use cross-fitting (TRUE
) or not (FALSE
).
Cross-fitting is more computationally intensive, but helps to prevent overfitting, see Chernozhukov, et al. (2018)
An integer specifying the number of folds to use for cross-fitting. Must be greater than 1
A list of NAMED arguments to pass to the cv.glmnet
function. For
example, cv.glmnet.args = list(type.measure = "mse", nfolds = 10)
. See cv.glmnet
and glmnet
for all possible options.
A function which can be passed to the augment.func
argument of the fit.subgroup
function.
Chernozhukov, V., Chetverikov, D., Demirer, M., Duflo, E., Hansen, C., Newey, W., & Robins, J. (2018). Double/debiased machine learning for treatment and structural parameters https://arxiv.org/abs/1608.00060
fit.subgroup
for estimating ITRs and create.propensity.function
for creation of propensity functions
library(personalized)
set.seed(123)
n.obs <- 500
n.vars <- 15
x <- matrix(rnorm(n.obs * n.vars, sd = 3), n.obs, n.vars)
# simulate non-randomized treatment
xbetat <- 0.5 + 0.5 * x[,7] - 0.5 * x[,9]
trt.prob <- exp(xbetat) / (1 + exp(xbetat))
trt01 <- rbinom(n.obs, 1, prob = trt.prob)
trt <- 2 * trt01 - 1
# simulate response
# delta below drives treatment effect heterogeneity
delta <- 2 * (0.5 + x[,2] - x[,3] - x[,11] + x[,1] * x[,12] )
xbeta <- x[,1] + x[,11] - 2 * x[,12]^2 + x[,13] + 0.5 * x[,15] ^ 2
xbeta <- xbeta + delta * trt
# continuous outcomes
y <- drop(xbeta) + rnorm(n.obs, sd = 2)
aug.func <- create.augmentation.function(family = "gaussian",
crossfit = TRUE,
nfolds.crossfit = 10,
cv.glmnet.args = list(type.measure = "mae",
nfolds = 5))
prop.func <- create.propensity.function(crossfit = TRUE,
nfolds.crossfit = 10,
cv.glmnet.args = list(type.measure = "mae",
nfolds = 5))
subgrp.model <- fit.subgroup(x = x, y = y,
trt = trt01,
propensity.func = prop.func,
augment.func = aug.func,
loss = "sq_loss_lasso",
nfolds = 10) # option for cv.glmnet (for ITR estimation)
summary(subgrp.model)
#> family: gaussian
#> loss: sq_loss_lasso
#> method: weighting
#> cutpoint: 0
#> augmentation
#> function: augment.func
#> propensity
#> function: propensity.func
#>
#> benefit score: f(x),
#> Trt recom = 1*I(f(x)>c)+0*I(f(x)<=c) where c is 'cutpoint'
#>
#> Average Outcomes:
#> Recommended 0 Recommended 1
#> Received 0 -7.1071 (n = 84) -28.9549 (n = 117)
#> Received 1 -26.749 (n = 112) -3.0901 (n = 187)
#>
#> Treatment effects conditional on subgroups:
#> Est of E[Y|T=0,Recom=0]-E[Y|T=/=0,Recom=0]
#> 19.6419 (n = 196)
#> Est of E[Y|T=1,Recom=1]-E[Y|T=/=1,Recom=1]
#> 25.8647 (n = 304)
#>
#> NOTE: The above average outcomes are biased estimates of
#> the expected outcomes conditional on subgroups.
#> Use 'validate.subgroup()' to obtain unbiased estimates.
#>
#> ---------------------------------------------------
#>
#> Benefit score quantiles (f(X) for 1 vs 0):
#> 0% 25% 50% 75% 100%
#> -34.541 -5.529 3.090 11.518 42.140
#>
#> ---------------------------------------------------
#>
#> Summary of individual treatment effects:
#> E[Y|T=1, X] - E[Y|T=0, X]
#>
#> Min. 1st Qu. Median Mean 3rd Qu. Max.
#> -69.082 -11.058 6.181 6.974 23.036 84.280
#>
#> ---------------------------------------------------
#>
#> 10 out of 15 interactions selected in total by the lasso (cross validation criterion).
#>
#> The first estimate is the treatment main effect, which is always selected.
#> Any other variables selected represent treatment-covariate interactions.
#>
#> Trt1 V2 V3 V5 V6 V8 V9 V10 V11
#> Estimate 3.6115 2.3216 -2.4331 -0.0386 0.8201 1.2297 -0.3421 0.8635 -1.0772
#> V13 V15
#> Estimate 1.5338 1.2956