The LaLonde dataset comes from the National Supported Work Study, which sought to evaluate the effect of an employment training program on wage increases.
LaLonde
A data frame with 722 observations and 12 variables:
outcome: whether earnings in 1978 are larger than in 1975; 1 for yes, 0 for no
treat: whether the individual received the treatment; "Yes" or "No"
age: age in years
educ: education in years
black: black or not; factor with levels "Yes" or "No"
hisp: Hispanic or not; factor with levels "Yes" or "No"
white: white or not; factor with levels "Yes" or "No"
marr: married or not; factor with levels "Yes" or "No"
nodegr: no high school degree; factor with levels "Yes" (for no HS degree) or "No"
log.re75: log of earnings in 1975
u75: unemployed in 1975; factor with levels "Yes" or "No"
wts.extrap: extrapolation weights to the 1978 Panel Study of Income Dynamics dataset
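As a rough illustration of the coding described above, the sketch below builds a tiny made-up data frame with a 0/1 outcome and a "Yes"/"No" factor. The earnings vectors `re75` and `re78` and every value in them are hypothetical and are not rows of the real data; the column names `outcome` and `marr` are borrowed from the example code further down.

```r
# Hypothetical raw earnings for four made-up individuals (not real LaLonde rows)
re75 <- c(1200, 0, 5400, 800)
re78 <- c(3000, 0, 4100, 2500)

toy <- data.frame(
  # 1 if 1978 earnings are larger than 1975 earnings, 0 otherwise
  outcome = as.integer(re78 > re75),
  # two-level factor coded "Yes"/"No", as in the variable descriptions
  marr    = factor(c("No", "Yes", "Yes", "No"), levels = c("No", "Yes"))
)

print(toy)
```

Note that equal earnings in both years (the second individual here) give an outcome of 0, because the comparison is strict.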
The National Supported Work Study.
LaLonde, R.J. 1986. "Evaluating the econometric evaluations of training programs with experimental data." American Economic Review, Vol. 76, No. 4, pp. 604-620.
Egami N, Ratkovic M, Imai K (2017). "FindIt: Finding Heterogeneous Treatment Effects." R package version 1.1.2, https://CRAN.R-project.org/package=FindIt.
library(personalized)
data(LaLonde)
y <- LaLonde$outcome
trt <- LaLonde$treat
x.varnames <- c("age", "educ", "black", "hisp", "white",
                "marr", "nodegr", "log.re75", "u75")
# covariates
data.x <- LaLonde[, x.varnames]
# construct design matrix (with no intercept)
x <- model.matrix(~ -1 + ., data = data.x)
# constant propensity score: the observed proportion of treated units
const.propens <- function(x, trt)
{
    mean.trt <- mean(trt == "Trt")
    rep(mean.trt, length(trt))
}
subgrp_fit_w <- fit.subgroup(x = x, y = y, trt = trt,
                             loss = "logistic_loss_lasso",
                             propensity.func = const.propens,
                             cutpoint = 0,
                             type.measure = "auc",
                             nfolds = 10)
summary(subgrp_fit_w)
#> family: binomial
#> loss: logistic_loss_lasso
#> method: weighting
#> cutpoint: 0
#> propensity
#> function: propensity.func
#>
#> benefit score: f(x),
#> Trt recom = Trt*I(f(x)>c)+Ctrl*I(f(x)<=c) where c is 'cutpoint'
#>
#> Average Outcomes:
#>               Recommended Ctrl  Recommended Trt
#> Received Ctrl  0.7292 (n = 48) 0.5146 (n = 377)
#> Received Trt   0.5714 (n = 28) 0.6059 (n = 269)
#>
#> Treatment effects conditional on subgroups:
#> Est of E[Y|T=Ctrl,Recom=Ctrl]-E[Y|T=/=Ctrl,Recom=Ctrl]
#>                                         0.1577 (n = 76)
#>     Est of E[Y|T=Trt,Recom=Trt]-E[Y|T=/=Trt,Recom=Trt]
#>                                        0.0914 (n = 646)
#>
#> NOTE: The above average outcomes are biased estimates of
#> the expected outcomes conditional on subgroups.
#> Use 'validate.subgroup()' to obtain unbiased estimates.
#>
#> ---------------------------------------------------
#>
#> Benefit score quantiles (f(X) for Trt vs Ctrl):
#>      0%     25%     50%     75%    100%
#> -0.2889  0.1305  0.1305  0.1305  0.3843
#>
#> ---------------------------------------------------
#>
#> Summary of individual treatment effects:
#> E[Y|T=Trt, X] - E[Y|T=Ctrl, X]
#>
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
#> -0.14345  0.06514  0.06514  0.06341  0.06514  0.18982
#>
#> ---------------------------------------------------
#>
#> 2 out of 10 interactions selected in total by the lasso (cross validation criterion).
#>
#> The first estimate is the treatment main effect, which is always selected.
#> Any other variables selected represent treatment-covariate interactions.
#>
#>             Trt hispYes marrYes
#> Estimate 0.1305 -0.4194  0.2539
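
The "Treatment effects conditional on subgroups" reported in the summary are differences between cells of the "Average Outcomes" table. The base-R sketch below shows that calculation on a small made-up sample; the vectors `y_toy`, `received`, and `recom` are hypothetical and are not taken from LaLonde. It is only meant to illustrate the arithmetic, not how the package computes these quantities internally.

```r
# Toy data (hypothetical values, not drawn from LaLonde)
y_toy    <- c(1, 1, 1, 0, 0, 0, 1, 1)
received <- c("Ctrl", "Ctrl", "Trt", "Trt", "Ctrl", "Ctrl", "Trt", "Trt")
recom    <- c("Ctrl", "Ctrl", "Ctrl", "Ctrl", "Trt", "Trt", "Trt", "Trt")

# 2 x 2 table of mean outcomes: rows = treatment received, columns = recommended
avg <- tapply(y_toy, list(Received = received, Recommended = recom), mean)
print(avg)

# Among those recommended Ctrl: mean outcome under Ctrl minus mean under Trt
eff_ctrl <- avg["Ctrl", "Ctrl"] - avg["Trt", "Ctrl"]
# Among those recommended Trt: mean outcome under Trt minus mean under Ctrl
eff_trt  <- avg["Trt", "Trt"] - avg["Ctrl", "Trt"]

# The same subtraction applied to the rounded cell means printed in the summary
printed_ctrl <- 0.7292 - 0.5714  # summary reports 0.1577
printed_trt  <- 0.6059 - 0.5146  # summary reports 0.0914
```

The differences computed from the printed cell means (0.1578 and 0.0913) agree with the reported effects up to rounding of the four-decimal table entries.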