The LaLonde dataset comes from the National Supported Work Study, which sought to evaluate the effectiveness of an employment training program in raising wages.

LaLonde

Format

A data frame with 722 observations and 12 variables:

outcome

whether earnings in 1978 are larger than in 1975; 1 for yes, 0 for no

treat

whether the individual received the treatment; "Yes" or "No"

age

age in years

educ

education in years

black

black or not; factor with levels "Yes" or "No"

hisp

hispanic or not; factor with levels "Yes" or "No"

white

white or not; factor with levels "Yes" or "No"

marr

married or not; factor with levels "Yes" or "No"

nodegr

no high school degree; factor with levels "Yes" (for no HS degree) or "No"

log.re75

log of earnings in 1975

u75

unemployed in 1975; factor with levels "Yes" or "No"

wts.extrap

extrapolation weights to the 1978 Panel Study of Income Dynamics dataset
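As a rough illustration of how the "Yes"/"No" factors listed above behave once they are expanded into a design matrix without an intercept (as the Examples section does), the sketch below uses a small made-up data frame rather than the real data. The object names `toy` and `x.toy`, the values, and the level order c("No", "Yes") are all assumptions made for illustration.

```r
# Hypothetical toy data; NOT the real LaLonde data
toy <- data.frame(
    age   = c(25, 40, 31),
    black = factor(c("Yes", "No", "No"), levels = c("No", "Yes")),
    hisp  = factor(c("No", "Yes", "No"), levels = c("No", "Yes"))
)

# Same formula as in the Examples: all columns, no intercept
x.toy <- model.matrix(~ -1 + ., data = toy)

# Without an intercept, the first factor keeps an indicator column for
# each level; later factors get a single "Yes" indicator column
colnames(x.toy)
```

Under these assumptions the column names are age, blackNo, blackYes and hispYes, which is the naming pattern behind coefficient labels such as hispYes and marrYes in the summary output.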

Source

The National Supported Work Study.

References

LaLonde, R.J. 1986. "Evaluating the econometric evaluations of training programs with experimental data." American Economic Review, Vol.76, No.4, pp. 604-620.

Egami N, Ratkovic M, Imai K (2017). "FindIt: Finding Heterogeneous Treatment Effects." R package version 1.1.2, https://CRAN.R-project.org/package=FindIt.

Examples

# fit.subgroup() is provided by the personalized package
library(personalized)

data(LaLonde)
y <- LaLonde$outcome

trt <- LaLonde$treat

x.varnames <- c("age", "educ", "black", "hisp", "white",
                "marr", "nodegr", "log.re75", "u75")

# covariates
data.x <- LaLonde[, x.varnames]

# construct design matrix (with no intercept)
x <- model.matrix(~ -1 + ., data = data.x)

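# propensity score function: returns the observed proportion of
# treated individuals as a constant score for every observation
const.propens <- function(x, trt)
{
    mean.trt <- mean(trt == "Trt")
    rep(mean.trt, length(trt))
}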

subgrp_fit_w <- fit.subgroup(x = x, y = y, trt = trt,
    loss = "logistic_loss_lasso",
    propensity.func = const.propens,
    cutpoint = 0,
    type.measure = "auc",
    nfolds = 10)

summary(subgrp_fit_w)
#> family:    binomial 
#> loss:      logistic_loss_lasso 
#> method:    weighting 
#> cutpoint:  0 
#> propensity 
#> function:  propensity.func 
#> 
#> benefit score: f(x), 
#> Trt recom = Trt*I(f(x)>c)+Ctrl*I(f(x)<=c) where c is 'cutpoint'
#> 
#> Average Outcomes:
#>               Recommended Ctrl  Recommended Trt
#> Received Ctrl  0.7292 (n = 48) 0.5146 (n = 377)
#> Received Trt   0.5714 (n = 28) 0.6059 (n = 269)
#> 
#> Treatment effects conditional on subgroups:
#> Est of E[Y|T=Ctrl,Recom=Ctrl]-E[Y|T=/=Ctrl,Recom=Ctrl] 
#>                                        0.1577 (n = 76) 
#>     Est of E[Y|T=Trt,Recom=Trt]-E[Y|T=/=Trt,Recom=Trt] 
#>                                       0.0914 (n = 646) 
#> 
#> NOTE: The above average outcomes are biased estimates of
#>       the expected outcomes conditional on subgroups. 
#>       Use 'validate.subgroup()' to obtain unbiased estimates.
#> 
#> ---------------------------------------------------
#> 
#> Benefit score quantiles (f(X) for Trt vs Ctrl): 
#>      0%     25%     50%     75%    100% 
#> -0.2889  0.1305  0.1305  0.1305  0.3843 
#> 
#> ---------------------------------------------------
#> 
#> Summary of individual treatment effects: 
#> E[Y|T=Trt, X] - E[Y|T=Ctrl, X]
#> 
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -0.14345  0.06514  0.06514  0.06341  0.06514  0.18982 
#> 
#> ---------------------------------------------------
#> 
#> 2 out of 10 interactions selected in total by the lasso (cross validation criterion).
#> 
#> The first estimate is the treatment main effect, which is always selected. 
#> Any other variables selected represent treatment-covariate interactions.
#> 
#>             Trt hispYes marrYes
#> Estimate 0.1305 -0.4194  0.2539
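
The printed coefficients and the recommendation rule in the summary can be combined by hand to see which covariate profiles fall on each side of the cutpoint. The sketch below is only an illustration: the coefficients are copied (already rounded) from the output above, the `profiles` data frame is hypothetical, and the assumption that the benefit score is the treatment main effect plus the selected interaction terms is inferred from the printed benefit score quantiles rather than taken from the package internals.

```r
# Coefficients copied from the printed summary (rounded values)
coefs <- c(Trt = 0.1305, hispYes = -0.4194, marrYes = 0.2539)

# Hypothetical covariate profiles (1 = "Yes", 0 = "No")
profiles <- expand.grid(hispYes = c(0, 1), marrYes = c(0, 1))

# Assumed benefit score: main effect plus selected interactions
profiles$benefit <- coefs[["Trt"]] +
    coefs[["hispYes"]] * profiles$hispYes +
    coefs[["marrYes"]] * profiles$marrYes

# Rule from the summary: recommend Trt if f(x) > cutpoint, else Ctrl
cutpoint <- 0
profiles$recommend <- ifelse(profiles$benefit > cutpoint, "Trt", "Ctrl")

print(profiles)
```

With these rounded coefficients, both profiles with hispYes = 1 score at or below zero (about -0.289 and -0.035) and would be recommended Ctrl, while the other two (about 0.131 and 0.384) would be recommended Trt. The smallest and largest of these scores agree with the printed quantiles up to rounding.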