National Supported Work Study Data — LaLonde • personalized

The LaLonde dataset comes from the National Supported Work Study, which sought to evaluate the effectiveness of an employment training program at increasing wages.

Usage

LaLonde

Format

A data frame with 722 observations and 12 variables:

outcome: whether earnings in 1978 are larger than in 1975; 1 for yes, 0 for no
treat: whether the individual received the treatment; "Trt" or "Ctrl"
age: age in years
educ: education in years
black: black or not; factor with levels "Yes" or "No"
hisp: hispanic or not; factor with levels "Yes" or "No"
white: white or not; factor with levels "Yes" or "No"
marr: married or not; factor with levels "Yes" or "No"
nodegr: No high school degree; factor with levels "Yes" (for no HS degree) or "No"
log.re75: log of earnings in 1975
u75: unemployed in 1975; factor with levels "Yes" or "No"
wts.extrap: extrapolation weights to the 1978 Panel Study of Income Dynamics dataset
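
The variable list above can be checked against the data directly. The following is a minimal sketch, assuming the personalized package is installed; it uses only base R functions plus the column names documented above.

```r
# Assumption: the 'personalized' package is installed, so the data set
# can be loaded without attaching the package
data(LaLonde, package = "personalized")

# dimensions and column types, to compare against the variable list
dim(LaLonde)
str(LaLonde)

# number of subjects in each treatment arm
table(LaLonde$treat)

# share of subjects whose 1978 earnings exceeded their 1975 earnings,
# computed separately for each treatment arm
tapply(LaLonde$outcome, LaLonde$treat, mean)
```

Because outcome is coded 1/0, the mean within each arm is the proportion of subjects whose earnings increased.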

Source

The National Supported Work Study.

References

LaLonde, R.J. 1986. "Evaluating the econometric evaluations of training programs with experimental data." American Economic Review, Vol. 76, No. 4, pp. 604-620.

Egami N, Ratkovic M, Imai K (2017). "FindIt: Finding Heterogeneous Treatment Effects." R package version 1.1.2, https://CRAN.R-project.org/package=FindIt.

Examples

library(personalized)

data(LaLonde)
y <- LaLonde$outcome

trt <- LaLonde$treat

x.varnames <- c("age", "educ", "black", "hisp", "white",
                "marr", "nodegr", "log.re75", "u75")

# covariates
data.x <- LaLonde[, x.varnames]

# construct design matrix (with no intercept)
x <- model.matrix(~ -1 + ., data = data.x)

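# propensity score function for a randomized trial: every subject gets the
# same probability of treatment, estimated by the observed share treated
# (this relies on treat being coded "Trt" for treated subjects, matching
# the "Trt"/"Ctrl" labels in the summary output)
const.propens <- function(x, trt)
{
    mean.trt <- mean(trt == "Trt")
    rep(mean.trt, length(trt))
}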

subgrp_fit_w <- fit.subgroup(x = x, y = y, trt = trt,
    loss = "logistic_loss_lasso",
    propensity.func = const.propens,
    cutpoint = 0,
    type.measure = "auc",
    nfolds = 10)

summary(subgrp_fit_w)
#> family:    binomial 
#> loss:      logistic_loss_lasso 
#> method:    weighting 
#> cutpoint:  0 
#> propensity 
#> function:  propensity.func 
#> 
#> benefit score: f(x), 
#> Trt recom = Trt*I(f(x)>c)+Ctrl*I(f(x)<=c) where c is 'cutpoint'
#> 
#> Average Outcomes:
#>               Recommended Ctrl  Recommended Trt
#> Received Ctrl  0.7292 (n = 48) 0.5146 (n = 377)
#> Received Trt   0.5714 (n = 28) 0.6059 (n = 269)
#> 
#> Treatment effects conditional on subgroups:
#> Est of E[Y|T=Ctrl,Recom=Ctrl]-E[Y|T=/=Ctrl,Recom=Ctrl] 
#>                                        0.1577 (n = 76) 
#>     Est of E[Y|T=Trt,Recom=Trt]-E[Y|T=/=Trt,Recom=Trt] 
#>                                       0.0914 (n = 646) 
#> 
#> NOTE: The above average outcomes are biased estimates of
#>       the expected outcomes conditional on subgroups. 
#>       Use 'validate.subgroup()' to obtain unbiased estimates.
#> 
#> ---------------------------------------------------
#> 
#> Benefit score quantiles (f(X) for Trt vs Ctrl): 
#>      0%     25%     50%     75%    100% 
#> -0.2889  0.1305  0.1305  0.1305  0.3843 
#> 
#> ---------------------------------------------------
#> 
#> Summary of individual treatment effects: 
#> E[Y|T=Trt, X] - E[Y|T=Ctrl, X]
#> 
#>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
#> -0.14345  0.06514  0.06514  0.06341  0.06514  0.18982 
#> 
#> ---------------------------------------------------
#> 
#> 2 out of 10 interactions selected in total by the lasso (cross validation criterion).
#> 
#> The first estimate is the treatment main effect, which is always selected. 
#> Any other variables selected represent treatment-covariate interactions.
#> 
#>             Trt hispYes marrYes
#> Estimate 0.1305 -0.4194  0.2539
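
The NOTE in the summary output recommends validate.subgroup() for less biased estimates of the subgroup-conditional outcomes. The following is a minimal sketch of that step, assuming the personalized package is installed. It repeats the fit from the example so that it runs on its own. The seed, the number of replications (B = 25), and the training fraction (0.75) are arbitrary choices made for illustration, not values taken from the example above, so the numbers it prints will not match the output shown.

```r
library(personalized)

data(LaLonde)
y   <- LaLonde$outcome
trt <- LaLonde$treat

x.varnames <- c("age", "educ", "black", "hisp", "white",
                "marr", "nodegr", "log.re75", "u75")

# design matrix with no intercept, as in the example above
x <- model.matrix(~ -1 + ., data = LaLonde[, x.varnames])

# constant propensity score: the observed share of treated subjects
const.propens <- function(x, trt)
{
    rep(mean(trt == "Trt"), length(trt))
}

set.seed(123)  # arbitrary seed; cross validation folds are random

subgrp_fit_w <- fit.subgroup(x = x, y = y, trt = trt,
    loss = "logistic_loss_lasso",
    propensity.func = const.propens,
    cutpoint = 0,
    type.measure = "auc",
    nfolds = 10)

# repeatedly split the data into training and test portions, refit the
# model on each training portion, and evaluate the resulting subgroups
# on the held-out test portion
validation <- validate.subgroup(subgrp_fit_w,
    B = 25,
    method = "training_test_replication",
    train.fraction = 0.75)

print(validation)
```

The printed result has the same layout as the average outcomes table in the summary, but the estimates are averaged over held-out test portions rather than computed on the data used for fitting. A larger B gives more stable averages at the cost of a longer run time.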