Software
personalized
Estimation and validation methods for subgroup identification and personalized medicine
Designed for the analysis of data where the effect of a treatment or intervention may vary across patients. Works with data from randomized controlled trials or observational studies. Provides fitting and validation of subgroup identification and personalized medicine models under the general subgroup identification framework of Chen et al. (2017).
forestBalance
High-dimensional distributional confounding control with adaptive forest weights
Implements the methodology of De and Huling (2025), constructing weights that minimize a measure of distributional distance between treated and control groups that emphasizes confounding variables. The approach first jointly models the relationship between covariates, outcome, and treatment, then builds a similarity measure that assesses how much two points share a similar confounding structure. The resulting distance emphasizes balance of variables that affect both treatment and outcome, which makes it suited to high-dimensional settings in which not all variables are confounders.
independenceWeights
Confounding control for continuous-valued exposures
Implements the methodology of Huling, Greifer, and Chen (2021), constructing weights that minimize the weighted statistical dependence between a continuous treatment/exposure and a vector of confounders. Because confounding bias is a function of this dependence, these weights mitigate confounding bias directly without requiring a parametric propensity model.
swjm
Stagewise variable selection for joint models of semi-competing risks
Implements a stagewise variable selection framework for estimating equations and deploys it for two high-dimensional semi-competing risks models: the joint frailty model and the joint scale-change model. In addition to the lasso and group lasso penalties, it implements the cooperative lasso penalty, which encourages the same sign for effects of the same covariate across the recurrent and terminal event models.
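To illustrate how a cooperative lasso penalty encourages sign agreement, here is a minimal Python sketch. It is not the swjm implementation or its API: the function names are hypothetical, and the form used (the Euclidean norm of the positive parts plus the norm of the negative parts of each group, with no group weights) is an assumed, simplified version of the penalty. Each group holds the effects of one covariate in the two models.

```python
import math


def cooperative_lasso_penalty(groups, lam=1.0):
    """Cooperative lasso penalty (unweighted form, assumed for illustration).

    Each group is a tuple of coefficients for one covariate, e.g.
    (effect in recurrent-event model, effect in terminal-event model).
    """
    total = 0.0
    for coefs in groups:
        # Euclidean norm of the positive parts of the group
        pos = math.sqrt(sum(max(c, 0.0) ** 2 for c in coefs))
        # Euclidean norm of the negative parts of the group
        neg = math.sqrt(sum(max(-c, 0.0) ** 2 for c in coefs))
        total += pos + neg
    return lam * total


def group_lasso_penalty(groups, lam=1.0):
    """Ordinary (unweighted) group lasso penalty, for comparison."""
    return lam * sum(math.sqrt(sum(c * c for c in coefs)) for coefs in groups)


same_sign = [(3.0, 4.0)]       # both effects positive
opposite_sign = [(3.0, -4.0)]  # effects disagree in sign

print(cooperative_lasso_penalty(same_sign), group_lasso_penalty(same_sign))
print(cooperative_lasso_penalty(opposite_sign), group_lasso_penalty(opposite_sign))
```

When the two effects share a sign, the two penalties coincide. When the signs disagree, the cooperative penalty is larger than the group lasso penalty, so sign-discordant estimates are discouraged.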
hierNest
Penalized regression with hierarchical nested parameterization structure for heterogeneous populations
Implements a high-dimensional regression framework for settings where covariate effects vary by subgroup. The focus is on hierarchically defined subgroups, such as in readmission prediction, where regression models may differ according to the primary diagnosis an individual had in their initial hospitalization. These diagnoses can then be grouped into larger categories, creating a hierarchical structure. Our approach collapses models by covariate, borrowing strength across groups when needed.
oem
Orthogonalizing EM algorithm for penalized regression
Efficient computation for penalized regression models using the Orthogonalizing EM algorithm, designed for tall datasets. Supports lasso, MCP, SCAD, elastic net, group lasso, group MCP/SCAD, and more. Also available as a Julia implementation.
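As a rough illustration of the iteration behind an orthogonalizing-EM approach, here is a minimal pure-Python sketch for the lasso. It is not the oem package's code or interface. The function names, the unscaled objective (1/2)||y - Xb||^2 + lam * ||b||_1, and the choice of d as the trace of X'X (an upper bound on its largest eigenvalue) are all assumptions made for this example.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator S(z, t)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def oem_lasso(X, y, lam, d=None, n_iter=500, tol=1e-10):
    """Lasso via the update b <- S(X'y + (d*I - X'X) b, lam) / d.

    Assumed objective: 0.5 * ||y - X b||^2 + lam * ||b||_1.
    d must be at least the largest eigenvalue of X'X.
    """
    n, p = len(X), len(X[0])
    # Precompute X'X and X'y once; each iteration then costs O(p^2)
    gram = [[sum(X[i][j] * X[i][k] for i in range(n)) for k in range(p)]
            for j in range(p)]
    xty = [sum(X[i][j] * y[i] for i in range(n)) for j in range(p)]
    if d is None:
        # The trace of a positive semidefinite matrix bounds its largest eigenvalue
        d = sum(gram[j][j] for j in range(p))
    beta = [0.0] * p
    for _ in range(n_iter):
        u = [xty[j] + d * beta[j] - sum(gram[j][k] * beta[k] for k in range(p))
             for j in range(p)]
        new_beta = [soft_threshold(u[j], lam) / d for j in range(p)]
        converged = max(abs(a - b) for a, b in zip(new_beta, beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta


# Orthonormal design: the lasso solution is the soft-thresholded response
X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [3.0, -0.5, 1.5]
print([round(b, 6) for b in oem_lasso(X, y, lam=1.0)])
```

Because X'X and X'y are computed once up front, the per-iteration cost does not depend on the number of observations, which is one reason this style of update fits tall data.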
personalized2part
Two-part individualized treatment rules for semi-continuous data
Implements the methodology of Huling, Smith, and Chen (2020) for subgroup identification with semi-continuous outcomes. Uses a two-part (hurdle) framework to jointly model the binary and continuous components of the outcome, yielding a single treatment rule. High-dimensional settings are handled via a cooperative lasso penalty.
hierSDR
Hierarchical sufficient dimension reduction
Semiparametric sufficient dimension reduction for settings where population heterogeneity is defined by binary stratifying factors (e.g., chronic conditions in hospital risk modeling). Dimension reduction conforms to the hierarchical relationships between subpopulations, enabling tailored and interpretable models.
vennLasso
Variable selection for heterogeneous populations
Variable selection for high-dimensional models where population heterogeneity is defined by binary stratifying factors. Yields sparsity patterns that adhere to the hierarchical structure among subpopulations, enabling structured, interpretable variable selection across groups.
groupFusedMulti
Doubly structured variable selection for grouped multivariate outcomes
Penalized estimation for high-dimensional regression with multivariate outcomes that have a natural group structure. Implements the methodology of Huling et al. (2023).
personalizedLong
Fused comparative intervention scoring for long-term interventions
Estimation of individualized intervention rules for long-term treatments whose effects change smoothly over time and vary across a population. Implements the fused comparative intervention scoring methodology for heterogeneous longitudinal intervention effects.
aftiv
Instrumental variable estimation under the semiparametric AFT model
Instrumental variable estimation for time-to-event outcomes under the semiparametric accelerated failure time model.