Mathematics Colloquium: Matching Methods for Observational Data with Small Group Sizes and Missing Covariates
2019-09-27
4:10pm SPARK 212
Juanjuan Fan
To derive unbiased inference from observational data, matching methods are often applied to produce treatment groups that are balanced in terms of relevant background variables. Although many matching algorithms exist in the literature, most require a large control reservoir and cannot deal with missing data. A random forest, which averages the outcomes of many decision trees, is nonparametric in nature, can deal with missing data in the tree-building process, and can produce more accurate and less model-dependent estimates of propensity scores, as well as a proximity matrix. In this study, iterative matching algorithms are developed to form balanced samples when the sample sizes of both groups are limited. An R package implementing the proposed methods has also been developed. The proposed methods are applied to two data sets, arising from studies of autism spectrum disorder (ASD) and student success.
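
For readers unfamiliar with matching on propensity scores, the basic idea can be sketched as follows. This is a generic greedy 1:1 nearest-neighbor matcher, not the iterative algorithms or the R package presented in the talk. The function name `greedy_match`, the `caliper` parameter, and the scores are hypothetical, made up for illustration; in practice the scores would come from a fitted model such as a random forest.

```python
def greedy_match(treated, control, caliper=0.1):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, control: dicts mapping a unit id to its propensity score.
    caliper: largest score difference accepted for a match (an assumed
             illustrative value, not one taken from the talk).
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    # Visit treated units from highest to lowest score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # match without replacement
    return pairs


# Made-up scores for illustration only.
treated = {"t1": 0.80, "t2": 0.55, "t3": 0.30}
control = {"c1": 0.78, "c2": 0.50, "c3": 0.52, "c4": 0.05}
pairs = greedy_match(treated, control, caliper=0.1)
print(pairs)
```

In this toy input, one treated unit has no control within the caliper and is left unmatched, which illustrates why a small control reservoir makes balanced matching difficult.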