MIT Stochastics & Statistics Seminar Series: Eric Tchetgen Tchetgen
Title: Next Generation Missing Data Methodology
Abstract: Missing data is a reality of empirical sciences and can rarely be prevented entirely. It is often assumed that incomplete data are missing completely at random (MCAR) or missing at random (MAR). When neither MCAR nor MAR holds, missingness is said to be not MAR (NMAR). Under MAR, there are two main approaches to inference: likelihood/Bayesian inference, e.g. EM or MI, and semiparametric approaches such as inverse probability weighting (IPW). In several important settings, likelihood-based inferences suffer the difficulty of relying on modeling assumptions that may conflict with the model of substantive interest, while semiparametric methods clearly avoid this problem by relying on a model of the nonresponse process. In the common setting where missingness patterns are nonmonotone, it is difficult to model the nonresponse process without imposing an assumption stronger than MAR. This gap in the literature is resolved and a novel approach is proposed for modeling a nonmonotone nonresponse process under MAR for use with IPW. Despite this solution, it is argued that MAR is hard to justify on causal grounds when missingness is nonmonotone, and therefore new methods are considered for identification and inference when nonmonotone missing data are NMAR. Throughout, the methods are illustrated using both simulations and empirical examples in HIV research.
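For readers unfamiliar with IPW, the basic idea under MAR can be sketched in a few lines of Python. This is only a minimal illustration on simulated data, not the speaker's method: it covers the simplest case of a single outcome missing given one fully observed binary covariate, not the nonmonotone setting the talk addresses. The names `ipw_mean` and `simulate`, the response probabilities, and the data-generating model are all assumptions made up for the example.

```python
import random


def ipw_mean(x, y, r):
    """Weighted-average (Hajek-style) IPW estimate of E[Y].

    Y is observed only where r == 1. Assumes MAR given a discrete, fully
    observed covariate x, so P(R=1 | X=x) can be estimated by the fraction
    of responders within each level of x.
    """
    # Estimate the response probability within each level of x.
    counts, responders = {}, {}
    for xi, ri in zip(x, r):
        counts[xi] = counts.get(xi, 0) + 1
        responders[xi] = responders.get(xi, 0) + ri
    pi_hat = {k: responders[k] / counts[k] for k in counts}

    # Weight each complete case by the inverse of its estimated
    # response probability, then normalize by the sum of the weights.
    num = den = 0.0
    for xi, yi, ri in zip(x, y, r):
        if ri == 1:
            w = 1.0 / pi_hat[xi]
            num += w * yi
            den += w
    return num / den


def simulate(n=20000, seed=0):
    """Hypothetical data: Y depends on X, and so does the chance Y is seen."""
    rng = random.Random(seed)
    x, y, r = [], [], []
    for _ in range(n):
        xi = 1 if rng.random() < 0.5 else 0
        yi = 2.0 * xi + rng.gauss(0.0, 1.0)  # true E[Y] = 1.0
        p_respond = 0.9 if xi == 0 else 0.3  # MAR: depends on x only
        ri = 1 if rng.random() < p_respond else 0
        x.append(xi)
        y.append(yi if ri == 1 else None)    # None marks a missing outcome
        r.append(ri)
    return x, y, r


x, y, r = simulate()
observed = [yi for yi, ri in zip(y, r) if ri == 1]
complete_case = sum(observed) / len(observed)  # ignores the missingness
ipw = ipw_mean(x, y, r)
print(f"complete-case mean: {complete_case:.3f}")
print(f"IPW mean:           {ipw:.3f}")
```

In this simulated setup the complete-case mean is pulled toward the stratum that responds more often, while the weighted estimate should land near the true mean of 1.0.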
Bio: Eric Tchetgen Tchetgen is a Professor of Biostatistics and Epidemiologic Methods with a joint appointment in the departments of Biostatistics and Epidemiology at the Harvard T.H. Chan School of Public Health. His primary area of interest is semiparametric efficiency theory with application to causal inference, missing data problems, statistical genetics, and mixed model theory. In general, he works on the development of statistical and epidemiologic methods that make efficient use of the information in data collected by scientific investigators, while avoiding unnecessary assumptions about the underlying data-generating mechanism.