Stochastics & Statistics Seminar – Victor Chernozhukov (MIT)
April 29, 2016 | 11am-12pm | 32-123

Title: Double Machine Learning: Improved Point and Interval Estimation of Treatment and Causal Parameters


Most supervised machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimates of such causal parameters obtained by naively plugging ML estimators into the estimating equations for these parameters can behave very poorly, for example by formally having inferior rates of convergence with respect to the sample size n, caused by regularization bias. Fortunately, this regularization bias can be removed by solving auxiliary prediction problems via ML tools. Specifically, we can form an efficient score for the target low-dimensional parameter by combining auxiliary and main ML predictions. The efficient score may then be used to build an efficient estimator of the target parameter, which typically converges at the fastest possible 1/√n rate and is approximately unbiased and normal, and from which valid confidence intervals for the parameters of interest may be constructed. The resulting method could thus be called a “double ML” method because it relies on estimating primary and auxiliary predictive models. Such double ML estimators achieve the fastest rates of convergence and behave robustly over a broader class of probability distributions than naive “single” ML estimators. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility on accumulated assets.
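
To make the idea of combining main and auxiliary predictions concrete, here is a minimal sketch in Python on simulated data. It is not taken from the talk, and several choices are assumptions made purely for illustration: a partially linear model y = theta*d + g(X) + u with d = m(X) + v, ridge regression as a stand-in for a generic ML learner, two-fold sample splitting for the residuals, and the hypothetical helper names ridge_fit_predict and double_ml_plr. The main prediction is of y given X, the auxiliary prediction is of d given X, and the target parameter is estimated by regressing one set of residuals on the other.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Ridge regression with an unpenalized intercept.

    Stand-in for any ML learner; chosen only to keep the sketch self-contained.
    """
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    p = Xc.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(p), Xc.T @ (y_train - y_mean))
    return y_mean + (X_test - x_mean) @ beta


def double_ml_plr(y, d, X, n_folds=2, alpha=1.0, seed=0):
    """Residual-on-residual estimate of theta in y = theta*d + g(X) + u.

    Assumption of this sketch: nuisance predictions are made out of fold,
    so each residual comes from a model that did not see that observation.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    y_res = np.empty(n)
    d_res = np.empty(n)
    for k in range(n_folds):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        # Main prediction problem: E[y | X].
        y_res[test] = y[test] - ridge_fit_predict(X[train], y[train], X[test], alpha)
        # Auxiliary prediction problem: E[d | X].
        d_res[test] = d[test] - ridge_fit_predict(X[train], d[train], X[test], alpha)
    # Solve the orthogonal score equation mean(d_res * (y_res - theta * d_res)) = 0.
    theta = (d_res @ y_res) / (d_res @ d_res)
    # Heteroskedasticity-robust standard error from the same score.
    eps = y_res - theta * d_res
    se = np.sqrt(np.mean((d_res * eps) ** 2) / np.mean(d_res ** 2) ** 2 / n)
    return theta, se


# Simulated data with a known effect (all values below are illustrative).
rng = np.random.default_rng(42)
n, p, theta_true = 2000, 20, 0.5
X = rng.normal(size=(n, p))
d = X @ rng.normal(size=p) + rng.normal(size=n)                   # treatment
y = theta_true * d + X @ rng.normal(size=p) + rng.normal(size=n)  # outcome

theta_hat, se = double_ml_plr(y, d, X)
ci = (theta_hat - 1.96 * se, theta_hat + 1.96 * se)
print(f"theta_hat = {theta_hat:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In this sketch the interval is the usual normal-approximation interval built from the estimated score variance. Swapping ridge for another learner only requires replacing ridge_fit_predict with a function that has the same fit-then-predict shape.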
