MIT Stochastics & Statistics Seminar Series: Constantine Caramanis
“Fast algorithms and (other) min-max optimal algorithms for mixed regression”
Abstract: Mixture models represent the superposition of statistical processes and are natural in machine learning and statistics. In mixed regression, the relationship between input and output is given by one of possibly several different (noisy) linear functions. The solution therefore encodes a combinatorial selection problem, and hence computing it is difficult in the worst case. Even in the average case, little is known in the realm of efficient algorithms with strong statistical guarantees. We give general conditions for linear convergence of an EM-like (and hence fast) algorithm for latent-variable problems in high dimensions, and show this implies that for sparse (or low-rank) mixed regression, EM converges linearly, in a neighborhood of the optimal solution, in the high-SNR regime. For the low-SNR regime, we show that a new behavior emerges. Here, we give a convex optimization formulation that provably recovers the true solution, and we provide upper bounds on the recovery errors for both arbitrary noise and stochastic noise settings. We also give matching minimax lower bounds, showing that our algorithm is information-theoretically optimal. Our results represent what is, as far as we know, the only tractable algorithm guaranteeing successful recovery with tight bounds on recovery errors and sample complexity.
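To make the setting concrete, the following is a minimal sketch (not the speaker's algorithm) of EM-style alternating estimation for a balanced two-component mixed linear regression in the high-SNR regime. All parameter values, the synthetic data, and the fixed noise scale `sigma` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic mixed regression data: each response y_i is generated by one of
# two (noisy) linear models; the component label z_i is hidden.
n, d = 400, 5
beta1 = rng.normal(size=d)
beta2 = -beta1                      # well-separated components (high SNR)
X = rng.normal(size=(n, d))
z = rng.integers(0, 2, size=n)      # hidden labels
y = np.where(z == 0, X @ beta1, X @ beta2) + 0.1 * rng.normal(size=n)

def em_mixed_regression(X, y, iters=50, sigma=0.1):
    """EM for a balanced two-component mixed linear regression.

    Assumes equal mixing weights and a known, fixed noise scale sigma
    (a simplification for illustration).
    """
    d = X.shape[1]
    b1, b2 = rng.normal(size=d), rng.normal(size=d)  # random initialization
    for _ in range(iters):
        # E-step: posterior weight that each point belongs to component 1,
        # from the squared residuals under each current model.
        r1 = (y - X @ b1) ** 2
        r2 = (y - X @ b2) ** 2
        logit = np.clip((r1 - r2) / (2 * sigma**2), -50, 50)
        w = 1.0 / (1.0 + np.exp(logit))
        # M-step: weighted least squares for each component.
        s1, s2 = np.sqrt(w), np.sqrt(1.0 - w)
        b1 = np.linalg.lstsq(X * s1[:, None], y * s1, rcond=None)[0]
        b2 = np.linalg.lstsq(X * s2[:, None], y * s2, rcond=None)[0]
    return b1, b2

b1_hat, b2_hat = em_mixed_regression(X, y)
```

Since the components are interchangeable, any accuracy check must compare against the true parameters up to a permutation of the two labels.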
Bio: Constantine Caramanis is an Associate Professor in the ECE department of The University of Texas at Austin. He received a PhD in EECS from MIT and an AB in Mathematics from Harvard University. His current research interests center on decision-making in large-scale complex systems, with a focus on learning, computation, and networks. He received the NSF CAREER award in 2011.