Past Events › Stochastics and Statistics Seminar Series

The MIT Statistics and Data Science Center hosts guest lecturers from around the world in this weekly seminar.

December 2021

The Geometry of Particle Collisions: Hidden in Plain Sight

December 3, 2021 @ 11:00 am - 12:00 pm

Jesse Thaler (MIT)


Abstract: Since the 1960s, particle physicists have developed a variety of data analysis strategies to compare experimental measurements with theoretical predictions. Despite their numerous successes, these techniques can seem esoteric and ad hoc, even to practitioners in the field. In this talk, I explain how many particle physics analysis tools have a natural geometric interpretation in an emergent "space" of collider events induced by the Wasserstein metric. This in turn suggests new analysis strategies to interpret generic…
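
The metric the abstract describes can be illustrated with a toy one-dimensional sketch. Real event-space metrics (such as the Energy Mover's Distance) act on weighted particle clouds in detector geometry; the equal-weight 1-D special case below, where the 1-Wasserstein distance reduces to comparing sorted positions, is only an invented illustration.

```python
# Toy sketch: 1-Wasserstein ("earth mover's") distance between two
# simplified "events", each represented as n equal-energy particles at
# 1-D angular positions.  For equal-size, equal-weight 1-D point clouds,
# W1 is the mean absolute difference of the sorted positions.

def wasserstein_1d(xs, ys):
    """W1 between two equal-size, equal-weight 1-D point clouds."""
    assert len(xs) == len(ys)
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)

# Two hypothetical events: a narrow "jet" vs. the same jet shifted by 0.2.
event_a = [0.10, 0.12, 0.15, 0.20]
event_b = [0.30, 0.32, 0.35, 0.40]

print(wasserstein_1d(event_a, event_b))  # 0.2: every particle moved by 0.2
```

Geometrically, the distance is the minimum total "work" needed to rearrange one event into the other, which is what makes the induced space of events meaningful.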

November 2021

Precise high-dimensional asymptotics for AdaBoost via max-margins & min-norm interpolants

November 19, 2021 @ 11:00 am - 12:00 pm

Pragya Sur (Harvard University)


Abstract: This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the max-min-L1-margin and the min-L1-norm interpolant. In turn, this will characterize the…

Characterizing the Type 1-Type 2 Error Trade-off for SLOPE

November 12, 2021 @ 11:00 am - 12:00 pm

Cynthia Rush (Columbia University)


Abstract: Sorted L1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this talk, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I and type II error. Additionally, we show that on any problem instance, SLOPE with a certain regularization sequence outperforms the Lasso,…
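
The sorted-L1 penalty at the heart of SLOPE can be written down in a few lines: larger coefficients are matched with larger regularization weights. The lambda sequence below is an arbitrary illustrative choice, not the BH-style sequence analyzed in the talk.

```python
# Sketch of the sorted-L1 (SLOPE) penalty:
#   J(beta) = sum_i lambdas[i] * |beta|_(i),
# where |beta|_(1) >= |beta|_(2) >= ... are the coefficient magnitudes
# sorted in decreasing order and lambdas is nonincreasing.

def sorted_l1_penalty(beta, lambdas):
    """Sorted-L1 penalty; lambdas must be nonincreasing, same length as beta."""
    mags = sorted((abs(b) for b in beta), reverse=True)
    return sum(lam * m for lam, m in zip(lambdas, mags))

beta = [0.5, -2.0, 0.0, 1.0]
lams = [1.0, 0.8, 0.6, 0.4]   # nonincreasing weights (illustrative)

print(sorted_l1_penalty(beta, lams))  # 1.0*2.0 + 0.8*1.0 + 0.6*0.5 + 0.4*0.0 = 3.1
```

When all the lambdas are equal, the penalty collapses to an ordinary L1 (Lasso) norm, which is one way to see the Lasso as a special case of SLOPE.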

Asymptotics of learning on dependent and structured random objects

November 5, 2021 @ 11:00 am - 12:00 pm

Morgane Austern (Harvard University)


Abstract: Classical statistical inference relies on numerous tools from probability theory to study the properties of estimators. However, these same tools are often inadequate to study modern machine learning problems that frequently involve structured data (e.g., networks) or complicated dependence structures (e.g., dependent random matrices). In this talk, we extend universal limit theorems beyond the classical setting. Firstly, we consider distributionally "structured" and dependent random objects, i.e., random objects whose distributions are invariant under the action of an amenable group. We…

October 2021

Revealing the simplicity of high-dimensional objects via pathwise analysis

October 29, 2021 @ 11:00 am - 12:00 pm

Ronen Eldan (Weizmann Institute of Science and Princeton)


Abstract: One of the main reasons behind the success of high-dimensional statistics and modern machine learning in taming the curse of dimensionality is that many classes of high-dimensional distributions are surprisingly well-behaved and, when viewed correctly, exhibit a simple structure. This emergent simplicity is in the center of the theory of “high-dimensional phenomena”, and is manifested in principles such as “Gaussian-like behavior” (objects of interest often inherit the properties of the Gaussian measure), “dimension-free behavior” (expressed in inequalities which do…

Instance Dependent PAC Bounds for Bandits and Reinforcement Learning

October 22, 2021 @ 11:00 am - 12:00 pm

Kevin Jamieson (University of Washington)


Abstract: The sample complexity of an interactive learning problem, such as multi-armed bandits or reinforcement learning, is the number of interactions with nature required to output an answer (e.g., a recommended arm or policy) that is approximately close to optimal with high probability. While minimax guarantees can be useful rules of thumb to gauge the difficulty of a problem class, algorithms optimized for this worst-case metric often fail to adapt to “easy” instances where fewer samples suffice. In this talk, I…
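
One classical strategy with instance-dependent sample complexity is successive elimination for best-arm identification: arms whose upper confidence bound falls below the best arm's lower confidence bound are dropped, so instances with large gaps finish after few samples. The arm means, confidence radius, and parameters below are invented for illustration and are not the algorithms from the talk.

```python
import math
import random

# Hedged sketch: successive elimination for Bernoulli best-arm
# identification.  Easy instances (large gaps between arm means) are
# resolved quickly; hard instances (small gaps) require more rounds --
# the hallmark of an instance-dependent guarantee.

def successive_elimination(means, delta=0.1, max_rounds=2000, seed=0):
    rng = random.Random(seed)
    n = len(means)
    alive = list(range(n))
    pulls = [0] * n
    totals = [0.0] * n
    for t in range(1, max_rounds + 1):
        for a in alive:                            # pull every surviving arm once
            totals[a] += rng.random() < means[a]   # Bernoulli reward
            pulls[a] += 1
        rad = math.sqrt(math.log(4 * n * t * t / delta) / (2 * t))
        best_lcb = max(totals[a] / pulls[a] - rad for a in alive)
        alive = [a for a in alive if totals[a] / pulls[a] + rad >= best_lcb]
        if len(alive) == 1:
            return alive[0], t                     # identified arm, rounds used
    return max(alive, key=lambda a: totals[a] / pulls[a]), max_rounds

arm, rounds = successive_elimination([0.9, 0.5, 0.3])
print(arm)  # identifies arm 0 (mean 0.9) well before the round budget
```

Shrinking the gaps (say, means of 0.52 vs. 0.50) makes the same code run far longer before eliminating the suboptimal arm, which is exactly the instance-dependence the abstract contrasts with worst-case minimax bounds.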

Breaking the Sample Size Barrier in Reinforcement Learning

October 15, 2021 @ 11:00 am - 12:00 pm

Yuting Wei (Wharton School at UPenn)


Abstract: Reinforcement learning (RL), frequently modeled as sequential learning and decision making in the face of uncertainty, has garnered growing interest in recent years due to its remarkable success in practice. In contemporary RL applications, it is increasingly common to encounter environments with prohibitively large state and action spaces, imposing stringent requirements on the sample efficiency of the RL algorithms in use. Despite the empirical success, however, the theoretical underpinnings for many popular RL algorithms remain…

Recent results in planted assignment problems

October 8, 2021 @ 11:00 am - 12:00 pm

Yihong Wu (Yale University)


Abstract: Motivated by applications such as particle tracking, network de-anonymization, and computer vision, a recent thread of research is devoted to statistical models of assignment problems, in which the data are randomly weighted graphs correlated through a latent permutation. In contrast to problems such as planted clique or the stochastic block model, the major difference here is the lack of low-rank structure, which brings forth new challenges in both statistical analysis and algorithm design. In the first half of the…

Causal Matrix Completion

October 1, 2021 @ 11:00 am - 12:00 pm

Devavrat Shah (MIT)


Abstract: Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are "missing completely at random" (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of "latent confounders", i.e., unobserved factors that determine both the entries of the underlying matrix and the missingness pattern in the observed matrix. In general, these…
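
The MCAR baseline the abstract starts from admits a very short sketch: observe each entry of a low-rank matrix independently with probability p, inverse-propensity weight the observed entries, and truncate the SVD at the true rank. The sizes, rank, and sampling rate below are invented for illustration; the talk's causal/MNAR setting is considerably more involved.

```python
import numpy as np

# Hedged sketch of matrix completion under MCAR sampling:
# mask a rank-1 matrix uniformly at random, rescale by 1/p so the
# observed matrix is an entrywise-unbiased estimate, then keep only
# the top singular direction.

rng = np.random.default_rng(0)
n, p_obs = 50, 0.7
u = rng.uniform(1, 2, size=n)
v = rng.uniform(1, 2, size=n)
M = np.outer(u, v)                       # rank-1 ground truth

mask = rng.random((n, n)) < p_obs        # MCAR: each entry revealed w.p. p_obs
Y = np.where(mask, M, 0.0) / p_obs       # E[Y] = M entrywise

U, s, Vt = np.linalg.svd(Y)
M_hat = s[0] * np.outer(U[:, 0], Vt[0])  # truncate at the true rank (1)

rel_err = np.linalg.norm(M_hat - M) / np.linalg.norm(M)
print(rel_err)  # well below 1 for this size and sampling rate
```

The 1/p rescaling is exactly where MCAR is baked in: when missingness depends on latent confounders, a single scalar propensity no longer debiases the observations, which motivates the causal formulation of the talk.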

September 2021

Representation and generalization

September 24, 2021 @ 11:00 am - 12:00 pm

Boaz Barak (Harvard University)


Abstract: Self-supervised learning is an increasingly popular approach for learning representations of data that can be used for downstream tasks. A practical advantage of self-supervised learning is that it can be used on unlabeled data. However, even when labels are available, self-supervised learning can be competitive with the more "traditional" approach of supervised learning. In this talk we consider "self-supervised + simple classifier (SSS)" algorithms, which are obtained by first learning a self-supervised representation of the data, and…

© MIT Institute for Data, Systems, and Society | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764 |