Past Events › Stochastics and Statistics Seminar Series

The MIT Statistics and Data Science Center hosts guest lecturers from around the world in this weekly seminar.

October 2021

Breaking the Sample Size Barrier in Reinforcement Learning

October 15, 2021 @ 11:00 am - 12:00 pm

Yuting Wei (Wharton School at UPenn)


Abstract: Reinforcement learning (RL), which is frequently modeled as sequential learning and decision making in the face of uncertainty, has garnered growing interest in recent years due to its remarkable success in practice. In contemporary RL applications, it is increasingly common to encounter environments with prohibitively large state and action spaces, thus imposing stringent requirements on the sample efficiency of the RL algorithms in use. Despite the empirical success, however, the theoretical underpinnings for many popular RL algorithms remain…

Recent results in planted assignment problems

October 8, 2021 @ 11:00 am - 12:00 pm

Yihong Wu (Yale University)


Abstract: Motivated by applications such as particle tracking, network de-anonymization, and computer vision, a recent thread of research is devoted to statistical models of assignment problems, in which the data are random weighted graphs correlated with the latent permutation. In contrast to problems such as planted clique or the stochastic block model, the major difference here is the lack of low-rank structures, which brings forth new challenges in both statistical analysis and algorithm design. In the first half of the…

Causal Matrix Completion

October 1, 2021 @ 11:00 am - 12:00 pm

Devavrat Shah (MIT)


Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are “missing completely at random” (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of “latent confounders”, i.e., unobserved factors that determine both the entries of the underlying matrix and the missingness pattern in the observed matrix. In general, these…

September 2021

Representation and generalization

September 24, 2021 @ 11:00 am - 12:00 pm

Boaz Barak (Harvard University)


Abstract: Self-supervised learning is an increasingly popular approach for learning representations of data that can be used for downstream representation tasks. A practical advantage of self-supervised learning is that it can be used on unlabeled data. However, even when labels are available, self-supervised learning can be competitive with the more "traditional" approach of supervised learning. In this talk, we consider "self-supervised + simple classifier (SSS)" algorithms, which are obtained by first learning a self-supervised classifier on data, and…

Interpolation and learning with scale dependent kernels

September 17, 2021 @ 11:00 am - 12:00 pm

Lorenzo Rosasco (MIT/Università di Genova)


Abstract: We study the learning properties of nonparametric ridge-less least squares. In particular, we consider the common case of estimators defined by scale-dependent (Matérn) kernels, and focus on the role of scale and smoothness. These estimators interpolate the data, and the scale can be shown to control their stability to noise and sampling. Larger scales, corresponding to smoother functions, improve stability with respect to sampling. However, smaller…

May 2021

Likelihood-Free Frequentist Inference

May 14, 2021 @ 11:00 am - 12:00 pm

Ann Lee (Carnegie Mellon University)


Abstract: Many areas of the physical, engineering and biological sciences make extensive use of computer simulators to model complex systems. Confidence sets and hypothesis testing are the hallmarks of statistical inference, but classical methods are poorly suited for scientific applications involving complex simulators without a tractable likelihood. Recently, many techniques have been introduced that learn a surrogate likelihood using forward-simulated data, but these methods do not guarantee frequentist confidence sets and tests with nominal coverage and Type I error control,…

April 2021

Prioritizing genes from genome-wide association studies

April 23, 2021 @ 11:00 am - 12:00 pm

Hilary Finucane (Broad Institute)


Abstract: Prioritizing likely causal genes from genome-wide association studies (GWAS) is a fundamental problem. There are many methods for GWAS gene prioritization, including methods that map candidate SNPs to their target genes and methods that leverage patterns of enrichment from across the genome. In this talk, I will introduce a new method for leveraging genome-wide patterns of enrichment to prioritize genes at GWAS loci, incorporating information about genes from many sources. I will then discuss the problem of benchmarking gene prioritization methods,…

Sample Size Considerations in Precision Medicine

April 16, 2021 @ 11:00 am - 12:00 pm

Eric Laber (Duke University)


Abstract: Sequential Multiple Assignment Randomized Trials (SMARTs) are considered the gold standard for estimation and evaluation of treatment regimes. SMARTs are typically sized to ensure sufficient power for a simple comparison, e.g., the comparison of two fixed treatment sequences. Estimation of an optimal treatment regime is conducted as part of a secondary and hypothesis-generating analysis, with formal evaluation of the estimated optimal regime deferred to a follow-up trial. However, running a follow-up trial to evaluate an estimated optimal treatment regime…

Functions space view of linear multi-channel convolution networks with bounded weight norm

April 9, 2021 @ 11:00 am - 12:00 pm

Suriya Gunasekar (Microsoft Research)


Abstract: The magnitude of the weights of a neural network is a fundamental measure of complexity that plays a crucial role in the study of implicit and explicit regularization. For example, in recent work, gradient descent updates in overparameterized models asymptotically lead to solutions that implicitly minimize the ℓ2 norm of the parameters of the model, resulting in an inductive bias that is highly architecture-dependent. To investigate the properties of learned functions, it is natural to consider a function…

Sampler for the Wasserstein barycenter

April 2, 2021 @ 11:00 am - 12:00 pm

Thibaut Le Gouic (MIT)


Abstract: Wasserstein barycenters have become a central object in applied optimal transport as a tool to summarize complex objects that can be represented as distributions. Such objects include posterior distributions in Bayesian statistics, functions in functional data analysis, and images in graphics. In a nutshell, a Wasserstein barycenter is a probability distribution that provides a compelling summary of a finite set of input distributions. While the question of computing Wasserstein barycenters has received significant attention, this talk focuses on a…

© MIT Institute for Data, Systems, and Society | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764