# Stochastics and Statistics Seminar Series: Past Events

The MIT Statistics and Data Science Center hosts guest lecturers from around the world in this weekly seminar.

## Faster and Simpler Algorithms for List Learning

February 19, 2021 @ 11:00 am - 12:00 pm

Jerry Li (Microsoft Research)

online

Abstract: The goal of list learning is to understand how to learn basic statistics of a dataset when it has been corrupted by an overwhelming fraction of outliers. More formally, one is given a set of points $S$, of which an $\alpha$-fraction $T$ are promised to be well-behaved. The goal is then to output an $O(1/\alpha)$-sized list of candidate means, so that one of these candidates is close to the true mean of the points in $T$.…

## Perfect Simulation for Feynman-Kac Models using Ensemble Rejection Sampling

November 20, 2020 @ 11:00 am - 12:00 pm

Arnaud Doucet (University of Oxford)

online

Abstract: I will introduce Ensemble Rejection Sampling, a scheme for perfect simulation of a class of Feynman-Kac models. In particular, this scheme allows us to sample exactly from the posterior distribution of the latent states of a class of non-linear non-Gaussian state-space models and from the distribution of a class of conditioned random walks. Ensemble Rejection Sampling relies on a high-dimensional proposal distribution built using ensembles of state samples and dynamic programming. Although this algorithm can be interpreted as a…

## Sharp Thresholds for Random Subspaces, and Applications

November 13, 2020 @ 11:00 am - 12:00 pm

Mary Wootters (Stanford University)

online

Abstract: What combinatorial properties are likely to be satisfied by a random subspace over a finite field? For example, is it likely that not too many points lie in any Hamming ball? What about any cube? We show that there is a sharp threshold on the dimension of the subspace at which the answers to these questions change from "extremely likely" to "extremely unlikely," and moreover we give a simple characterization of this threshold for different properties. Our motivation comes…

## Valid hypothesis testing after hierarchical clustering

November 6, 2020 @ 11:00 am - 12:00 pm

Daniela Witten (University of Washington)

online

Abstract: As datasets continue to grow in size, in many settings the focus of data collection has shifted away from testing pre-specified hypotheses, and towards hypothesis generation. Researchers are often interested in performing an exploratory data analysis in order to generate hypotheses, and then testing those hypotheses on the same data; I will refer to this as ‘double dipping’. Unfortunately, double dipping can lead to highly inflated Type 1 errors. In this talk, I will consider the special case of hierarchical…

## Statistical Aspects of Wasserstein Distributionally Robust Optimization Estimators

October 23, 2020 @ 11:00 am - 12:00 pm

Jose Blanchet (Stanford University)

online

Abstract: Wasserstein-based distributionally robust optimization problems are formulated as min-max games in which a statistician chooses a parameter to minimize an expected loss against an adversary (say, nature) which wishes to maximize the loss by choosing an appropriate probability model within a certain non-parametric class. Recently, these formulations have been studied in the context in which the non-parametric class chosen by nature is defined as a Wasserstein-distance neighborhood around the empirical measure. It turns out that by appropriately choosing the…

## Data driven variational models for solving inverse problems

October 16, 2020 @ 11:00 am - 12:00 pm

Carola-Bibiane Schönlieb (University of Cambridge)

online

Abstract: In this talk we discuss the idea of data-driven regularisers for inverse imaging problems. We are in particular interested in the combination of mathematical models and purely data-driven approaches, getting the best from both worlds. In this context we will make a journey from “shallow” learning for computing optimal parameters for variational regularisation models by bilevel optimization to the investigation of different approaches that use deep neural networks for solving inverse imaging problems. Bio: Carola-Bibiane Schönlieb is Professor of…

## On Estimating the Mean of a Random Vector

October 9, 2020 @ 11:00 am - 12:00 pm

Gábor Lugosi (Pompeu Fabra University)

online

Abstract: One of the most basic problems in statistics is the estimation of the mean of a random vector, based on independent observations. This problem has received renewed attention in the last few years, both from statistical and computational points of view. In this talk we review some recent results on the statistical performance of mean estimators that allow heavy tails and adversarial contamination in the data. The basic punchline is that one can construct estimators that, under minimal conditions,…
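As a rough illustration of why heavy tails and contamination make the plain sample mean fragile, the sketch below implements the classical one-dimensional median-of-means estimator. This is a textbook example chosen here for illustration; it is not necessarily one of the estimators covered in the talk, and the function name, block count, and toy data are assumptions.

```python
import random
import statistics


def median_of_means(samples, num_blocks, seed=0):
    """Classical median-of-means estimate of a scalar mean.

    The samples are shuffled, split into `num_blocks` blocks, each block
    is averaged, and the median of the block averages is returned.
    """
    if not 1 <= num_blocks <= len(samples):
        raise ValueError("num_blocks must be between 1 and len(samples)")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # fixed seed only for reproducibility
    # Strided slicing gives num_blocks blocks of nearly equal size.
    block_means = [
        statistics.fmean(shuffled[i::num_blocks]) for i in range(num_blocks)
    ]
    return statistics.median(block_means)


# Toy data (hypothetical): 99 well-behaved points and one gross outlier.
data = [1.0] * 99 + [1e6]
robust_estimate = median_of_means(data, num_blocks=9)
naive_estimate = statistics.fmean(data)
```

A single outlier can corrupt at most one block, so the median of the block averages stays near the bulk of the data while the plain average is dragged far away.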

## Bayesian inverse problems, Gaussian processes, and partial differential equations

October 2, 2020 @ 11:00 am - 12:00 pm

Richard Nickl (University of Cambridge)

online

Abstract: The Bayesian approach to inverse problems has become very popular in the last decade after seminal work by Andrew Stuart (2010) and collaborators. Particularly in non-linear applications with PDEs and when using Gaussian process priors, this can leverage powerful MCMC methodology to tackle difficult high-dimensional and non-convex inference problems. Little is known in terms of rigorous performance guarantees for such algorithms. After laying out the main ideas behind Bayesian inversion, we will discuss recent progress providing both statistical and…

## Separating Estimation from Decision Making in Contextual Bandits

September 25, 2020 @ 11:00 am - 12:00 pm

Dylan Foster (MIT)

online

Abstract: The contextual bandit is a sequential decision making problem in which a learner repeatedly selects an action (e.g., a news article to display) in response to a context (e.g., a user’s profile) and receives a reward, but only for the action they selected. Beyond the classic explore-exploit tradeoff, a fundamental challenge in contextual bandits is to develop algorithms that can leverage flexible function approximation to model similarity between contexts, yet have computational requirements comparable to classical supervised learning tasks…
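The interaction protocol described in the abstract can be sketched as a small simulation. The learner below uses epsilon-greedy exploration with a per-(context, action) running average, a textbook baseline rather than the algorithm presented in the talk. The contexts, actions, and click probabilities are hypothetical values made up for the example.

```python
import random


def run_epsilon_greedy(num_rounds, epsilon=0.1, seed=0):
    """Simulate a toy contextual bandit with an epsilon-greedy learner."""
    rng = random.Random(seed)
    contexts = ["sports_fan", "politics_fan"]        # hypothetical user profiles
    actions = ["sports_article", "politics_article"]  # hypothetical articles
    # Hypothetical click probabilities, unknown to the learner.
    click_prob = {
        ("sports_fan", "sports_article"): 0.9,
        ("sports_fan", "politics_article"): 0.1,
        ("politics_fan", "sports_article"): 0.1,
        ("politics_fan", "politics_article"): 0.9,
    }
    reward_sums = {key: 0.0 for key in click_prob}
    counts = {key: 0 for key in click_prob}

    def estimate(context, action):
        n = counts[(context, action)]
        # Untried actions get an infinite score so each is tried at least once.
        return reward_sums[(context, action)] / n if n else float("inf")

    total_reward = 0.0
    for _ in range(num_rounds):
        context = rng.choice(contexts)  # the environment reveals a context
        if rng.random() < epsilon:
            action = rng.choice(actions)  # explore
        else:
            action = max(actions, key=lambda a: estimate(context, a))  # exploit
        # Bandit feedback: a reward is observed only for the chosen action.
        reward = 1.0 if rng.random() < click_prob[(context, action)] else 0.0
        reward_sums[(context, action)] += reward
        counts[(context, action)] += 1
        total_reward += reward
    return counts, total_reward


counts, total_reward = run_epsilon_greedy(num_rounds=5000)
```

The table of running averages treats every context separately, so nothing learned for one context transfers to another. Replacing that table with a flexible regression model while keeping the computation cheap is the function-approximation challenge the abstract refers to.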

## Causal Inference and Overparameterized Autoencoders in the Light of Drug Repurposing for SARS-CoV-2

September 18, 2020 @ 11:00 am - 12:00 pm

Caroline Uhler (MIT)

online

Abstract: Massive data collection holds the promise of a better understanding of complex phenomena and, ultimately, of better decisions. An exciting opportunity in this regard stems from the growing availability of perturbation / intervention data (drugs, knockouts, overexpression, etc.) in biology. In order to obtain mechanistic insights from such data, a major challenge is the development of a framework that integrates observational and interventional data and allows predicting the effect of yet unseen interventions or transporting the effect of interventions…

© MIT Institute for Data, Systems, and Society | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764 |
