The MIT Statistics and Data Science Center hosts guest lecturers from around the world in this weekly seminar.

The Geometry of Particle Collisions: Hidden in Plain Sight

Jesse Thaler (MIT)
E18-304

Abstract: Since the 1960s, particle physicists have developed a variety of data analysis strategies for comparing experimental measurements to theoretical predictions. Despite their numerous successes, these techniques can seem esoteric and ad hoc, even to practitioners in the field. In this talk, I explain how many particle physics analysis tools have a natural geometric interpretation in an emergent "space" of collider events induced by the Wasserstein metric. This in turn suggests new analysis strategies to interpret generic…

The Brownian transport map

Dan Mikulincer (MIT)
E18-304

Abstract: The existence of a transport map from the standard Gaussian leads to succinct representations for potentially complicated measures. Inspired by results from optimal transport, we introduce the Brownian transport map, which pushes forward the Wiener measure to a target measure in a finite-dimensional Euclidean space. Using tools from Itô and Malliavin calculus, we show that the map is Lipschitz in several cases of interest. Specifically, our results apply when the target measure satisfies one of the following: - More log-concave than the Gaussian, recovering…

On the power of Lenstra-Lenstra-Lovasz in noiseless inference

Ilias Zadik (MIT)
E18-304

Abstract: In this talk, we are going to discuss a new polynomial-time algorithmic framework for inference problems, based on the celebrated Lenstra-Lenstra-Lovasz lattice basis reduction algorithm. Perhaps surprisingly, this algorithmic framework is able to successfully bypass multiple suggested notions of “computational hardness for inference” for various noiseless settings. Such settings include 1) sparse regression, where the Overlap Gap Property is present and low-degree methods fail, 2) phase retrieval, where Approximate Message Passing fails, and 3) Gaussian clustering, where the SoS…

Optimal testing for calibration of predictive models

Edgar Dobriban (University of Pennsylvania)
E18-304

Abstract: The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider the problem of detecting mis-calibration of…

Inference on Winners

Isaiah Andrews (Harvard University)
E18-304

Abstract: Many empirical questions concern target parameters selected through optimization. For example, researchers may be interested in the effectiveness of the best policy found in a randomized trial, or the best-performing investment strategy based on historical data. Such settings give rise to a winner's curse, where conventional estimates are biased and conventional confidence intervals are unreliable. This paper develops optimal confidence intervals and median-unbiased estimators that are valid conditional on the target selected and so overcome this winner's curse. If…

Mean-field approximations for high-dimensional Bayesian Regression

Subhabrata Sen (Harvard University)
E18-304

Abstract: Variational approximations provide an attractive computational alternative to MCMC-based strategies for approximating the posterior distribution in Bayesian inference. Despite their popularity in applications, supporting theoretical guarantees are limited, particularly in high-dimensional settings. In the first part of the talk, we will study Bayesian inference in the context of a linear model with product priors, and derive sufficient conditions for the correctness (to leading order) of the naive mean-field approximation. To this end, we will utilize recent advances in the…

The query complexity of certification

Li-Yang Tan (Stanford University)
E18-304

Abstract: We study the problem of certification: given queries to an n-variable boolean function f with certificate complexity k and an input x, output a size-k certificate for f's value on x. This abstractly models a problem of interest in explainable machine learning, where we think of f as a blackbox model whose predictions we seek to explain. For monotone functions, classic algorithms of Valiant and Angluin accomplish this task with n queries to f. Our main result is…

Causal Representation Learning – A Proposal

Caroline Uhler (MIT)
E18-304

Abstract: The development of CRISPR-based assays and small molecule screens holds the promise of engineering precise cell state transitions to move cells from one cell type to another or from a diseased state to a healthy state. The main bottleneck is the huge space of possible perturbations/interventions, where even with the breathtaking technological advances in single-cell biology it will never be possible to experimentally perturb all combinations of thousands of genes or compounds. This important biological problem calls for a…

Learning with Random Features and Kernels: Sharp Asymptotics and Universality Laws

Yue M. Lu (Harvard University)
E18-304

Abstract: Many new random matrix ensembles arise in learning and modern signal processing. As shown in recent studies, the spectral properties of these matrices help answer crucial questions regarding the training and generalization performance of neural networks, and the fundamental limits of high-dimensional signal recovery. As a result, there has been growing interest in precisely understanding the spectra and other asymptotic properties of these matrices. Unlike their classical counterparts, these new random matrices are often highly structured and are the…

Is quantile regression a suitable method to understand tax incentives for charitable giving? Case study from the Canton of Geneva, Switzerland

Giedre Lideikyte Huber and Marta Pittavino (University of Geneva)
E18-304

Abstract: Under current Swiss law, taxpayers can deduct charitable donations from their individual taxable income, subject to a 20% ceiling. This deduction ceiling was increased at the communal and cantonal level from a previous 5% ceiling in 2009. The goal of the reform was to boost charitable giving to non-profit entities. However, the effects of this reform, and more generally of the existing Swiss system of tax deductions for charitable giving, have never been empirically studied. The aim of this work is…

© MIT Institute for Data, Systems, and Society | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764