Past Events › Stochastics and Statistics Seminar Series

The MIT Statistics and Data Science Center hosts guest lecturers from around the world in this weekly seminar.

September 2021

Interpolation and learning with scale dependent kernels

September 17, 2021 @ 11:00 am - 12:00 pm

Lorenzo Rosasco (MIT/Università di Genova)

E18-304

Abstract: We study the learning properties of nonparametric ridge-less least squares. In particular, we consider the common case of estimators defined by scale dependent (Matérn) kernels, and focus on the role of scale and smoothness. These estimators interpolate the data and the scale can be shown to control their stability to noise and sampling. Larger scales, corresponding to smoother functions, improve stability with respect to sampling. However, smaller…

May 2021

Likelihood-Free Frequentist Inference

May 14, 2021 @ 11:00 am - 12:00 pm

Ann Lee (Carnegie Mellon University)

online

Abstract: Many areas of the physical, engineering and biological sciences make extensive use of computer simulators to model complex systems. Confidence sets and hypothesis testing are the hallmarks of statistical inference, but classical methods are poorly suited for scientific applications involving complex simulators without a tractable likelihood. Recently, many techniques have been introduced that learn a surrogate likelihood using forward-simulated data, but these methods do not guarantee frequentist confidence sets and tests with nominal coverage and Type I error control,…

April 2021

Prioritizing genes from genome-wide association studies

April 23, 2021 @ 11:00 am - 12:00 pm

Hilary Finucane (Broad Institute)

online

Abstract: Prioritizing likely causal genes from genome-wide association studies (GWAS) is a fundamental problem. There are many methods for GWAS gene prioritization, including methods that map candidate SNPs to their target genes and methods that leverage patterns of enrichment from across the genome. In this talk, I will introduce a new method for leveraging genome-wide patterns of enrichment to prioritize genes at GWAS loci, incorporating information about genes from many sources. I will then discuss the problem of benchmarking gene prioritization methods,…

Sample Size Considerations in Precision Medicine

April 16, 2021 @ 11:00 am - 12:00 pm

Eric Laber (Duke University)

online

Abstract:  Sequential Multiple Assignment Randomized Trials (SMARTs) are considered the gold standard for estimation and evaluation of treatment regimes. SMARTs are typically sized to ensure sufficient power for a simple comparison, e.g., the comparison of two fixed treatment sequences. Estimation of an optimal treatment regime is conducted as part of a secondary and hypothesis-generating analysis with formal evaluation of the estimated optimal regime deferred to a follow-up trial. However, running a follow-up trial to evaluate an estimated optimal treatment regime…

Function space view of linear multi-channel convolution networks with bounded weight norm

April 9, 2021 @ 11:00 am - 12:00 pm

Suriya Gunasekar (Microsoft Research)

online

Abstract: The magnitude of the weights of a neural network is a fundamental measure of complexity that plays a crucial role in the study of implicit and explicit regularization. For example, in recent work, gradient descent updates in overparameterized models asymptotically lead to solutions that implicitly minimize the $\ell_2$ norm of the parameters of the model, resulting in an inductive bias that is highly architecture dependent. To investigate the properties of learned functions, it is natural to consider a function…

Sampler for the Wasserstein barycenter

April 2, 2021 @ 11:00 am - 12:00 pm

Thibaut Le Gouic (MIT)

online

Abstract: Wasserstein barycenters have become a central object in applied optimal transport as a tool to summarize complex objects that can be represented as distributions. Such objects include posterior distributions in Bayesian statistics, functions in functional data analysis, and images in graphics. In a nutshell, a Wasserstein barycenter is a probability distribution that provides a compelling summary of a finite set of input distributions. While the question of computing Wasserstein barycenters has received significant attention, this talk focuses on a…

March 2021

Testing the I.I.D. assumption online

March 26, 2021 @ 11:00 am - 12:00 pm

Vladimir Vovk (Royal Holloway, University of London)

online

Abstract: Mainstream machine learning, despite its recent successes, has a serious drawback: while its state-of-the-art algorithms often produce excellent predictions, they do not provide measures of their accuracy and reliability that would be both practically useful and provably valid. Conformal prediction adapts rank tests, popular in nonparametric statistics, to testing the IID assumption (the observations being independent and identically distributed). This gives us practical measures, provably valid under the IID assumption, of the accuracy and reliability of predictions produced by…
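
As a rough illustration of the rank-based idea mentioned in this abstract, here is a minimal sketch of a conformal p-value. It is not the speaker's procedure: the function names and the absolute-deviation nonconformity score are assumptions chosen for brevity, and the smoothed p-values and martingales used in online testing are omitted. If the observations are exchangeable (in particular, IID), such a p-value is at most ε with probability at most ε.

```python
def abs_dev_scores(xs):
    """Hypothetical nonconformity score: distance of each point from the sample mean.

    The score treats all observations symmetrically, which the validity
    argument for conformal p-values relies on.
    """
    mean = sum(xs) / len(xs)
    return [abs(x - mean) for x in xs]


def conformal_p_value(scores):
    """Rank of the newest score among all scores, as a fraction.

    scores: nonconformity scores a_1..a_n, with a_n belonging to the newest observation.
    Returns |{i : a_i >= a_n}| / n, so a small value means the newest
    observation looks unusually strange relative to the others.
    """
    newest = scores[-1]
    return sum(1 for a in scores if a >= newest) / len(scores)


if __name__ == "__main__":
    observations = [1.0, 1.1, 0.9, 1.0, 5.0]  # made-up data; the last point is an outlier
    print(conformal_p_value(abs_dev_scores(observations)))
```

A run of small p-values over time would count as evidence against the IID assumption; how to combine them into a valid online test is beyond this sketch.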

Relaxing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via Root-Entropic Regularization

March 19, 2021 @ 11:00 am - 12:00 pm

Daniel Roy (University of Toronto)

online

Abstract:  We consider sequential prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set. We quantify relaxations of the classical i.i.d. assumption in terms of these constraint sets, with i.i.d. sequences at one extreme and adversarial mechanisms at the other. The Hedge algorithm, long known to be minimax optimal in the adversarial regime, was recently shown to be minimax optimal for i.i.d. data. We show that Hedge with deterministic learning rates is suboptimal…
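
For readers unfamiliar with the Hedge algorithm named in this abstract, the following is a minimal sketch of its standard exponential-weights form with a fixed learning rate. The function names, the parameter `eta`, and the toy losses are assumptions for illustration; the root-entropic regularization from the talk title is not implemented here.

```python
import math


def hedge_weights(cumulative_losses, eta):
    """Weights proportional to exp(-eta * cumulative loss), normalized to sum to 1."""
    best = min(cumulative_losses)  # subtract the minimum for numerical stability
    raw = [math.exp(-eta * (c - best)) for c in cumulative_losses]
    total = sum(raw)
    return [r / total for r in raw]


def hedge(loss_rounds, eta):
    """Run Hedge over a sequence of expert loss vectors.

    loss_rounds[t][i] is expert i's loss in round t, assumed to lie in [0, 1].
    eta is a fixed learning rate (an assumption of this sketch).
    Returns (learner's cumulative expected loss, weights after the last round).
    """
    cumulative = [0.0] * len(loss_rounds[0])
    learner_loss = 0.0
    for losses in loss_rounds:
        weights = hedge_weights(cumulative, eta)
        # The learner suffers the weighted average of the experts' losses.
        learner_loss += sum(w * l for w, l in zip(weights, losses))
        cumulative = [c + l for c, l in zip(cumulative, losses)]
    return learner_loss, hedge_weights(cumulative, eta)


if __name__ == "__main__":
    # Toy data: expert 0 is always right, expert 1 is always wrong.
    rounds = [[0.0, 1.0]] * 50
    loss, final_weights = hedge(rounds, eta=1.0)
    print(loss, final_weights)
```

In this toy run the weight on the consistently wrong expert decays geometrically, so the learner's total loss stays bounded while the worse expert's loss grows linearly.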

On nearly assumption-free tests of nominal confidence interval coverage for causal parameters estimated by machine learning

March 12, 2021 @ 11:00 am - 12:00 pm

James Robins (Harvard)

online

Abstract: For many causal effect parameters of interest, doubly robust machine learning (DRML) estimators $\hat{\psi}_1$ are the state-of-the-art, incorporating the good prediction performance of machine learning; the decreased bias of doubly robust estimators; and the analytic tractability and bias reduction of sample splitting with cross fitting. Nonetheless, even in the absence of confounding by unmeasured factors, the nominal $(1-\alpha)$ Wald confidence interval $\hat{\psi}_1 \pm z_{\alpha/2}\,\widehat{\mathrm{se}}[\hat{\psi}_1]$ may still undercover even in large samples, because the bias of $\hat{\psi}_1$ may be of the same…

Detection Thresholds for Distribution-Free Non-Parametric Tests: The Curious Case of Dimension 8

March 5, 2021 @ 11:00 am - 12:00 pm

Bhaswar B. Bhattacharya (University of Pennsylvania, Wharton School)

online

Abstract: Two of the fundamental problems in non-parametric statistical inference are goodness-of-fit and two-sample testing. These two problems have been extensively studied and several multivariate tests have been proposed over the last thirty years, many of which are based on geometric graphs. These include, among several others, the celebrated Friedman-Rafsky two-sample test based on the minimal spanning tree and the K-nearest neighbor graphs, and the Bickel-Breiman spacings tests for goodness-of-fit. These tests are asymptotically distribution-free, universally consistent, and computationally efficient…


© MIT Institute for Data, Systems, and Society | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764 |
      