The MIT Statistics and Data Science Center hosts guest lecturers from around the world in the weekly Statistics and Data Science seminar series (formerly the Stochastics and Statistics Seminars).

Understanding and Improving the Safety of Frontier Models

Hamed Hassani (University of Pennsylvania)
E18-304

Abstract: (The talk will be self-contained and no background on LLM Safety/Alignment is required.) This talk provides a foundational overview of recent efforts in industry and academia to improve the safety of frontier models, along with open challenges. It will cover (1) principal approaches to designing red-teaming attacks, (2) in-model and out-of-model methods for enhancing safety, and (3) if time permits, the challenge of catastrophic forgetting in post-training and approaches to continual learning.

Bio: Hamed Hassani is currently a senior research scientist…

Besting Good-Turing for probability estimation over large domains

Yihong Wu (Yale University)
E18-304

Abstract: When faced with a small sample from a large universe of possible outcomes, scientists often turn to the venerable Good-Turing estimator. Despite its pedigree, however, this estimator comes with considerable drawbacks, such as the need to hand-tune smoothing parameters and the lack of a precise optimality guarantee. We introduce a tuning-parameter-free estimator that bests Good-Turing in both theory and practice. Our method marries two classic ideas, namely Robbins' empirical Bayes and Kiefer-Wolfowitz's nonparametric maximum likelihood, to learn an implicit…

Formal Models of Language Generation

Jon Kleinberg (Cornell University)
E18-304

Abstract: The emergence of large language models has prompted a surge of interest in theoretical models that might give us insight into both their successes and their shortcomings. We'll give an overview of recent work in this direction, focusing on a surprising line of positive results that shows it is possible to give guarantees for language-generation algorithms even in the absence of any probabilistic assumptions, in a framework known as "language generation in the limit". These results suggest interesting notions…

MIT Institute for Data, Systems, and Society
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764