Understanding and Improving the Safety of Frontier Models

Hamed Hassani (University of Pennsylvania)
E18-304

Abstract: (The talk will be self-contained and no background on LLM Safety/Alignment is required.) This talk provides a foundational overview of recent efforts in industry and academia to improve the safety of frontier models, along with open challenges. It will cover (1) principal approaches to designing red-teaming attacks, (2) in-model and out-of-model methods for enhancing safety, and (3) if time permits, the challenge of catastrophic forgetting in post-training and approaches to continual learning.

Bio: Hamed Hassani is currently a senior research scientist…

Besting Good-Turing for probability estimation over large domains

Yihong Wu (Yale University)
E18-304

Abstract: When faced with a small sample from a large universe of possible outcomes, scientists often turn to the venerable Good-Turing estimator. Despite its pedigree, however, this estimator comes with considerable drawbacks, such as the need to hand-tune smoothing parameters and the lack of a precise optimality guarantee. We introduce a tuning-parameter-free estimator that bests Good-Turing in both theory and practice. Our method marries two classic ideas, namely Robbins' empirical Bayes and Kiefer-Wolfowitz's nonparametric maximum likelihood, to learn an implicit…

Formal Models of Language Generation

Jon Kleinberg (Cornell University)
E18-304

Abstract: The emergence of large language models has prompted a surge of interest in theoretical models that might give us insight into both their successes and their shortcomings. We'll give an overview of recent work in this direction, focusing on a surprising line of positive results that shows it is possible to give guarantees for language-generation algorithms even in the absence of any probabilistic assumptions, in a framework known as "language generation in the limit". These results suggest interesting notions…

MIT Institute for Data, Systems, and Society
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764