Understanding and Improving the Safety of Frontier Models

Hamed Hassani (University of Pennsylvania)
E18-304

Abstract: (The talk will be self-contained and no background on LLM Safety/Alignment is required.) This talk provides a foundational overview of recent efforts in industry and academia to improve the safety of frontier models, along with open challenges. It will cover (1) principal approaches to designing red-teaming attacks, (2) in-model and out-of-model methods for enhancing safety, and (3) if time permits, the challenge of catastrophic forgetting in post-training and approaches to continual learning.

Bio: Hamed Hassani is currently a senior research scientist…

Formal Models of Language Generation

Jon Kleinberg (Cornell University)
E18-304

Abstract: The emergence of large language models has prompted a surge of interest in theoretical models that might give us insight into both their successes and their shortcomings. We'll give an overview of recent work in this direction, focusing on a surprising line of positive results showing that it is possible to give guarantees for language-generation algorithms even in the absence of any probabilistic assumptions, in a framework known as "language generation in the limit". These results suggest interesting notions…


MIT Institute for Data, Systems, and Society
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764