The MIT Statistics and Data Science Center hosts guest lecturers from around the world in the weekly Statistics and Data Science seminar series (formerly the Stochastics and Statistics Seminars).

Hard-Constrained Neural Networks

Navid Azizan (MIT)
E18-304

Abstract: Incorporating prior knowledge and domain-specific input-output requirements, such as safety or stability, as hard constraints into neural networks is a key enabler for their deployment in high-stakes applications. However, existing methods often rely on soft penalties, which are insufficient, especially on out-of-distribution samples. In this talk, I will introduce hard-constrained neural networks (HardNet), a general framework for enforcing hard, input-dependent constraints by appending a differentiable enforcement layer to any neural network. This approach enables end-to-end training and, crucially, is…

Attention Sinks: A ‘Catch, Tag, Release’ Mechanism for Embeddings

Vardan Papyan (University of Toronto)
E18-304

Abstract: Large language models (LLMs) often concentrate their attention on a small set of tokens—referred to as attention sinks. Common examples include the first token, a prompt-independent sink, and punctuation tokens, which are prompt-dependent. Although these tokens often lack inherent semantic meaning, their presence is critical for model performance, particularly under model compression and KV-caching. Yet, the function, semantic role, and origin of attention sinks—especially those beyond the first token—remain poorly understood. In this talk, I’ll present a comprehensive investigation…

Back to the future – data efficient language modeling

Tatsunori Hashimoto (Stanford University)
E18-304

Abstract: Compute scaling has dominated the conversation with modern language models, leading to an impressive array of algorithms that optimize performance for a given training (and sometimes inference) compute budget. But as compute has grown cheaper and more abundant, data is starting to become a bottleneck, and our ability to exchange computing for data efficiency may be crucial to future model scaling. In this talk, I will discuss some of our recent work on synthetic data and algorithmic approaches to…


MIT Institute for Data, Systems, and Society
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764