Massive Models in Low Precision: Power, Limits, and Scaling Laws

Dan Alistarh (ISTA)
E18-304

Abstract: Modern large language models have billions to trillions of parameters, creating enormous computational and memory costs. Quantization, i.e., reducing their numerical precision, is the leading practical mitigation strategy. But how far can we push it, and what do we lose? This talk addresses several sides of this question. First, for post-training quantization, we characterize the accuracy–compression frontier, focusing on large-scale evaluations and new formats. Second, for quantization-aware training, we show that convergence behavior is predicted by representation scaling laws,…
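For readers unfamiliar with the basic operation, the sketch below illustrates symmetric round-to-nearest post-training quantization of a weight tensor to int8, followed by dequantization to measure the introduced error. This is a generic textbook baseline for illustration only, not the specific methods discussed in the talk; the function names are hypothetical.

    import numpy as np

    def quantize_rtn(w, bits=8):
        # Symmetric per-tensor round-to-nearest quantization:
        # map floats to integers in [-2^(bits-1), 2^(bits-1) - 1]
        # using a single scale derived from the max absolute weight.
        qmax = 2 ** (bits - 1) - 1
        scale = np.abs(w).max() / qmax
        q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        # Map integers back to floats; the gap from the original
        # weights is the quantization error PTQ methods try to minimize.
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_rtn(w)
    w_hat = dequantize(q, s)
    print("max abs error:", np.abs(w - w_hat).max())

Practical post-training quantization schemes improve on this baseline with, e.g., per-channel or per-group scales and error-compensating weight updates; the accuracy cost of such choices at scale is part of what the talk examines.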


