Rotated Mean-Field Variational Inference and Iterative Gaussianization

Sifan Liu (Duke University)
E18-304

Abstract: Mean-field variational inference (MFVI) approximates a target distribution with a product distribution in the standard coordinate system, offering a scalable approach to Bayesian inference but often severely underestimating uncertainty due to neglected dependence. We show that MFVI can be greatly improved when performed along carefully chosen principal component axes rather than the standard coordinates. The principal components are obtained from a cross-covariance matrix of the target’s score function and identify orthogonal directions that capture the dominant discrepancies between the…
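The underestimation of uncertainty mentioned above can be seen in closed form for a Gaussian target. The following sketch (illustrative only, not from the talk) fits the optimal mean-field product approximation to a correlated 2D Gaussian, where the mean-field marginal variances are the reciprocals of the diagonal of the precision matrix:

```python
import numpy as np

# Illustrative sketch (not from the talk): for a Gaussian target N(0, Sigma),
# the optimal mean-field (product) approximation in the standard coordinates
# has marginal variances 1 / (Sigma^{-1})_{ii}, which underestimate the true
# marginal variances Sigma_{ii} whenever the coordinates are correlated.
Sigma = np.array([[1.0, 0.9],
                  [0.9, 1.0]])          # strongly correlated target
precision = np.linalg.inv(Sigma)
mfvi_var = 1.0 / np.diag(precision)     # optimal mean-field marginal variances
true_var = np.diag(Sigma)               # exact marginal variances
print(true_var, mfvi_var)               # mean-field variances are much smaller
```

Rotating to the target's principal axes before factorizing (as the talk proposes, with axes chosen via a score-function cross-covariance) removes this gap for Gaussian targets, since the rotated coordinates are independent.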

Massive Models in Low Precision: Power, Limits, and Scaling Laws

Dan Alistarh (ISTA)
E18-304

Abstract: Modern large language models have billions to trillions of parameters, creating enormous computational and memory costs. Quantization, i.e., reducing their numerical precision, is the leading practical mitigation strategy. But how far can we push it, and what do we lose? This talk addresses different sides of this question. First, for post-training quantization, we characterize the accuracy–compression frontier, focusing on large-scale evaluations and new formats. Second, for quantization-aware training, we show that convergence behavior is predicted by representation scaling laws,…
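The precision reduction the abstract refers to can be illustrated with the simplest instance of post-training quantization: symmetric round-to-nearest int8 quantization of a weight tensor. This is a minimal sketch, not the talk's method; the random vector stands in for model weights.

```python
import numpy as np

# Hypothetical illustration of post-training quantization: symmetric
# round-to-nearest int8 quantization of a "weight" vector, followed by
# dequantization. The talk studies this kind of precision reduction at
# much larger scale and with more sophisticated formats.
rng = np.random.default_rng(0)
w = rng.normal(size=1024).astype(np.float32)   # stand-in weights
scale = np.abs(w).max() / 127.0                # symmetric scale factor
q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
w_hat = q.astype(np.float32) * scale           # dequantized weights
err = np.abs(w - w_hat).max()                  # bounded by scale / 2
```

Each weight is stored in 8 bits instead of 32, a 4x memory reduction, at the cost of a per-element error of at most half the quantization step.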


MIT Institute for Data, Systems, and Society
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764