When do spectral gradient updates help in deep learning?
February 20, 2026 @ 11:00 am - 12:00 pm
Dmitriy Drusvyatskiy (University of California, San Diego)
E18-304
Abstract: Spectral gradient methods, such as the recently proposed Muon optimizer, are a promising alternative to standard gradient descent for training deep neural networks and transformers. Yet, it remains unclear in which regimes these spectral methods are expected to perform better. In this talk, I will present a simple condition that predicts when a spectral update yields a larger decrease in the loss than a standard gradient step. Informally, this criterion holds when, on the one hand, the gradient of the loss with respect to each parameter block has a nearly uniform spectrum—measured by its nuclear-to-Frobenius ratio—while, on the other hand, the incoming activation matrix has low stable rank. It is this mismatch in the spectral behavior of the gradient and the propagated data that underlies the advantage of spectral updates. Reassuringly, this condition naturally arises in a variety of settings, including random feature models, neural networks, and transformer architectures. I will conclude by showing that these predictions align with empirical results in synthetic regression problems and in small-scale language model training.
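
For concreteness, here is a minimal NumPy sketch (not part of the talk materials) of the two spectral quantities named in the abstract, the nuclear-to-Frobenius ratio of a gradient block and the stable rank of an activation matrix, together with the SVD-based orthogonalized update used by spectral methods such as Muon. The function names, matrix shapes, and the random example are illustrative assumptions, not the speaker's code.

    import numpy as np

    def nuclear_to_frobenius(G):
        # ||G||_* / ||G||_F; at most sqrt(rank(G)), and near that maximum
        # when the singular values are nearly uniform.
        s = np.linalg.svd(G, compute_uv=False)
        return s.sum() / np.linalg.norm(s)

    def stable_rank(A):
        # ||A||_F^2 / ||A||_2^2; a soft notion of rank that is small when
        # a few singular directions dominate the activations.
        s = np.linalg.svd(A, compute_uv=False)
        return float((s ** 2).sum() / s[0] ** 2)

    def spectral_update(G):
        # Replace G by U V^T from its SVD, setting every singular value to 1:
        # the idealized step behind spectral methods such as Muon.
        U, _, Vt = np.linalg.svd(G, full_matrices=False)
        return U @ Vt

    # Illustrative regime from the abstract: a gradient block with a nearly
    # flat spectrum and an activation matrix of low stable rank.
    rng = np.random.default_rng(0)
    G = rng.standard_normal((64, 64))   # random matrix: roughly uniform spectrum
    A = np.outer(rng.standard_normal(64),
                 rng.standard_normal(256))   # rank-1 activations
    print(nuclear_to_frobenius(G))      # large relative to 1 (the max is sqrt(64) = 8)
    print(stable_rank(A))               # essentially 1

In the criterion described above, it is precisely this combination, a large nuclear-to-Frobenius ratio for the gradient together with a small stable rank for the incoming activations, that predicts an advantage for the spectral step over a standard gradient step.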
Bio: Dmitriy Drusvyatskiy received his PhD from Cornell University in 2013, followed by a postdoctoral appointment at the University of Waterloo from 2013 to 2014. He joined the Department of Mathematics at the University of Washington as an Assistant Professor in 2014 and was promoted to Full Professor in 2022. Since 2025, he has been a Professor at the Halıcıoğlu Data Science Institute (HDSI) at UC San Diego. Dmitriy’s research broadly focuses on designing and analyzing algorithms for large-scale optimization problems, primarily motivated by applications in data science. He has received a number of awards, including the Air Force Office of Scientific Research (AFOSR) Young Investigator Program (YIP) Award, an NSF CAREER Award, the SIAG/OPT Best Paper Prize (2023), the Paul Tseng Faculty Fellowship (2022-2026), the INFORMS Optimization Society Young Researcher Prize (2019), and finalist citations for the Tucker Prize (2015) and the Young Researcher Best Paper Prize at ICCOPT (2019).