A Mathematical Basis for Moravec’s Paradox, and Some Open Problems

Max Simchowitz (Carnegie Mellon University)
E18-304

Abstract: Moravec’s Paradox observes that AI systems have struggled far more with learning physical action than with symbolic reasoning. Yet recently, AI-driven robotic systems have seen a tremendous increase in capability, reminiscent of the early acceleration in language modeling a few years prior. Using the lens of control-theoretic stability, this talk will demonstrate an exponential separation between natural regimes for learning in the physical world and in discrete/symbolic settings, thereby providing a mathematical basis for Moravec’s…
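The control-theoretic intuition behind such a separation can be illustrated with a toy example (a hypothetical sketch, not taken from the talk): in an unstable linear system, even a tiny one-step model error compounds exponentially over the rollout horizon.

```python
import numpy as np

# Hypothetical illustration: unstable scalar dynamics x_{t+1} = a * x_t
# with |a| > 1. A learned model that is off by epsilon = 1e-3 per step
# accumulates rollout error that grows exponentially in the horizon.
a_true = 1.5
a_learned = 1.5 + 1e-3        # tiny one-step modeling error

x_true, x_pred = 1.0, 1.0
errors = []
for t in range(40):
    x_true *= a_true
    x_pred *= a_learned
    errors.append(abs(x_pred - x_true))

# The gap grows roughly like epsilon * t * a**(t-1): exponential in the
# horizon, even though the per-step error epsilon is minuscule.
print(f"error after 1 step:   {errors[0]:.2e}")
print(f"error after 40 steps: {errors[-1]:.2e}")
```

By contrast, in a discrete/symbolic setting with bounded outputs, a small per-step error rate need not be amplified by the dynamics in this way.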


When do spectral gradient updates help in deep learning?

Dmitriy Drusvyatskiy (University of California, San Diego)
E18-304

Abstract: Spectral gradient methods, such as the recently proposed Muon optimizer, are a promising alternative to standard gradient descent for training deep neural networks and transformers. Yet, it remains unclear in which regimes these spectral methods are expected to perform better. In this talk, I will present a simple condition that predicts when a spectral update yields a larger decrease in the loss than a standard gradient step. Informally, this criterion holds when, on the one hand, the gradient of the…
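To make the contrast concrete, here is a minimal sketch of the two update rules (assumed details for illustration only, not the talk's analysis): a Muon-style spectral update replaces the gradient matrix G with its orthogonal polar factor U Vᵀ from the SVD G = U diag(s) Vᵀ, so all singular values of the step direction are equalized to one before the step is taken.

```python
import numpy as np

def spectral_update(W, G, lr=0.1):
    """Muon-style step: move along the orthogonal polar factor of G."""
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    return W - lr * (U @ Vt)      # all singular values of the direction are 1

def gradient_update(W, G, lr=0.1):
    """Standard gradient descent step for comparison."""
    return W - lr * G

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))
G = rng.standard_normal((4, 3))

W_spec = spectral_update(W, G)
W_grad = gradient_update(W, G)

# The spectral direction U @ Vt has unit singular values, whereas the
# raw gradient G generally has a spread-out spectrum.
U, s, Vt = np.linalg.svd(G, full_matrices=False)
print("singular values of spectral direction:", np.linalg.svd(U @ Vt)[1])
print("singular values of raw gradient:      ", s)
```

Informally, whether this equalization helps depends on how the loss curvature aligns with the gradient's spectrum, which is the kind of condition the talk aims to characterize.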

MIT Institute for Data, Systems, and Society
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764