BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//IDSS - ECPv6.0.5//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:IDSS
X-ORIGINAL-URL:https://idss.mit.edu
X-WR-CALDESC:Events for IDSS
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20220313T070000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20221106T060000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20221209T110000
DTEND;TZID=America/New_York:20221209T120000
DTSTAMP:20221206T065157Z
CREATED:20220720T170108Z
LAST-MODIFIED:20221130T142056Z
UID:16755-1670583600-1670587200@idss.mit.edu
SUMMARY:High-dimensional limit theorems for Stochastic Gradient Descent: effective dynamics and critical scaling
DESCRIPTION:Abstract:\n\nWe study the scaling limits of stochastic gradient descent (SGD) with constant step-size in the high-dimensional regime. We prove limit theorems for the trajectories of summary statistics (i.e.\, finite-dimensional functions) of SGD as the dimension goes to infinity. Our approach allows one to choose the summary statistics that are tracked\, the initialization\, and the step-size. It yields both ballistic (ODE) and diffusive (SDE) limits\, with the limit depending dramatically on the former choices. Interestingly\, we find a critical scaling regime for the step-size\, below which the effective ballistic dynamics matches gradient flow for the population loss\, but at which a new correction term appears that changes the phase diagram. Around the fixed points of this effective dynamics\, the corresponding diffusive limits can be quite complex and even degenerate. We demonstrate our approach on popular examples\, including estimation for spiked matrix and tensor models and classification via two-layer networks for binary and XOR-type Gaussian mixture models. These examples exhibit surprising phenomena\, including multimodal timescales to convergence as well as convergence to sub-optimal solutions with probability bounded away from zero from random (e.g.\, Gaussian) initializations.\nThis is joint work with Reza Gheissari (Northwestern) and Aukosh Jagannath (Waterloo)\, to appear in NeurIPS 2022 (arXiv:2206.04030).\nBio:\nA specialist in probability theory and its applications\, Gérard Ben Arous arrived at NYU's Courant Institute as a Professor of Mathematics in 2002. He was appointed Director of the Courant Institute and Vice Provost for Science and Engineering Development in September 2011. A native of France\, Professor Ben Arous studied Mathematics at École Normale Supérieure and earned his PhD from the University of Paris VII (1981). He has been a Professor at the University of Paris-Sud (Orsay)\, at École Normale Supérieure\, and more recently at the Swiss Federal Institute of Technology in Lausanne\, where he held the Chair of Stochastic Modeling. He headed the department of Mathematics at Orsay and the departments of Mathematics and Computer Science at École Normale Supérieure.
URL:https://idss.mit.edu/calendar/tbd-36/
LOCATION:E18-304\, United States
CATEGORIES:Stochastics and Statistics Seminar Series
END:VEVENT
END:VCALENDAR