The Implicit Geometry of Deep Representations: Insights From Log-Bilinear Softmax Models
Abstract: Training data determines what neural networks can learn—but can we predict the geometry of learned representations directly from data statistics? We present a framework that addresses this question for sufficiently large, well-trained neural networks. The key idea is a coarse but predictive abstraction of such networks as log-bilinear softmax models, whose implicit regularization we can analyze. Within this framework, we show how label imbalance shapes representation geometry and, for language models, how word and context representations organize into structures…
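As a minimal illustration (not the paper's implementation, and with made-up dimensions), a log-bilinear softmax model scores each class by an inner product between a context embedding and a class embedding, then normalizes with a softmax:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_contexts, n_classes = 8, 5, 10   # illustrative sizes only

# Context embeddings U and class (e.g. word) embeddings V.
U = rng.normal(size=(n_contexts, d))
V = rng.normal(size=(n_classes, d))

# Log-bilinear form: logit for (context i, class j) is u_i . v_j.
logits = U @ V.T

# Numerically stable softmax over classes for each context.
probs = np.exp(logits - logits.max(axis=1, keepdims=True))
probs /= probs.sum(axis=1, keepdims=True)

# Each row is a probability distribution over the classes.
assert probs.shape == (n_contexts, n_classes)
assert np.allclose(probs.sum(axis=1), 1.0)
```

The bilinear structure of the logits is what makes the implicit regularization of such models amenable to the geometric analysis the abstract describes.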



