## September 2019

## Data Science and Big Data Analytics: Making Data-Driven Decisions

Online

Developed by 11 MIT faculty members at IDSS, this seven-week course is specially designed for data scientists, business analysts, engineers, and technical managers looking to learn strategies for harnessing data. Offered by MIT xPRO. Course begins September 30, 2019.

## Frontiers of Efficient Neural-Network Learnability

Adam Klivans (University of Texas at Austin)

E18-304

Abstract: What are the most expressive classes of neural networks that can be learned, provably, in polynomial-time in a distribution-free setting? In this talk we give the first efficient algorithm for learning neural networks with two nonlinear layers using tools for solving isotonic regression, a nonconvex (but tractable) optimization problem. If we further assume the distribution is symmetric, we obtain the first efficient algorithm for recovering the parameters of a one-layer convolutional network. These results implicitly make use of a…
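
As background on the optimization tool mentioned above: isotonic regression fits a monotone (non-decreasing) sequence to data, and the classic pool-adjacent-violators algorithm (PAVA) solves the least-squares version exactly. The following is a generic textbook sketch for illustration, not the algorithm from the talk:

```python
def isotonic_regression(y, w=None):
    """Pool Adjacent Violators Algorithm (PAVA): least-squares fit of a
    non-decreasing sequence to y, with optional positive weights w."""
    n = len(y)
    w = [1.0] * n if w is None else list(w)
    # Each block stores [weighted mean, total weight, count of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([yi, wi, 1])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand block means back to a full-length fitted sequence.
    fit = []
    for m, _, c in blocks:
        fit.extend([m] * c)
    return fit
```

For example, the violating pair in `[1, 3, 2, 4]` is pooled to its mean, giving the monotone fit `[1, 2.5, 2.5, 4]`.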

## Power of Experimental Design and Active Learning

Aarti Singh (Carnegie Mellon University)

E18-304

Classical supervised machine learning algorithms focus on the setting where the algorithm has access to a fixed labeled dataset obtained prior to any analysis. In most applications, however, we have control over the data collection process, such as which image labels to obtain, which drug-gene interactions to record, which network routes to probe, which movies to rate, etc. Furthermore, most applications face budget limitations on the number of labels that can be collected. Experimental design and active learning are two…
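
As a toy illustration of the label-budget idea (not taken from the talk): the simplest active-learning heuristic, uncertainty sampling, spends the budget on the pool points the current model is least sure about.

```python
def uncertainty_sampling(probs, budget):
    """Uncertainty sampling: given a model's predicted positive-class
    probabilities for an unlabeled pool, return the indices of the
    `budget` most uncertain points (probability closest to 0.5),
    i.e. the points whose labels we would request next."""
    order = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return order[:budget]
```

With pool probabilities `[0.9, 0.52, 0.1, 0.45]` and a budget of 2, the query set is `[1, 3]`: the two points nearest the decision boundary.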

## Some New Insights On Transfer Learning

Samory Kpotufe (Columbia University)

E18-304

Abstract: The problem of transfer and domain adaptation is ubiquitous in machine learning and concerns situations where predictive technologies, trained on a given source dataset, have to be transferred to a new target domain that is somewhat related. For example, transferring voice recognition trained on American English accents to apply to Scottish accents, with minimal retraining. A first challenge is to understand how to properly model the ‘distance’ between source and target domains, viewed as probability distributions over a feature…
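
One elementary notion of 'distance' between source and target distributions, far simpler than the notions studied in the talk but useful for intuition, is total variation over a discrete feature space. A minimal sketch:

```python
def total_variation(p, q):
    """Total variation distance between two discrete distributions,
    given as dicts mapping outcomes to probabilities: half the L1
    difference. A simple (and often overly strict) measure of the
    gap between a source and a target domain."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)
```

For instance, a source with classes split 50/50 and a target split 90/10 are at total variation distance 0.4.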

## Probabilistic Modeling meets Deep Learning using TensorFlow Probability

Brian Patton (Google AI)

E18-304

IDS.190 - Topics in Bayesian Modeling and Computation. Abstract: TensorFlow Probability provides a toolkit to enable researchers and practitioners to integrate uncertainty with gradient-based deep learning on modern accelerators. In this talk we'll walk through some practical problems addressed using TFP; discuss the high-level interfaces, goals, and principles of the library; and touch on some recent innovations in describing probabilistic graphical models. Time permitting, we may touch on a couple of areas of research interest for the…

## Dynamic Monitoring and Decision Systems (DyMonDS) Framework for Data-Enabled Integration in Complex Electric Energy Systems

Marija Ilic (MIT)

32-155

In this talk, we introduce a unifying Dynamic Monitoring and Decision Systems (DyMonDS) framework that is based on multi-layered modeling for aggregation and minimal coordination of interactions between the layers of complex electric energy systems. Using this approach, distributed control and optimization problems are formulated so that: (1) the low-level decision-makers optimize the cost of local interactions while accounting for their heterogeneous technologies, as well as for their social and risk preferences; and (2) the higher-layer aggregators and coordinators optimize…

## Automated Data Summarization for Scalability in Bayesian Inference

Tamara Broderick (MIT)

E18-304

IDS.190 - Topics in Bayesian Modeling and Computation. Abstract: Many algorithms take prohibitively long to run on modern, large datasets. But even in complex datasets, many data points may be at least partially redundant for some task of interest. So one might instead construct and use a weighted subset of the data (called a "coreset") that is much smaller than the original dataset. Typically, running algorithms on a much smaller dataset will take much less computing time, but…
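
To make the "weighted subset" idea concrete, here is the crudest possible baseline, uniform subsampling with reweighting; real coreset constructions (including the automated ones in this talk) use importance sampling or optimization to obtain approximation guarantees:

```python
import random

def uniform_coreset(data, m, seed=0):
    """Naive 'coreset' baseline: a uniform subsample of m points, each
    weighted n/m so that weighted sums over the subset are unbiased
    estimates of sums over the full dataset of size n. Illustrative
    only; principled coresets choose points and weights carefully."""
    rng = random.Random(seed)
    n = len(data)
    idx = rng.sample(range(n), m)
    weight = n / m
    return [(data[i], weight) for i in idx]
```

By construction the total weight of the subset equals the original dataset size, so downstream algorithms that consume weighted data see the right overall scale.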

## GANs, Optimal Transport, and Implicit Density Estimation

Tengyuan Liang (University of Chicago)

E18-304

Abstract: We first study the rate of convergence for learning distributions with the adversarial framework and Generative Adversarial Networks (GANs), which subsumes Wasserstein, Sobolev, and MMD GANs as special cases. We study a wide range of parametric and nonparametric target distributions, under a collection of objective evaluation metrics. On the nonparametric end, we investigate the minimax optimal rates and fundamental difficulty of the implicit density estimation under the adversarial framework. On the parametric end, we establish a theory for general…
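
For intuition on the optimal-transport side (a standard fact, not a result from the talk): in one dimension the Wasserstein-1 distance between equal-size empirical samples has a closed form via sorting, with no optimization needed.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 (earth mover's) distance between two empirical
    distributions on the real line with equal sample sizes: sort both
    samples and average the absolute differences of matched order
    statistics. This closed form is special to one dimension."""
    assert len(xs) == len(ys)
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) for a, b in zip(xs, ys)) / len(xs)
```

Shifting a sample by a constant shifts the distance by exactly that constant, which is the translation sensitivity that makes Wasserstein metrics attractive for GAN training.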

## May 2019

## Learning for Dynamics and Control (L4DC)

32-123

Over the next decade, the biggest generator of data is expected to be devices that sense and control the physical world. This explosion of real-time data emerging from the physical world requires a rapprochement of areas such as machine learning, control theory, and optimization. While control theory has been firmly rooted in the tradition of model-based design, the availability and scale of data (both temporal and spatial) will require a rethinking of the foundations of our discipline. From a machine…

## Conference on Synthetic Controls and Related Methods

E18-304

Organizers are Alberto Abadie (MIT), Victor Chernozhukov (MIT), and Guido Imbens (Stanford University). The program is posted here. Participation by invitation only.
