IDSS Distinguished Seminar Series: Past Events

A monthly lecture series featuring prominent global leaders and academics sharing research in areas that are impacted by the emergence of big data.

December 2019

Automating the Digitization of Historical Data on a Large Scale

December 2, 2019 @ 4:00 pm - 5:00 pm

Melissa Dell (Harvard University)

E18-304

Video: https://youtu.be/mnM7ePr6xqM

Over the past two centuries, we have transitioned from an overwhelmingly agricultural world to one with vastly different patterns of economic organization. This transition has been remarkably uneven across space and time, and has important implications for some of the most central challenges facing societies today. Deepening our understanding of the determinants of economic transformation requires data on the long-run trajectories of individuals and firms. However, these data overwhelmingly remain trapped in hard copy, with cost estimates for manual…

November 2019

Causal Inference in the Age of Big Data

November 4, 2019 @ 4:00 pm - 5:00 pm

Jasjeet Sekhon (UC Berkeley)

E18-304

The rise of massive data sets that provide fine-grained information about human beings and their behavior offers unprecedented opportunities for evaluating the effectiveness of social, behavioral, and medical treatments. With the availability of fine-grained data, researchers and policymakers are increasingly unsatisfied with estimates of average treatment effects based on experimental samples that are unrepresentative of populations of interest. Instead, they seek to target treatments to particular populations and subgroups. Because of these inferential challenges, Machine Learning (ML) is now being…

October 2019

Theoretical Foundations of Active Machine Learning

October 7, 2019 @ 4:00 pm - 5:00 pm

Rob Nowak (University of Wisconsin - Madison)

E18-304

Abstract: The field of Machine Learning (ML) has advanced considerably in recent years, but mostly in well-defined domains using huge amounts of human-labeled training data. Machines can recognize objects in images and translate text, but they must be trained with more images and text than a person can see in nearly a lifetime. The computational complexity of training has been offset by recent technological advances, but the cost of training data is measured in…

September 2019

Selection and Endogenous Bias in Studies of Health Behaviors

September 30, 2019 @ 4:00 pm - 5:00 pm

Emily Oster (Brown University)

E18-304

Abstract: Studies of health behaviors using observational data are prone to bias from selection in behavior choices. How important are these biases? Are they dynamic - that is, are they influenced by the recommendations we make? Are there formal assumptions under which we can use information we have about selection on observed variables to learn about the possible bias from unobserved selection?

About the Speaker: Emily Oster is a professor of economics. Prior to coming to Brown she was an…

May 2019

Design and Analysis of Two-Stage Randomized Experiments

May 7, 2019 @ 4:00 pm - 5:00 pm

Kosuke Imai (Harvard University)

E18-304

Abstract: In many social science experiments, subjects often interact with each other and, as a result, one unit's treatment can influence the outcome of another unit. Over the last decade, significant progress has been made towards causal inference in the presence of such interference between units. In this talk, we will discuss two-stage randomized experiments, which enable the identification of the average spillover effects as well as that of the average direct effect of one's own treatment. In particular,…

April 2019

A Particulate Solution: Data Science in the Fight to Stop Air Pollution and Climate Change | IDSS Distinguished Speaker Seminar

April 2, 2019 @ 4:00 pm - 5:00 pm

Francesca Dominici (Harvard University)

E18-304

Abstract: What if I told you I had evidence of a serious threat to American national security – a terrorist attack in which a jumbo jet will be hijacked and crashed every 12 days. Thousands will continue to die unless we act now. This is the question before us today – but the threat doesn’t come from terrorists. The threat comes from climate change and air pollution. We have developed an artificial neural network model that uses on-the-ground air-monitoring data…

March 2019

A Theory for Representation Learning via Contrastive Objectives

March 5, 2019 @ 4:00 pm - 5:00 pm

Sanjeev Arora (Princeton University)

32-155

Abstract: Representation learning seeks to represent complicated data (images, text, etc.) using a vector embedding, which can then be used to solve complicated new classification tasks using simple methods like a linear classifier. Learning such embeddings is an important type of unsupervised learning (learning from unlabeled data) today. Several recent methods leverage pairs of "semantically similar" data points (e.g., sentences occurring next to each other in a text corpus). We call such methods contrastive learning (another term would be "like…

February 2019

Collective Decision Making: Theory and Experiments

February 5, 2019 @ 4:00 pm - 5:00 pm

Leeat Yariv (Princeton University)

32-155

Abstract: Ranging from jury decisions to political elections, situations in which groups of individuals determine a collective outcome are ubiquitous. There are two important observations that pertain to almost all collective processes observed in reality. First, decisions are commonly preceded by some form of communication among individual decision makers, such as jury deliberations or election polls. Second, even when looking at a particular context, say U.S. civil jurisdiction, there is great variance in the type of institutions that are employed…

December 2018

The Opportunity Atlas: Mapping the Childhood Roots of Social Mobility

December 3, 2018 @ 4:00 pm - 5:00 pm

Raj Chetty (Harvard University)

32-155

Abstract: We construct a publicly available atlas of children’s outcomes in adulthood by Census tract using anonymized longitudinal data covering nearly the entire U.S. population. For each tract, we estimate children’s earnings distributions, incarceration rates, and other outcomes in adulthood by parental income, race, and gender. These estimates allow us to trace the roots of outcomes such as poverty and incarceration back to the neighborhoods in which children grew up. We find that children’s outcomes vary sharply across nearby areas: for children of parents at…

November 2018

The Regression Discontinuity Design: Methods and Applications

November 5, 2018 @ 4:00 pm - 5:00 pm

Rocio Titiunik (University of Michigan)

E18-304

Abstract: The Regression Discontinuity (RD) design is one of the most widely used non-experimental strategies for the study of treatment effects in the social, behavioral, biomedical, and statistical sciences. In this design, units are assigned a score and a treatment is offered if the value of that score exceeds a known threshold, and withheld otherwise. In this talk, I will discuss the assumptions under which the RD design can be used to learn about treatment effects, and how to make valid…

© MIT Institute for Data, Systems, and Society | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764