Causal Inference with Big Data

(06 Dec 2021–23 Dec 2021)

Organizing Committee

 

Co-chairs

 

Members

 

Contact Information

General Enquiries: ims(AT)nus.edu.sg
Scientific Aspects Enquiries: loh(AT)stat.wisc.edu

Overview

Causal inference is the study of quantifying whether a treatment, policy, or intervention, denoted by A, has a causal effect on an outcome of interest, denoted by Y. What distinguishes a causal effect of A on Y from an associative effect of A on Y, measured, say, by the correlation between A and Y, is that under a causal effect, intervening on the treatment A leads to changes in the outcome Y. Hence, a causal effect is a stronger notion of a relationship between A and Y than an associative effect.

A central problem in estimating causal effects is dealing with unmeasured confounding, that is, with other potential explanations for the observed relationship between a treatment and the outcome that are not measured in the data. For example, to study the causal effect of education on earnings, all factors that may influence both education and earnings must be considered to establish causality. One potential factor of concern is the individual’s environment. If an individual grew up in an affluent neighborhood with high-quality schools and many opportunities for employment, the environment may have shaped the individual’s desire to pursue more education and to seek high-paying jobs. As such, intervening on education, say by passing education policies that encourage students to go to college, may have minimal to no effect on future earnings, especially if the student’s neighborhood environment remains unchanged; instead, implementing policies that improve neighborhoods would lead to increases in both education and earnings. Because quantifying one’s living environment is generally difficult, this type of unmeasured confounder is a major concern in causal inference.
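The education and earnings example can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a method from the program: the variable names, the linear model, and the coefficients (1.0 and 2.0) are hypothetical choices made only for illustration. Here the environment U drives both education A and earnings Y, while A has no causal effect on Y at all.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Simulate a confounded dataset (all coefficients are illustrative assumptions)."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)            # environment: the confounder
    a = 1.0 * u + rng.normal(size=n)  # education depends on the environment
    y = 2.0 * u + rng.normal(size=n)  # earnings depend on the environment only
    return u, a, y


def slope_on_first_covariate(y, *covariates):
    """Least-squares fit of y on an intercept plus covariates; return the first slope."""
    design = np.column_stack([np.ones_like(y), *covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta[1]


u, a, y = simulate()

# Regressing Y on A alone picks up the association induced by U.
naive = slope_on_first_covariate(y, a)

# Adjusting for U is possible only if the confounder is actually measured.
adjusted = slope_on_first_covariate(y, a, u)

print(f"slope of Y on A, ignoring U:   {naive:.3f}")
print(f"slope of Y on A, adjusting U:  {adjusted:.3f}")
```

Under these assumptions the unadjusted slope is clearly nonzero even though intervening on A would not change Y, while the adjusted slope is close to zero. When U is not recorded in the data, the second regression cannot be run, which is the difficulty described above.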

The rise of big data has brought renewed hope for dealing with unmeasured confounding. In particular, big data typically provides rich and extensive measurements of each individual’s characteristics, offering a greater opportunity to measure confounders that would otherwise go unmeasured. Developing methods that effectively utilize big data to draw causal conclusions is a vigorously growing area of research. Broadly speaking, much of the effort has been devoted to correctly applying machine learning techniques, both to better parse big data from an optimization and computational standpoint and to produce honest causal conclusions, say in the form of adaptive confidence intervals after fitting complex functionals with machine learning techniques.

Activities

Title | Date | Abstract
Tutorial on Regression Tree Methods with Emphasis on Big Data, Missing Values, Propensity Score Estimation, and Causal Inference for Randomized Experiments. Wei-Yin Loh, University of Wisconsin-Madison, USA | 6–15 December 2021 | N/A
Workshop Talks by Invited Speakers | 16–23 December 2021 | N/A

Venue

Online

Registration

Click here to register

 

Tutorial on Regression Tree Methods with Emphasis on Big Data, Missing Values, Propensity Score Estimation, and Causal Inference for Randomized Experiments

Speaker: Wei-Yin Loh, University of Wisconsin-Madison, USA

Duration: Six hours

Please register and indicate your preferred date and time for the lectures via the link above by 31 October 2021.
