Causal Inference with Big Data

(06 Dec 2021–23 Dec 2021)

Organizing Committee






Contact Information

General Enquiries: ims(AT)
Scientific Aspects Enquiries: loh(AT)


Causal inference is the study of quantifying whether a treatment, policy, or intervention, denoted as A, has a causal effect on an outcome of interest, denoted as Y. What distinguishes a causal effect of A on Y from an associative effect of A on Y, say one obtained by computing the correlation between A and Y, is that under a causal effect, intervening on the treatment A leads to changes in the outcome Y. Hence, a causal effect is a stronger notion of a relationship between A and Y than an associative effect.

A central problem in estimating causal effects is dealing with unmeasured confounding, that is, with other potential explanations for the effect of a treatment on the outcome that are not measured in the data. For example, to study the causal effect of education on earnings, all possible factors that may influence both education and earnings must be considered to establish causality. One potential factor of concern is the individual's environment. For example, if an individual grew up in an affluent neighborhood with high-quality schools and many opportunities for employment, the environment may have shaped the individual's desire to pursue more education and to seek high-paying jobs. As such, intervening on education, say by passing education policies that encourage students to go to college, may have minimal to no effect on future earnings, especially if the student's neighborhood environment remains unchanged; instead, implementing policies that improve neighborhoods would lead to increases in both education and earnings. Because quantifying one's living environment is generally difficult, this type of unmeasured confounder remains a major concern in causal inference.
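The distinction above can be made concrete with a small simulation (an illustrative sketch, not part of the programme; it uses only the Python standard library). An unobserved "neighborhood quality" variable U raises both education A and earnings Y, while A itself has no causal effect on Y. The observed association between A and Y is then strongly positive, yet intervening on A leaves Y unchanged.

```python
import random

random.seed(0)

def draw(intervene_a=None):
    """Sample one (A, Y) pair. If intervene_a is given, A is set by intervention."""
    u = random.gauss(0, 1)                                # unmeasured confounder U
    a = u + random.gauss(0, 1) if intervene_a is None else intervene_a
    y = 2.0 * u + random.gauss(0, 1)                      # Y depends on U only, not on A
    return a, y

n = 100_000
obs = [draw() for _ in range(n)]

# Observational association: least-squares slope of Y on A.
ma = sum(a for a, _ in obs) / n
my = sum(y for _, y in obs) / n
slope = (sum((a - ma) * (y - my) for a, y in obs)
         / sum((a - ma) ** 2 for a, _ in obs))            # close to 1, not 0

# Interventional contrast: mean of Y when A is forced to 1 versus 0.
y1 = sum(draw(intervene_a=1)[1] for _ in range(n)) / n
y0 = sum(draw(intervene_a=0)[1] for _ in range(n)) / n    # y1 - y0 is close to 0

print(round(slope, 2), round(y1 - y0, 2))
```

Here the association (the slope) is entirely driven by U, so a policy that changes A alone, the interventional contrast, has no effect on Y.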

The rise of big data has brought renewed hope in dealing with unmeasured confounding. In particular, big data typically provides both rich and large measurements of each individual's characteristics, offering a greater opportunity to measure previously unmeasured confounders. Developing methods that effectively utilize big data to draw causal conclusions is a vigorously growing area of research. Broadly speaking, much of the effort has been devoted to correctly applying machine learning techniques to better parse big data, from an optimization and computational standpoint, and to producing honest causal conclusions, say in the form of adaptive confidence intervals after fitting complex functionals with machine learning techniques.
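As a minimal sketch of the kind of estimator the paragraph alludes to, the code below computes an augmented inverse-probability-weighted (AIPW, doubly robust) estimate of an average treatment effect, together with a normal-approximation 95% confidence interval from its influence values. The setup and all numbers are illustrative assumptions; for simplicity, stratum means on a single binary covariate stand in for the flexible machine-learning fits one would use with genuinely high-dimensional data.

```python
import math
import random

random.seed(1)

# Simulated data: X is a measured confounder, A a binary treatment,
# and Y an outcome with a true treatment effect of 1.0 (by construction).
n = 50_000
data = []
for _ in range(n):
    x = int(random.random() < 0.5)                     # measured confounder
    a = int(random.random() < (0.7 if x else 0.3))     # treatment depends on X
    y = 1.0 * a + 2.0 * x + random.gauss(0, 1)
    data.append((x, a, y))

def mean(vals):
    return sum(vals) / len(vals)

# Nuisance estimates by stratifying on X (stand-ins for ML fits):
# e[x] is the propensity score, m[x, a] the outcome regression.
e = {x: mean([a for xx, a, _ in data if xx == x]) for x in (0, 1)}
m = {(x, a): mean([y for xx, aa, y in data if xx == x and aa == a])
     for x in (0, 1) for a in (0, 1)}

# AIPW influence values, point estimate, and 95% confidence interval.
phi = [m[x, 1] - m[x, 0]
       + a * (y - m[x, 1]) / e[x]
       - (1 - a) * (y - m[x, 0]) / (1 - e[x])
       for x, a, y in data]
tau = mean(phi)
se = math.sqrt(mean([(p - tau) ** 2 for p in phi]) / n)
print(round(tau, 2), (round(tau - 1.96 * se, 3), round(tau + 1.96 * se, 3)))
```

A naive difference in means would be biased upward here (treated units have higher X), whereas the AIPW estimate recovers an effect near the true value of 1.0; with machine-learning nuisance fits, cross-fitting is typically added to keep the confidence interval valid.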


Tutorial on Classification and Regression Trees By Example

Speaker: Wei-Yin Loh, University of Wisconsin-Madison, USA

Duration: Six hours


  • Tutorial on Classification and Regression Trees By Example: 6–15 December 2021
  • Public Lecture on Human Flourishing and Causal Inference: 16 December 2021
  • Workshop Talks by Invited Speakers: 16–23 December 2021







Workshop speakers

  • Ding-Geng Chen (Arizona State University, USA)
  • Peng Ding (University of California, Berkeley, USA)
  • Yen-Tsung Huang (Academia Sinica, Taiwan)
  • Zhichao Jiang (UMass Amherst, USA)
  • Hyunseung Kang (University of Wisconsin–Madison, USA)
  • Edward Kennedy (Carnegie Mellon University, USA)
  • Jialiang Li (National University of Singapore, Singapore)
  • Wei-Yin Loh (University of Wisconsin–Madison, USA)
  • Alex Luedtke (University of Washington, USA)
  • Caleb Miles (Columbia University, USA)
  • Elizabeth Ogburn (Johns Hopkins University, USA)
  • James Robins (Harvard University, USA)
  • Baoluo Sun (National University of Singapore, Singapore)
  • Zhiqiang Tan (Rutgers University, USA)
  • Eric Tchetgen Tchetgen (University of Pennsylvania, USA)
  • Stijn Vansteelandt (Ghent University, Belgium)
  • Wang Miao (Peking University, China)
  • Menggang Yu (University of Wisconsin–Madison, USA)

Watch the video of the public lecture here.

