Publications
- Harvard Data Science Review
The Importance of Being Causal
By: Iavor I Bojinov, Albert Chen and Min Liu
Abstract
Causal inference is the study of how actions, interventions, or treatments affect outcomes of interest. The methods that have received the lion’s share of attention in the data science literature for establishing causation are variations of randomized experiments. Unfortunately, randomized experiments are not always feasible for a variety of reasons, such as an inability to fully control the treatment assignment, high cost, and potential negative impacts. In such settings, statisticians and econometricians have developed methods for extracting causal estimates from observational (i.e., nonexperimental) data. Data scientists’ adoption of observational study methods for causal inference, however, has been rather slow and concentrated on a few specific applications. In this article, we attempt to catalyze interest in this area by providing case studies of how data scientists used observational studies to deliver valuable insights at LinkedIn. These case studies employ a variety of methods, and we highlight some themes and practical considerations. Drawing on our learnings, we then explain how firms can develop an organizational culture that embraces causal inference by investing in three key components: education, automation, and certification.
Keywords
Causal Inference; Observational Studies; Cross-sectional Studies; Panel Studies; Interrupted Time-series; Instrumental Variables
Citation
Bojinov, Iavor I., Albert Chen, and Min Liu. "The Importance of Being Causal." Harvard Data Science Review 2.3 (July 30, 2020).