Causal Inference and Propensity Score Methods (WK87)
(ONLINE COURSE)
In this course students gain an overview of the statistical techniques and research designs used by epidemiologists to estimate treatment effects from patient data and learn how to apply these techniques.
Course details
Date: 13, 14, 17, 18 January 2022 (online course)
Tuition fee: €1,000
City: Online in 2022
Course coordinator: L.T. (Thomas) Klausch, PhD
Language: English
Learning method: Lectures and Practicals
Examination: Written Exam
Examination dates: See page Exams
Number of EC: 2
About the course
Because of the unpredictable situation regarding Covid, we have decided to offer the course online in January 2022. This is because there is a lot of interest from foreign students and it is not yet known whether students will be able to travel freely.
If a course is [Full], you can still register, but you will be placed on a waiting list. We will contact you as soon as a place becomes available. At that time you can still decide whether you want to participate in the course.
More information
We begin by introducing the Neyman-Rubin causal model (RCM), also called the potential outcomes framework, which postulates that each patient has as many potential outcomes as there are treatment options. Within this framework it is possible to define the types of treatment effects that we often want to estimate in epidemiological studies, such as average causal effects. It is then explained how completely randomized experiments (randomized controlled trials, RCTs) can be used to estimate these causal effects. Students gain insight into how, in an RCT design with a treatment and a control group, only one potential outcome per patient is observed, while the other potential outcome is missing. We will then see that, due to an ingenious property of RCTs called exchangeability, simple mean differences between treatment and control groups are unbiased estimates of average causal effects. Students gain awareness that this property is the central reason for the pivotal importance of RCTs for estimating causal effects.
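The exchangeability argument can be illustrated with a small, purely hypothetical table of potential outcomes. The sketch below is written in Python for illustration only (the course tutorials use R). It enumerates every way of assigning three of six made-up patients to treatment and checks that the simple mean difference, averaged over all equally likely assignments, matches the average causal effect. The patient values and function names are assumptions for this example, not course material.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical potential outcomes (y0, y1) for six patients. In real data
# only one of the two values is ever observed for each patient.
POTENTIAL_OUTCOMES = [(3, 5), (4, 4), (2, 6), (7, 8), (5, 5), (6, 9)]


def average_causal_effect(outcomes):
    """True average causal effect: mean of y1 - y0 over all patients."""
    return Fraction(sum(y1 - y0 for y0, y1 in outcomes), len(outcomes))


def mean_difference(outcomes, treated):
    """Observed mean difference for one assignment (set of treated indices)."""
    # Treated patients reveal y1, control patients reveal y0.
    treated_y = [outcomes[i][1] for i in treated]
    control_y = [y0 for i, (y0, _) in enumerate(outcomes) if i not in treated]
    return (Fraction(sum(treated_y), len(treated_y))
            - Fraction(sum(control_y), len(control_y)))


def expected_mean_difference(outcomes, n_treated):
    """Average the mean difference over all equally likely assignments."""
    assignments = list(combinations(range(len(outcomes)), n_treated))
    total = sum(mean_difference(outcomes, set(a)) for a in assignments)
    return total / len(assignments)


ace = average_causal_effect(POTENTIAL_OUTCOMES)
expected = expected_mean_difference(POTENTIAL_OUTCOMES, n_treated=3)
print(ace, expected)
```

A single assignment can still be far from the truth: treating only the first three patients gives a mean difference of -1 in this table, although the average causal effect is 5/3. The equality holds on average over the randomization, not for each individual trial.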
Subsequently, we will consider estimation of treatment effects in settings where treatment assignment cannot be randomized by the experimenter. This setting is referred to as an observational design and usually emerges in epidemiology in the form of cross-sectional, retrospective, and prospective studies. The key difference compared to RCTs is that exchangeability is not known to hold in general. However, sometimes the reasons for treatment assignment have been observed (in the data) in the form of so-called confounding variables. When this is the case, so-called conditional exchangeability holds. The causal inference literature then offers an immense spectrum of statistical techniques for validly estimating treatment effects even outside of RCTs. We go on to study and apply a core set of these estimation techniques.
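As a rough illustration of what conditional exchangeability buys, the sketch below compares a crude risk difference with two adjusted estimators, standardization and inverse probability weighting, on made-up counts with one binary confounder. It is written in Python for illustration only (the course tutorials use R). The counts, names, and the assumption that the single confounder is sufficient are all hypothetical.

```python
from fractions import Fraction

# Hypothetical counts: (confounder stratum, treatment) -> (patients, events).
COUNTS = {
    (0, 1): (20, 6),
    (0, 0): (80, 16),
    (1, 1): (80, 48),
    (1, 0): (20, 10),
}
N = sum(n for n, _ in COUNTS.values())
STRATA = sorted({stratum for stratum, _ in COUNTS})


def naive_difference():
    """Crude risk difference, ignoring the confounder."""
    risk = {}
    for a in (0, 1):
        n = sum(COUNTS[(s, a)][0] for s in STRATA)
        events = sum(COUNTS[(s, a)][1] for s in STRATA)
        risk[a] = Fraction(events, n)
    return risk[1] - risk[0]


def standardized_risk(a):
    """Sum over strata of P(Y=1 | A=a, L=s) * P(L=s)."""
    total = Fraction(0)
    for s in STRATA:
        n_s = COUNTS[(s, 0)][0] + COUNTS[(s, 1)][0]
        n_as, events_as = COUNTS[(s, a)]
        total += Fraction(events_as, n_as) * Fraction(n_s, N)
    return total


def ipw_risk(a):
    """Weight each patient with A=a by 1 / P(A=a | L), then average over N."""
    total = Fraction(0)
    for s in STRATA:
        n_s = COUNTS[(s, 0)][0] + COUNTS[(s, 1)][0]
        n_as, events_as = COUNTS[(s, a)]
        propensity = Fraction(n_as, n_s)  # P(A=a | L=s)
        total += events_as / propensity
    return total / N


print(naive_difference())
print(standardized_risk(1) - standardized_risk(0))
print(ipw_risk(1) - ipw_risk(0))
```

In this toy table the crude difference is 7/25, while both adjusted estimators give 1/10. The two adjusted estimators agree exactly here because the treatment probabilities are computed nonparametrically within each stratum; with modeled propensity scores they generally differ.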
We start with covariate-based regression adjustment, which students have encountered in earlier statistics classes. We focus on the assumptions and circumstances under which this technique performs well for estimating causal effects (e.g. linearity and covariate balance). Then we consider examples where linear regression fails badly to estimate causal treatment effects correctly. Students gain awareness of how dangerous it can be to use regression adjustment blindly for treatment effect estimation. We then consider alternative estimation techniques, with a main focus on the important class of propensity score adjustment techniques. Here we devote significant attention to propensity score weighting, stratification, and matching techniques for estimating treatment effects. These techniques share the advantage that the relationship between confounders and outcome variables does not need to be known or modeled correctly. Instead, the relationship between confounders and treatment assignment is modeled, and small errors in model specification are alleviated by matching or stratification. All propensity score techniques are thoroughly practiced to enable students to apply them to their own research problems. We also look at the so-called covariate overlap and covariate balance assumptions and how to assess them. After considering propensity score techniques, we briefly move on to so-called double robust estimation of treatment effects. These estimators combine propensity score weighting with regression adjustment and in many settings can give researchers the best of both worlds.
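To give a flavour of propensity score matching, here is a minimal sketch of greedy 1:1 nearest neighbor matching without replacement, with a caliper. It is written in Python for illustration only (the course tutorials use R). The propensity scores are assumed to have been estimated already, for example by logistic regression; the scores, outcomes, caliper value, and function names are hypothetical, and this is one simple variant rather than the specific algorithm taught in the course.

```python
# Hypothetical estimated propensity scores and outcomes, keyed by patient id.
TREATED = {"t1": 0.80, "t2": 0.55, "t3": 0.30, "t4": 0.95}
CONTROLS = {"c1": 0.75, "c2": 0.50, "c3": 0.35, "c4": 0.10}
OUTCOME = {"t1": 12, "t2": 10, "t3": 9, "t4": 15,
           "c1": 10, "c2": 9, "c3": 7, "c4": 5}


def match_nearest_neighbor(treated, controls, caliper=0.1):
    """Greedy 1:1 nearest neighbor matching without replacement.

    treated, controls: dicts mapping patient id -> propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)
    pairs = []
    # Process treated patients from highest to lowest propensity score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        c_id = min(available, key=lambda c: abs(available[c] - t_score))
        # Accept the match only if it lies within the caliper.
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is used at most once
    return pairs


def matched_effect(pairs, outcome):
    """Mean outcome difference within matched pairs."""
    diffs = [outcome[t] - outcome[c] for t, c in pairs]
    return sum(diffs) / len(diffs)


PAIRS = match_nearest_neighbor(TREATED, CONTROLS)
print(PAIRS)
print(matched_effect(PAIRS, OUTCOME))
```

Patient t4 has no control within the caliper and is left unmatched, which is a small-scale picture of a covariate overlap problem: the estimate then describes only the treated patients for whom a comparable control exists.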
It is the goal of this course to enable students to use all techniques and to be aware of the underlying assumptions that permit or preclude their use. In addition, we summarize and discuss the implications for planning and designing epidemiological research. It is often underappreciated that observational studies require a great deal of preparation. An important framework for this preparation is the ‘Target Trial Framework’, which is discussed and practiced to enable students to judge the quality of observational studies and design their own.
The course consists of 4 days.
Topics day 1:
 Neyman-Rubin causal model (RCM) and potential outcomes
 Definitions of causal effects
 Causal effects vs. associational effects
 Randomized experiments and conditionally randomized experiments
 Exchangeability and conditional exchangeability
 Standardization estimator
 Inverse probability estimator
 Study groups (afternoon)
Topics day 2:
 Observational designs
 Identifying assumptions to estimate causal effects from observational designs
 Regression estimation to estimate causal effects
 Weaknesses of regression estimation for estimating causal effects
 Propensity score theory
 Estimating propensity scores
 Assessing overlap and balance using the propensity score
 Inverse propensity score weighting
 Computer tutorials in R (afternoon)
Topics day 3:
 Propensity score stratification
 Marginal mean weighting through stratification
 Conditional treatment effects
 Simple nearest neighbor propensity score matching
 Computer tutorial in R (afternoon)
Topics day 4:
 Design of observational studies
 The target trial framework
 Extensions of nearest neighbor matching
 Advanced matching methods
 Combination of regression adjustment, matching, and stratification
 Double robust estimation methods
 Study groups and computer tutorial in R (afternoon)
Learning objectives
 The student can define and analyze causal questions using the Neyman-Rubin causal model in fully (conditionally) randomized experiments and observational designs, carry out basic calculations with potential outcomes, and calculate causal effects using inverse probability weighting and standardization.
 The student can apply the Target Trial Framework to assess the quality of observational designs and explain the conditions under which causal effects are identifiable in an observational design.
 The student can apply regression estimation techniques to estimate average causal effects in R and explain the weaknesses of regression modeling for causal effect estimation.
 The student can estimate propensity score models in R and assess covariate overlap and balance.
 The student can apply propensity score weighting, stratification, marginal mean weighting through stratification, and matching in R to estimate causal effects and knows some advanced matching algorithms.
 The student can explain why it is useful to combine propensity score weighting, stratification, matching, and regression analysis to estimate causal effects.
Target group
Epidemiologists concerned with estimating treatment effects between study groups using observational data.
Course prerequisites
Participants are expected:
 to have basic knowledge of statistics and knowledge equivalent to EpidM course ‘Regression Techniques (V30)’. This is expected to include the following knowledge:
 Probabilities, conditional probabilities, contingency tables
 Expectations, conditional expectations and variances
 Statistical dependence and independence, statistical estimation and testing, confidence intervals
 Linear regression and logistic regression modeling.
 to have basic knowledge of the statistical programming language R and the environment RStudio. This knowledge is expected to include at least the topics covered in the online book ‘Introduction to R and RStudio’ (https://bookdown.org/introrbook/intro2r).
The EpidM course ‘Statistische Analyses met R (R07)’ is good preparation.
Course material
The course materials (lectures, assignments, feedback on the assignments, etc.) are available on Canvas, our digital learning environment. The documents will remain available on Canvas for at least one year.
To do the computer practicals of this course you will need R and RStudio.
R and RStudio can be downloaded for free from the internet. R is available at https://cran.r-project.org/
Literature
Recommended reading
Bast, R. & Heymans, M.W. (2020). Introduction to R and RStudio [Online]. Available under: https://bookdown.org/introrbook/intro2r
Hernán, M. A., & Robins, J. M. (2020). Causal Inference: What If? CRC Press [Online]. Available under: https://cdn1.sph.harvard.edu/wp-content/uploads/sites/1268/2021/01/ciwhatif_hernanrobins_31dec20.pdf
Ho, D., Imai, K., King, G., & Stuart, E. A. (2011). MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. Journal of Statistical Software, 42(8), 1–28. https://doi.org/10.18637/jss.v042.i08
Imbens, G. W., & Rubin, D. B. (2015). Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press. https://doi.org/10.1017/CBO9781139025751
Leite, W. L. (2017). Practical Propensity Score Methods Using R (Illustrated Edition). SAGE Publications, Inc.
Schafer, J. L., & Kang, J. D. Y. (2008). Average causal effects from nonrandomized studies: A practical guide and simulated example. Psychological Methods, 13(4), 279–313. https://doi.org/10.1037/a0014268
Examination
Students participating in the course as part of the Master’s programme Epidemiology need to pass the exam in order to complete the course.
Students not participating in Master’s programme Epidemiology who sign up for this course as a separate / single course can optionally register for the exam. The examination fee is € 150 per registration.
You can register for the exam via the website: Exams. Registration will close 3 weeks prior to the exam.
Please note that you need to pass the exam in order to receive credits (EC).
A certificate of participation will be granted to all students who have attended at least 80% of the classes. Only contact hours are stated on this certificate.
Only for Dutch medical specialists!
If you wish to be considered for accreditation points by the KNMG, you must sign the attendance list on the last day of the course.
To qualify for the accreditation points, you must have been present for the whole course.
Faculty
L.T. (Thomas) Klausch, PhD
Epidemiology and Data Science, Amsterdam UMC