Missing data: consequences and solutions (WV81)
Although researchers do their best to avoid missing data, it is a common problem in medical and epidemiological studies. How large the impact of missing data is on the study results, and how the missing data problem can be solved, depends on how much data are missing and why they are missing. This three-day course provides you with simple and advanced tools to evaluate and handle missing data in medical and epidemiological studies.
10, 11, 12 January 2022 (online course)
|City:||Online in 2022|
|Course coordinator:||M.W. (Martijn) Heymans, PhD|
|Language:||English|
|Learning method:||Lectures and computer practicals|
|Examination:||Written examination with computer assignments (optional)|
|Examination dates:||See page Exams|
|Number of EC:||2|
About the course
Missing data is a three-day course. The course consists of an intensive programme of partly interactive lectures, combined with computer-based practical work. Examples taken from clinical practice will be used for the computer-based work.
Because of the unpredictable situation regarding Covid, we have decided to offer the course online in January 2022. This is because there is a lot of interest from foreign students and it is not yet known whether students will be able to travel freely.
If a course is [Full], you can still register, but you will be placed on a waiting list. We will contact you as soon as a place becomes available. At that time you can still decide whether you want to participate in the course.
There are various methods that can be used to deal with missing data. Simple solutions are to ignore the missing values by deleting all cases with missing values from the analysis (complete-case analysis), or to use a regression model to estimate the missing values. There are also more advanced methods, such as Multiple Imputation. Multiple Imputation with the Multivariate Imputation by Chained Equations (MICE) procedure is a promising technique that works well in various missing data situations. With Multiple Imputation, several complete datasets are generated. The data analysis is carried out in each dataset and the results are pooled using special calculation rules (called Rubin’s rules). These steps will be discussed during the course, as well as the question of how to use different missing data methods in medical and epidemiological datasets. Furthermore, it is important to check whether your imputation strategy was successful (imputation diagnostics), which will also be discussed during the course.
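The pooling step can be illustrated with a minimal sketch of Rubin's rules. The course itself works with SPSS and R(Studio); Python is used here only for illustration. The function name `pool_rubin` and all input numbers are hypothetical, and the sketch pools a single parameter (for example one regression coefficient) without the degrees-of-freedom correction used for confidence intervals.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, std_errors):
    """Pool one parameter estimated in m imputed datasets (Rubin's rules).

    estimates  : the m point estimates, one per imputed dataset
    std_errors : the m corresponding standard errors
    """
    m = len(estimates)
    if m < 2 or m != len(std_errors):
        raise ValueError("need at least two estimates with matching standard errors")

    pooled = mean(estimates)                        # pooled point estimate
    within = mean([se ** 2 for se in std_errors])   # within-imputation variance
    between = variance(estimates)                   # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between          # total variance

    return {
        "estimate": pooled,
        "se": math.sqrt(total),
        "within": within,
        "between": between,
        "total": total,
    }


# Hypothetical coefficients and standard errors from m = 5 imputed datasets.
result = pool_rubin([0.52, 0.47, 0.55, 0.49, 0.51], [0.10, 0.11, 0.10, 0.12, 0.10])
print(result)
```

The pooled standard error is larger than the average standard error of the single datasets, because the between-imputation variance reflects the extra uncertainty caused by the missing values.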
Each course day starts with lectures in the morning, followed by computer exercises. During the computer exercises, you will practise various ways to explore missing data problems and apply simple and more advanced missing data methods, such as Multiple Imputation, using SPSS and R(Studio) software. You will work with real epidemiological and medical datasets.
Missing data consequences
- Examples of missing data in different epidemiological and medical research designs.
- The meaning of missing data mechanisms (MCAR, MAR, MNAR).
- Consequences and impact of missing data for statistical analyses.
- Ways to evaluate various missing data situations and mechanisms.
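The three missing data mechanisms listed above can be made concrete with a small simulation. This is a minimal sketch on simulated data, not course material: the function names, the missingness probabilities (0.3, 0.6, 0.1) and the seed are all illustrative assumptions. A covariate `x` is always observed and an outcome `y` is made missing under each mechanism.

```python
import random
from statistics import mean


def make_missing(rows, mechanism, rng):
    """Return a copy of rows (x, y) with some y values replaced by None."""
    result = []
    for x, y in rows:
        if mechanism == "MCAR":
            p = 0.3                       # same probability for every case
        elif mechanism == "MAR":
            p = 0.6 if x > 0 else 0.1     # depends only on the observed x
        elif mechanism == "MNAR":
            p = 0.6 if y > 0 else 0.1     # depends on the value of y itself
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        result.append((x, None if rng.random() < p else y))
    return result


def observed_mean(rows):
    """Mean of y over the cases in which y is observed (complete cases)."""
    return mean(y for _, y in rows if y is not None)


rng = random.Random(2022)  # fixed seed so the run is reproducible
complete = []
for _ in range(5000):
    x = rng.gauss(0, 1)
    complete.append((x, x + rng.gauss(0, 1)))  # y is correlated with x

print("complete data", round(mean(y for _, y in complete), 3))
for mechanism in ("MCAR", "MAR", "MNAR"):
    incomplete = make_missing(complete, mechanism, rng)
    print(mechanism, round(observed_mean(incomplete), 3))
```

In this setup the complete-case mean of `y` should stay close to the full-data mean under MCAR and shift downwards under MAR and MNAR. Under MAR the shift can in principle be corrected using `x`, whereas under MNAR the observed data alone do not contain the information needed for a correction.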
Missing data solutions
- The application of simple missing data methods.
- The theory and practice of Multiple Imputation.
- Data analysis after Multiple Imputation.
- How to evaluate imputation success by using imputation diagnostics.
SPSS and R(Studio) software.
- The participant is able to distinguish between different missing data mechanisms called missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR).
- The participant can apply basic evaluation procedures to make a valid assumption about the missing data mechanism.
- The participant understands the working of the most frequently used methods to handle missing data in epidemiological and medical datasets.
- The participant recognizes the strengths and limitations of the most frequently used methods to handle missing data in various missing data situations.
- The participant is able to work with SPSS to investigate missing data and to work with the best missing data methods for various missing data situations.
- The participant is able to use Multiple Imputation by the Multivariate Imputation by Chained Equations (MICE) procedure in SPSS and R(Studio).
- The participant understands how multiple imputation works and how a multiple imputation model should be specified.
- The participant understands how to handle missing questionnaire data and can comprehend the difference between handling item scores at item level and at total score level.
- The participant understands the practical solutions to handle missing data in multilevel (and longitudinal) studies.
- The participant is able to work with SPSS and R(Studio) to handle missing data in questionnaires and in multilevel (and longitudinal) studies.
The course is designed for PhD students, practitioners and applied researchers working in the fields of epidemiology, medicine, public health, psychology and human movement sciences. It is intended for everybody who wants to learn about missing data, for example because missing data may be present in your own research and you are about to start your data analysis, or because you want to learn how to judge articles or research grants that report missing data. It is also important to be able to judge the impact of missing data in practice-related research.
The following concepts are assumed known by participants at the start of this course:
– Knowledge of basic statistical tests such as t-tests and regression analyses.
– Knowledge of some basic SPSS commands. (Knowledge of R(Studio) is not a prerequisite.)
The course materials (lectures, assignments, feedback on the assignments, etc.) are available on Canvas, our digital learning environment. The documents will remain available on Canvas for at least one year.
To be able to do the computer practicals of this course you will need:
1. R and RStudio. Both can be downloaded for free from the internet. https://cran.r-project.org/
2. SPSS. If you do not have SPSS on your laptop, you can purchase it through Surfspot at a very reasonable price. If you do not want to purchase SPSS, you can use the trial version that IBM makes available. See the IBM SPSS Software page.
Students participating in the course as part of the Master’s programme Epidemiology need to pass the exam in order to complete the course.
Students not participating in the Master’s programme Epidemiology who sign up for this course as a separate / single course can optionally register for the exam. The examination fee is € 150 per registration.
You can register for the exam via the website (see Exams). Registration closes 3 weeks prior to the exam.
Please note that you need to pass the exam in order to receive credits (EC).
A certificate of participation will be granted to all students who have attended at least 80% of the classes. Only contact hours are stated on this certificate.
Only for Dutch medical specialists!
If you wish to be considered for accreditation points by the KNMG, you must sign the attendance list on the last day of the course.
To qualify for the accreditation points, you must have been present for the whole course.