Clinical Prediction Models and Machine Learning (WK80)
(ONLINE COURSE)

The purpose of a prediction model is to estimate, as accurately as possible, the probability that a particular outcome is present or will occur. Prediction models are often developed with clinical practice in mind, and combine information about individual patients to calculate an individual’s probability of illness or recovery. The model can then be presented in the form of a clinical prediction rule. General applicability, i.e. the accuracy of the prediction model when applied to new patients in the future, is another very important aspect.

Course details

Date: 27, 28, 31 January and 1 February 2022 (online course)
Tuition fee: 1.000
City: Online in 2022
Course coordinator: M.W. (Martijn) Heymans, PhD
Language: English
Learning method: Lectures and computer practicals
Examination: Written exam with computer assignments (optional)
Examination dates: See page Exams
Number of EC: 2

About the course

Clinical Prediction Models and Machine Learning is a four-day course. The course consists of an intensive programme of partly interactive lectures, combined with computer-based practical work. Examples taken from clinical practice will be used for the computer-based work.

Because of the unpredictable situation regarding Covid, we have decided to offer the course online in January 2022. This is because there is a lot of interest from foreign students and it is not yet known whether students will be able to travel freely.

If a course is [Full], you can still register, but you will be placed on a waiting list. We will contact you as soon as a place becomes available. At that time you can still decide whether you want to participate in the course.

More information

Access to data is becoming ever easier, and data sets are therefore growing ever larger. One problem when developing prediction models in such data sets is the difficulty of selecting the most important predictors from a large number of variables. If this is not done carefully, the quality of the prediction model can be adversely affected. Machine learning methods can be used to develop prediction models in these large data sets. In addition, prediction models may need to be adjusted before they can be applied to new persons. All these issues are frequently overlooked or underestimated by clinicians and researchers.

The aim of the course is to provide better knowledge and understanding of the development of prediction models in smaller and larger data sets that are relevant to real-life practice. We will focus on common methods for selecting variables, such as backward selection, and on more advanced machine learning procedures, such as lasso regression and tree-based methods, as well as their pros and cons. Once a prediction model has been developed, it is important to assess its quality. For example, we will look at whether the predictions of the model are accurate and will consider various ways of measuring performance, using measures of overall quality, discrimination and calibration. The question of applying the model to new (future) patients will also be addressed. An important element of this is investigating whether the performance of the prediction model deteriorates when it is applied to new patients. This step is called validation of the prediction model. We will cover various techniques for internal and external validation, and ways to train and test the model using bootstrapping and cross-validation.
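The three kinds of performance measure mentioned above can be illustrated with a minimal sketch. It is written in Python with the standard library only, purely for illustration; the course itself uses R and SPSS. The function names and the example data are hypothetical, and the calibration measure shown (observed event rate minus mean predicted risk) is only one simple way to summarise calibration.

```python
from typing import Sequence


def brier_score(y: Sequence[int], p: Sequence[float]) -> float:
    """Overall performance: mean squared difference between outcome (0/1) and predicted risk."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def c_statistic(y: Sequence[int], p: Sequence[float]) -> float:
    """Discrimination: share of event/non-event pairs in which the event has the
    higher predicted risk (ties count as half). Equals the area under the ROC curve."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def calibration_in_the_large(y: Sequence[int], p: Sequence[float]) -> float:
    """Calibration (crude summary): observed event rate minus mean predicted risk.
    A value near 0 means predictions are, on average, neither too high nor too low."""
    return sum(y) / len(y) - sum(p) / len(p)


# Hypothetical outcomes and predicted risks for six patients (not course data).
outcomes = [0, 0, 1, 0, 1, 1]
predicted = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]

overall = brier_score(outcomes, predicted)
discrimination = c_statistic(outcomes, predicted)
calibration = calibration_in_the_large(outcomes, predicted)
```

A lower Brier score indicates better overall accuracy, while a c-statistic of 0.5 corresponds to no discrimination and 1.0 to perfect discrimination.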

Day 1

The development and quality of prediction models, including:

  • The characteristics of a prediction model
  • The most frequently used methods for selecting variables
  • The pros and cons of common methods for selecting variables
  • Sample size recommendations to develop a prediction model
  • Different measures of quality and how to interpret them (including explained variation, calibration, discrimination, ROC curve)
  • Introduction to spline regression models
  • Introduction to R software

Day 2

Introduction to the validation of prediction models

  • The linear predictor (lp)
  • Optimism and shrinkage
  • Adjusting the intercept
  • The internal and external validation of prediction models
  • Train and test datasets (Bootstrapping and Cross-validation)
  • Adjusting the slope
  • External validation
  • Generalizability of the prediction model (case-mix, different regression coefficients)
  • Presentation formats of prediction models
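As a rough illustration of two of the Day 2 topics, the linear predictor and adjusting the intercept, the sketch below computes the linear predictor of a logistic model and then estimates an intercept correction in a new sample while keeping the slope fixed at 1. It is written in Python (standard library only) for illustration; the coefficients, patient data and function names are hypothetical and not taken from the course material. Adjusting the slope as well would require fitting a logistic regression on the linear predictor, which is left out here.

```python
import math
from typing import Sequence


def expit(z: float) -> float:
    """Inverse logit: converts a linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(intercept: float, coefs: Sequence[float], x: Sequence[float]) -> float:
    """lp = intercept + sum of coefficient * predictor value."""
    return intercept + sum(b * xi for b, xi in zip(coefs, x))


def update_intercept(lps: Sequence[float], ys: Sequence[int],
                     tol: float = 1e-10, max_iter: int = 100) -> float:
    """Estimate the correction a in expit(a + lp) by Newton-Raphson, with the
    slope of lp fixed at 1. At the solution, the mean predicted risk equals
    the observed event rate in the new sample."""
    a = 0.0
    for _ in range(max_iter):
        ps = [expit(a + lp) for lp in lps]
        score = sum(y - p for y, p in zip(ys, ps))   # first derivative of the log-likelihood
        info = sum(p * (1.0 - p) for p in ps)        # minus the second derivative
        step = score / info
        a += step
        if abs(step) < tol:
            break
    return a


# Hypothetical existing model: intercept and two coefficients (assumed values).
model_intercept = -2.0
model_coefs = [0.05, 0.8]

# Hypothetical new patients: (centred age, binary risk factor) and observed outcomes.
patients = [(-10, 0), (0, 1), (5, 0), (10, 1), (20, 1), (-5, 0)]
observed = [0, 0, 0, 1, 1, 0]

lps = [linear_predictor(model_intercept, model_coefs, x) for x in patients]
correction = update_intercept(lps, observed)
updated_risks = [expit(correction + lp) for lp in lps]
```

In this made-up sample the original model predicts too low a risk on average, so the estimated correction is positive.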

Day 3

Updating of prediction models

  • Reasons for generalizability problems
  • Updating the intercept and slope
  • Comparing prediction models
  • Adding a new variable
  • Reclassification tables
  • A prediction model for survival data
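A reclassification table, as listed above, cross-tabulates the risk category each person receives under the existing model against the category under the extended model. The sketch below shows the idea in Python (standard library only), for illustration. The cutoffs, predicted risks and function names are hypothetical. In practice such tables are usually made separately for people with and without the outcome.

```python
from bisect import bisect_right
from collections import Counter
from typing import Sequence


def risk_category(p: float, cutoffs: Sequence[float]) -> int:
    """Index of the risk category for p, given ascending cutoffs.
    With cutoffs [0.1, 0.3]: 0 means p < 0.1, 1 means 0.1 <= p < 0.3, 2 means p >= 0.3."""
    return bisect_right(cutoffs, p)


def reclassification_table(p_old: Sequence[float], p_new: Sequence[float],
                           cutoffs: Sequence[float]) -> Counter:
    """Count people per (category under old model, category under new model)."""
    return Counter(
        (risk_category(o, cutoffs), risk_category(n, cutoffs))
        for o, n in zip(p_old, p_new)
    )


# Hypothetical cutoffs and predicted risks for four people (not course data).
cutoffs = [0.1, 0.3]
risk_old_model = [0.05, 0.20, 0.20, 0.40]
risk_new_model = [0.15, 0.20, 0.35, 0.25]

table = reclassification_table(risk_old_model, risk_new_model, cutoffs)
# Off-diagonal cells, where the two categories differ, are the reclassified people.
moved = sum(count for (old, new), count in table.items() if old != new)
```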

Day 4

Developing prediction models with many variables

  • Developing prediction models with many variables
  • Cross-validation
  • Lasso Regression
  • Model stability analyses
  • Tree based methods
  • Random forest

Learning objectives

  1. The participant can recognize and identify the characteristics of a prediction model.
  2. The participant can identify the weak and strong points of the most commonly used methods for selecting variables.
  3. The participant can identify the weak and strong points of advanced machine learning methods for selecting variables, such as lasso regression and tree-based methods.
  4. The participant can analyse and interpret the methods that are used to determine the quality of a prediction model (including measures of overall performance such as the Brier score and explained variance, of discrimination such as the ROC curve, and of calibration such as the Hosmer and Lemeshow test and a calibration curve).
  5. The participant can analyse and interpret the methods that are used to determine the value of a prediction model for real-life practice (sensitivity, specificity).
  6. The participant is familiar with the principles that play a role in internal validation such as over-fitting, optimism and shrinkage.
  7. The participant can analyse and interpret methods used for validation of prediction models, such as cross-validation and bootstrapping techniques.
  8. The participant can use methods to update the intercept and slope of the prediction model.
  9. The participant is able to study the added value of a new predictor variable using reclassification tables in SPSS and R software.
  10. The participant is able to develop a prediction model and study its quality using logistic regression models and machine learning methods such as lasso regression and tree-based methods, and, for survival data, using a Cox regression model.
  11. The participant can develop prediction models, assess their quality and validate them (internally and externally) using SPSS and R software.
Target group

The course is designed for PhD students, practitioners and applied researchers working in the fields of epidemiology, medicine, public health, psychology and human movement sciences.

The course is intended for anyone who wants to know more about prediction models, for example because they want to be better able to assess a research proposal or article, or because they are developing or want to develop a prediction model themselves. It is also important to be able to make a proper assessment of the value of a prediction model for practice.

Course pre-requisites

Participants are assumed to be familiar with the following at the start of this course:

– Knowledge of basic statistical tests, such as t-tests, and regression analyses.

– Knowledge of some basic SPSS commands. Knowledge of R(Studio) is not a prerequisite.

Course material

The course materials (lectures, assignments, feedback on the assignments, etc.) are available on Canvas, our digital learning environment. The documents will remain available on Canvas for at least one year.

To be able to do the computer practicals of this course you will need:

1. R and RStudio. Both can be downloaded for free from the internet; R is available at https://cran.r-project.org/

and

2. SPSS. If you don’t have SPSS on your laptop, you can purchase SPSS through Surfspot at a very reasonable price. If you do not want to purchase SPSS, you can use the trial version that IBM makes available. See the page SPSS Software | IBM.

Literature

Literature will be provided during the course.

Students participating in the course as part of the Master’s programme Epidemiology need to pass the exam in order to complete the course.

Students not participating in Master’s programme Epidemiology who sign up for this course as a separate / single course can optionally register for the exam. The examination fee is € 150 per registration.

You can register for the exam via the website: Exams. Registration will close 3 weeks prior to the exam.

Please note that you need to pass the exam in order to receive credits (EC).

A certificate of participation will be granted to all students who have attended at least 80% of the classes. Only contact hours are stated on this certificate.

Only for Dutch medical specialists!

If you wish to be considered for accreditation points by the KNMG, you must sign the attendance list on the last day of the course.

To qualify for the accreditation points, you must have been present for the whole course.

Faculty

L.T. (Thomas) Klausch, PhD

Epidemiology and Data Science, Amsterdam UMC