PatientLevelPrediction R package

Software Package | Validated | Source-grounded

An R package that builds and evaluates predictive models for individual patients using observational health data structured in the OMOP Common Data Model. It helps predict whether a patient will experience a specific health outcome based on their past medical history.

At a glance

Use when

Predicting individual patient outcomes from longitudinal observational data; comparing model performance across sites; developing risk prediction tools for clinical decision support

Avoid when

Working with non-OMOP data without mapping capability; lacking technical expertise in R/Python; needing real-time predictions in production clinical systems without additional deployment infrastructure

Inputs

Observational health data in OMOP Common Data Model format; definitions of target and outcome cohorts; time-at-risk window; choice of covariates and machine learning algorithms

Outputs

Trained predictive models; performance metrics (AUC, calibration); visualizations (ROC curves, calibration plots); learning curves; model interpretation outputs; validation results across databases; interactive Shiny reports
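The inputs listed above (target cohort, outcome cohort, time-at-risk window) combine into a labeled dataset. The following is a minimal sketch of that idea in plain Python, not the package's own code: the function `label_outcomes`, its parameters, and the toy patients are all hypothetical, and real studies apply further rules (for example, handling prior outcomes or minimum observation time) that are omitted here.

```python
from datetime import date, timedelta


def label_outcomes(target_entries, outcome_events,
                   risk_start_days=1, risk_end_days=365):
    """Label each target-cohort patient 1 if an outcome falls inside
    the time-at-risk window (relative to cohort entry), else 0.

    Hypothetical helper for illustration only.
    """
    labels = {}
    for patient_id, index_date in target_entries.items():
        window_start = index_date + timedelta(days=risk_start_days)
        window_end = index_date + timedelta(days=risk_end_days)
        events = outcome_events.get(patient_id, [])
        labels[patient_id] = int(
            any(window_start <= d <= window_end for d in events)
        )
    return labels


# Toy target cohort: patient id -> cohort entry (index) date.
target_entries = {
    "p1": date(2020, 1, 1),
    "p2": date(2020, 3, 15),
    "p3": date(2020, 6, 1),
}

# Toy outcome cohort: patient id -> outcome dates.
outcome_events = {
    "p1": [date(2020, 5, 1)],  # inside the window
    "p2": [date(2021, 6, 1)],  # after the window closes
}

labels = label_outcomes(target_entries, outcome_events)
print(labels)
```

Changing `risk_start_days` and `risk_end_days` changes which patients count as outcome cases, which is why the time-at-risk window is part of the prediction problem definition and not a modelling detail.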

How it works

PatientLevelPrediction is an R package for developing and validating patient-level prediction models on observational healthcare data mapped to the OMOP Common Data Model. It supports multiple target and outcome cohort definitions, various time-at-risk windows, and a wide range of covariates, including drugs, diagnoses, procedures, and comorbidities. The package integrates several machine learning algorithms (e.g., regularized logistic regression, random forest, gradient boosting, neural networks) and allows customization of algorithms, feature engineering, and sampling methods. It supports internal and external validation, generates performance plots (ROC, calibration) and learning curves, and handles ensemble and deep learning models via companion packages. A Shiny app provides interactive exploration and reporting of results.
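To make the reported performance metrics concrete, here is a minimal, library-free sketch of two of them: discrimination (AUC) and a crude calibration check (mean predicted risk against observed outcome rate). The functions `auc` and `calibration_in_the_large` and the toy data are hypothetical illustrations, not the package's implementation, which reports richer calibration output.

```python
def auc(y_true, y_score):
    """AUC as the probability that a randomly chosen positive scores
    higher than a randomly chosen negative (ties count as half)."""
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def calibration_in_the_large(y_true, y_score):
    """Return (mean predicted risk, observed outcome rate)."""
    return sum(y_score) / len(y_score), sum(y_true) / len(y_true)


# Toy held-out set: observed outcomes and predicted risks.
y_true = [0, 0, 1, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.2, 0.8, 0.7]

model_auc = auc(y_true, y_score)
mean_predicted, observed_rate = calibration_in_the_large(y_true, y_score)
print(f"AUC: {model_auc:.3f}")
print(f"Mean predicted risk: {mean_predicted:.3f}, observed rate: {observed_rate:.3f}")
```

A model can rank patients well (high AUC) while still over- or under-estimating absolute risk, so both kinds of metric are worth checking, particularly in external validation on a different database.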

Project: EHDEN
Funding: IMI
Project status: Completed 2024
HTA domains: Clinical Effectiveness
Categories: RWE Predictive Modelling
Technology: Non-specific
Assumptions: Data is accurately mapped to the OMOP CDM; sufficient patient-level data is available in the observation window; temporal relationships between exposures and outcomes are valid; models are trained on representative data
Strengths: Standardized framework across databases; supports multiple ML algorithms; extensible with custom models and features; enables external validation; includes interactive visualization and reporting tools; integrates with OHDSI ecosystem
Limitations: Requires data in OMOP CDM format; setup can be complex due to dependencies on R, Python, and Java; performance depends on data quality and completeness; not all sampling techniques improve model performance
Also known as: PatientLevelPrediction, PLP

Beta record. Generated from the primary source via AI extraction and independent audit, pending final human review.