PatientLevelPrediction R package
An R package that builds and evaluates predictive models for individual patients using observational health data structured in the OMOP Common Data Model. It helps predict whether a patient will experience a specific health outcome based on their past medical history.
At a glance
Use when
Predicting individual patient outcomes from longitudinal observational data; comparing model performance across sites; developing risk prediction tools for clinical decision support
Avoid when
Working with non-OMOP data without mapping capability; lacking technical expertise in R/Python; needing real-time predictions in production clinical systems without additional deployment infrastructure
Inputs
Observational health data in OMOP Common Data Model format; definitions of target and outcome cohorts; time-at-risk window; choice of covariates and machine learning algorithms
Outputs
Trained predictive models; performance metrics (AUC, calibration); visualizations (ROC curves, calibration plots); learning curves; model interpretation outputs; validation results across databases; interactive Shiny reports
How it works
PatientLevelPrediction develops and validates patient-level prediction models on observational healthcare data mapped to the OMOP Common Data Model. It supports multiple target and outcome cohort definitions, various time-at-risk windows, and a wide range of covariates, including drugs, diagnoses, procedures, and comorbidities. The package integrates machine learning algorithms such as regularized logistic regression, random forest, gradient boosting, and neural networks, and allows customization of algorithms, feature engineering, and sampling methods. It supports internal and external validation, generates performance plots (ROC, calibration) and learning curves, and handles ensemble and deep learning models via companion packages. A Shiny app facilitates interactive exploration and reporting of results.
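The core idea (label each patient by whether the outcome occurs inside the time-at-risk window, then score predictions for discrimination and calibration) can be illustrated with a minimal sketch. It is written in plain Python using only the standard library and is not the package's API: the function names, the 1 to 365 day window, and the toy patients and risk scores are all hypothetical, chosen only to show the concept.

```python
from datetime import date, timedelta


def label_outcome(index_date, outcome_dates, risk_start_days=1, risk_end_days=365):
    """Return 1 if any outcome date falls inside the time-at-risk window, else 0.

    The window is measured in days from the patient's index date (entry into
    the target cohort). The default bounds are illustrative assumptions.
    """
    window_start = index_date + timedelta(days=risk_start_days)
    window_end = index_date + timedelta(days=risk_end_days)
    return int(any(window_start <= d <= window_end for d in outcome_dates))


def auc(labels, scores):
    """Rank-based AUC: chance that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def calibration_in_the_large(labels, scores):
    """Mean predicted risk minus observed outcome rate (0 = calibrated on average)."""
    return sum(scores) / len(scores) - sum(labels) / len(labels)


# Hypothetical toy cohort: (index date, outcome dates, predicted risk).
patients = [
    (date(2020, 1, 1), [date(2020, 6, 1)], 0.8),   # outcome inside the window
    (date(2020, 1, 1), [], 0.2),                   # no outcome recorded
    (date(2020, 3, 1), [date(2021, 6, 1)], 0.7),   # outcome after the window ends
    (date(2020, 5, 1), [date(2020, 5, 20)], 0.6),  # outcome inside the window
]

labels = [label_outcome(idx, outs) for idx, outs, _ in patients]
scores = [risk for _, _, risk in patients]

print("labels:", labels)
print("AUC:", auc(labels, scores))
print("calibration-in-the-large:", round(calibration_in_the_large(labels, scores), 3))
```

In this toy cohort the third patient has a recorded outcome but is still labelled 0 because the event falls outside the window, which shows why the time-at-risk definition is an input in its own right. A real study would rely on the package's own cohort, covariate, and evaluation functions instead of hand-written helpers like these.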
- Project: EHDEN
- Funding: IMI
- Project status: Completed 2024
- HTA domains: Clinical Effectiveness
- Categories: RWE; Predictive Modelling
- Technology: Non-specific
- Assumptions: Data is accurately mapped to the OMOP CDM; sufficient patient-level data is available in the observation window; temporal relationships between exposures and outcomes are valid; models are trained on representative data
- Strengths: Standardized framework across databases; supports multiple ML algorithms; extensible with custom models and features; enables external validation; includes interactive visualization and reporting tools; integrates with OHDSI ecosystem
- Limitations: Requires data in OMOP CDM format; setup can be complex due to dependencies on R, Python, Java; performance depends on data quality and completeness; not all sampling techniques improve model performance
- Also known as: PatientLevelPrediction, PLP
Questions this answers
- Can this tool predict if a patient will develop a certain condition in the future?
- How well does a prediction model perform across different patient populations?
- What factors are most important in predicting a patient's risk of an outcome?
- Can I use my own machine learning algorithm with this tool?
- How can I visualize the performance of my predictive model?
- Is it possible to validate a model in multiple healthcare databases?
Beta record. Generated from the primary source via AI extraction and independent audit, pending final human review.

