AI/ML for Treatment Effect Prediction

Method · Peer-reviewed · ✓ Source-grounded

This method uses machine learning to predict how likely patients with a rare blood cancer (T-cell prolymphocytic leukemia) are to develop a serious complication (acute graft-versus-host disease) after a stem cell transplant. Because the disease is rare, little data is available, which makes prediction harder. The study compared several machine learning models and found that Linear Discriminant Analysis performed best at predicting whether the complication would occur (balanced accuracy 0.58), though performance dropped when predicting how severe the complication would be.

At a glance

Use when

Predicting treatment complications in rare diseases with limited datasets; when binary outcomes are of primary interest; during exploratory modeling in early-stage HTA for novel therapies

Avoid when

High-precision predictions are required; detailed severity grading is needed; large, high-quality datasets are available and more complex models could outperform simple ones; results will inform regulatory or clinical decisions that require high model performance

Inputs

Clinical variables from patient records related to allogeneic hematopoietic cell transplantation, including donor and recipient characteristics, transplant conditions, and pre-transplant health status

Outputs

Predicted probability and classification of acute graft-versus-host disease (aGvHD) occurrence and grade (e.g., grade I–IV or binary outcome)

How it works

Machine learning models were trained on data from the Center for International Blood and Marrow Transplant Research to predict the occurrence and grading of acute graft-versus-host disease (aGvHD) following allogeneic hematopoietic cell transplantation (allo-HCT) in patients with T-cell prolymphocytic leukemia. Models were evaluated using balanced accuracy, F1 score, and ROC AUC. Linear Discriminant Analysis (LDA) achieved the highest balanced accuracy (0.58) in binary classification for aGvHD occurrence, but performance declined in multi-class classification for aGvHD grades. The study highlights challenges in applying ML to rare diseases due to limited sample sizes and variable sparsity.
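The three evaluation metrics named above can be sketched in plain Python to show what each one measures. This is a minimal illustration, not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical, and the toy data was chosen only so that balanced accuracy lands near 0.58.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; works for binary and multi-class labels."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def f1_binary(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc_binary(y_true, scores, positive=1):
    """Chance that a random positive case is scored above a random
    negative case; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): 1 = aGvHD occurred, 0 = did not.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.3, 0.8, 0.6, 0.35, 0.2, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(round(balanced_accuracy(y_true, y_pred), 2))  # 0.58
print(f1_binary(y_true, y_pred))                    # 0.5
print(roc_auc_binary(y_true, scores))               # 0.75

# A model that always predicts "no aGvHD" scores exactly 0.5,
# which is the chance level for a binary outcome.
print(balanced_accuracy(y_true, [0] * len(y_true)))  # 0.5
```

Because balanced accuracy averages recall over classes, the same function applies to the multi-class grading task, where the chance level falls to 1 divided by the number of grades.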

Project: HTx
Funding: Horizon 2020
Project status: Completed 2024
HTA domains: Clinical Effectiveness
Categories: Predictive Modelling, ML/AI, RWE
Technology: Medicines
Assumptions: The relationship between input clinical variables and aGvHD outcomes can be modeled using machine learning; sufficient signal exists in the variables despite limited sample size; data from the CIBMTR registry is representative of the target population
Strengths: Tailored to rare diseases with limited data; evaluates multiple ML models and variable selection impacts; uses real-world registry data; focuses on clinically important outcomes
Limitations: Low balanced accuracy (0.58, against a chance level of 0.5 for a binary outcome) indicates limited predictive power; multi-class grading predictions were particularly challenging; small sample size due to disease rarity limits model generalizability and training robustness
Also known as: Machine Learning for aGvHD Prediction, ML in Allo-HCT Outcome Prediction

Beta record. Generated from the primary source via AI extraction and independent audit, pending final human review.