AI/ML for Treatment Effect Prediction
This method uses machine learning to predict how likely patients with a rare blood cancer (T-cell prolymphocytic leukemia) are to develop a serious complication (acute graft-versus-host disease) after a stem cell transplant. Because the disease is rare, there's not much data available, which makes predictions harder. The study tested different machine learning models and found that Linear Discriminant Analysis worked best for predicting whether the complication would occur, though it struggled when trying to predict how severe the complication would be.
At a glance
Use when
Predicting treatment complications in rare diseases with limited datasets; when binary outcomes are of primary interest; during exploratory modeling in early-stage HTA for novel therapies
Avoid when
High-precision predictions are required; detailed severity grading predictions are needed; large, high-quality datasets are available and more complex models could outperform simple ones; results will feed regulatory or clinical decisions that demand high model performance
Inputs
Clinical variables from patient records related to allogeneic hematopoietic cell transplantation, including donor and recipient characteristics, transplant conditions, and pre-transplant health status
Outputs
Predicted probability and classification of acute graft-versus-host disease (aGvHD), either as a binary outcome (occurrence) or as a multi-class outcome (grade, e.g., grade I–IV)
How it works
Machine learning models were trained on data from the Center for International Blood and Marrow Transplant Research to predict the occurrence and grading of acute graft-versus-host disease (aGvHD) following allogeneic hematopoietic cell transplantation (allo-HCT) in patients with T-cell prolymphocytic leukemia. Models were evaluated using balanced accuracy, F1 score, and ROC AUC. Linear Discriminant Analysis (LDA) achieved the highest balanced accuracy (0.58) in binary classification for aGvHD occurrence, but performance declined in multi-class classification for aGvHD grades. The study highlights challenges in applying ML to rare diseases due to limited sample sizes and variable sparsity.
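The evaluation loop described above can be sketched roughly as follows. This is a minimal illustration, not the study's pipeline: the data are synthetic (generated with scikit-learn's `make_classification`, not taken from CIBMTR), the cohort size, the number of variables, the comparison models other than LDA, and the 5-fold stratified cross-validation are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a small registry cohort (NOT the CIBMTR data):
# 120 patients, 15 clinical variables, roughly 40% with the binary outcome.
X, y = make_classification(
    n_samples=120, n_features=15, n_informative=4, n_redundant=2,
    weights=[0.6, 0.4], flip_y=0.1, random_state=0,
)

# Candidate models; this list is illustrative, not the study's model set.
models = {
    "lda": make_pipeline(StandardScaler(), LinearDiscriminantAnalysis()),
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "random_forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Stratified folds keep the outcome rate similar in every split,
# which matters when the cohort is small.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["balanced_accuracy", "f1", "roc_auc"]


def evaluate(models, X, y):
    """Return the mean cross-validated score per model and metric."""
    results = {}
    for name, model in models.items():
        scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
        results[name] = {m: float(np.mean(scores[f"test_{m}"])) for m in scoring}
    return results


results = evaluate(models, X, y)
for name, metrics in results.items():
    print(name, {k: round(v, 2) for k, v in metrics.items()})
```

With cohorts this small, fold-to-fold variation is typically large, so differences between models in a comparison like this should be read cautiously.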
- Project
- HTx
- Funding
- Horizon 2020
- Project status
- Completed 2024
- HTA domains
- Clinical Effectiveness
- Categories
- Predictive Modelling, ML/AI, RWE
- Technology
- Medicines
- Assumptions
- The relationship between input clinical variables and aGvHD outcomes can be modeled using machine learning; sufficient signal exists in the variables despite limited sample size; data from the CIBMTR registry is representative of the target population
- Strengths
- Tailored to rare diseases with limited data; evaluates multiple ML models and variable selection impacts; uses real-world registry data; focuses on clinically important outcomes
- Limitations
- Low balanced accuracy (0.58, where 0.5 corresponds to chance-level performance in a binary task) indicates limited predictive power; multi-class grading predictions were particularly challenging; small sample size due to disease rarity limits model generalizability and training robustness
- Also known as
- Machine Learning for aGvHD Prediction, ML in Allo-HCT Outcome Prediction
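To put the reported balanced accuracy of 0.58 in context, the sketch below computes the metric by hand on a toy cohort. Balanced accuracy is the mean of per-class recall, so a model that always predicts the majority class scores 0.5 regardless of how high its raw accuracy looks. The cohort sizes and predictions here are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; with two classes this equals
    (sensitivity + specificity) / 2."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        n_c = sum(1 for t in y_true if t == c)
        hits = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        recalls.append(hits / n_c)
    return sum(recalls) / len(recalls)


# Hypothetical toy cohort: 30 patients without the outcome (0), 10 with it (1).
y_true = [0] * 30 + [1] * 10

# A model that always predicts "no aGvHD": raw accuracy is 30/40 = 0.75,
# but recall is 1.0 for class 0 and 0.0 for class 1, so balanced accuracy is 0.5.
always_negative = [0] * 40
raw_accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
chance_level = balanced_accuracy(y_true, always_negative)

# A model with some signal: 24 of 30 negatives and 4 of 10 positives correct,
# i.e. recalls of 0.8 and 0.4, giving a balanced accuracy of about 0.6.
some_signal = [1] * 6 + [0] * 24 + [1] * 4 + [0] * 6
with_signal = balanced_accuracy(y_true, some_signal)

print(round(raw_accuracy, 2), round(chance_level, 2), round(with_signal, 2))
```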
Questions this answers
- Can machine learning predict if a patient will get acute graft-versus-host disease after a stem cell transplant?
- How well can AI predict the severity of complications in rare cancers?
- Which machine learning model works best for predicting transplant complications with limited data?
- Does having more health variables improve prediction accuracy for rare diseases?
- Can AI help improve treatment plans for rare blood cancers?
- What are the limits of AI when predicting outcomes for very rare conditions?
Beta record. Generated from the primary source via AI extraction and independent audit, pending final human review.

