Model Performance
ML Model Comparison
All four models were evaluated on a stratified 20% hold-out test set. Threshold-dependent metrics (accuracy, F1 score, precision, recall, MCC) are computed at a 0.5 decision threshold; AUC is threshold-independent.
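As an illustration of the evaluation setup described above, the following is a minimal sketch of a stratified 20% hold-out split and a 0.5 decision threshold in plain Python. The function names `stratified_holdout` and `apply_threshold`, the seed, and the synthetic labels are assumptions made for this example; the source does not say which tooling produced the actual split.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) so each class keeps its share in both parts."""
    rng = random.Random(seed)  # hypothetical seed, not from the source
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def apply_threshold(probabilities, threshold=0.5):
    """Turn predicted probabilities into 0/1 labels.

    Assigning a probability of exactly 0.5 to the positive class is an
    assumption of this sketch, not something the source specifies.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


# Synthetic labels for illustration only: 70 negatives, 30 positives.
labels = [0] * 70 + [1] * 30
train_idx, test_idx = stratified_holdout(labels)

print(len(train_idx), len(test_idx))
print(sum(labels[i] for i in test_idx) / len(test_idx))
print(apply_threshold([0.12, 0.5, 0.93]))
```

Splitting per class before concatenating is what keeps the positive rate in the test set equal to the rate in the full data, which matters when the outcome is imbalanced.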
BEST
GB
Gradient Boosting
200 estimators, learning rate 0.05. Sequential ensemble that corrects prior errors.
91.0%
AUC
85.2%
Accuracy
83.7%
F1 Score
86.1%
Precision
RF
Random Forest
200 decision trees with Gini importance scores. Robust to outliers.
89.2%
AUC
83.4%
Accuracy
81.2%
F1 Score
84.5%
Precision
ANN
Neural Network (ANN)
128-64-32 MLP with early stopping and 10% validation fraction.
87.8%
AUC
82.1%
Accuracy
79.8%
F1 Score
82.3%
Precision
LR
Logistic Regression
Interpretable linear model with Wald test p-values for feature significance.
86.3%
AUC
81.0%
Accuracy
78.4%
F1 Score
80.9%
Precision
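The four configurations described in the cards above could be expressed roughly as follows. This sketch assumes scikit-learn, which the source does not name; only the hyperparameters stated in the cards (200 estimators, learning rate 0.05, 200 trees, the 128-64-32 layout, early stopping, 10% validation fraction) come from the source. `SEED` and `max_iter=1000` are hypothetical choices for the example.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

SEED = 0  # hypothetical seed, not taken from the source

models = {
    # 200 estimators, learning rate 0.05 (from the card)
    "GB": GradientBoostingClassifier(
        n_estimators=200, learning_rate=0.05, random_state=SEED
    ),
    # 200 trees split on Gini impurity (from the card)
    "RF": RandomForestClassifier(
        n_estimators=200, criterion="gini", random_state=SEED
    ),
    # 128-64-32 hidden layers, early stopping on a 10% validation fraction
    "ANN": MLPClassifier(
        hidden_layer_sizes=(128, 64, 32),
        early_stopping=True,
        validation_fraction=0.1,
        random_state=SEED,
    ),
    # max_iter is an assumption; the card gives no solver settings
    "LR": LogisticRegression(max_iter=1000),
}

for name, model in models.items():
    print(name, type(model).__name__)
```

scikit-learn's `LogisticRegression` does not report p-values, so the Wald tests mentioned in the Logistic Regression card would need a separate tool, for example a statistics package that fits the same model and reports coefficient standard errors.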
Full Metrics Table
Threshold = 0.5

| Model | AUC | Accuracy | F1 Score | Precision | Recall | MCC |
|---|---|---|---|---|---|---|
| Gradient Boosting (GB, best) | 91.0% | 85.2% | 83.7% | 86.1% | 81.5% | 0.681 |
| Random Forest (RF) | 89.2% | 83.4% | 81.2% | 84.5% | 78.1% | 0.649 |
| Neural Network (ANN) | 87.8% | 82.1% | 79.8% | 82.3% | 77.5% | 0.628 |
| Logistic Regression (LR) | 86.3% | 81.0% | 78.4% | 80.9% | 76.0% | 0.608 |
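The threshold-dependent columns in the table follow from the confusion matrix at the chosen threshold. The sketch below shows the standard formulas and, as a consistency check, recomputes F1 from the precision and recall reported in the table. The function names are assumptions made for this example, and the confusion-matrix counts are not available in the source, so only the F1 check uses real reported values.

```python
import math


def f1_from_precision_recall(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp, fp, tn, fn):
    """Accuracy, precision, recall, F1 and MCC from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
        # MCC is defined as 0 here when any marginal total is empty.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# (precision %, recall %, reported F1 %) copied from the metrics table.
REPORTED = {
    "GB": (86.1, 81.5, 83.7),
    "RF": (84.5, 78.1, 81.2),
    "ANN": (82.3, 77.5, 79.8),
    "LR": (80.9, 76.0, 78.4),
}

for name, (precision, recall, reported_f1) in REPORTED.items():
    recomputed = round(f1_from_precision_recall(precision, recall), 1)
    print(name, "reported:", reported_f1, "recomputed:", recomputed)
```

Using the table's rounded precision and recall, the recomputed F1 agrees with the reported value to one decimal place for all four models.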
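Top 15 Clinical Predictors
Average of permutation and Gini importance

| Rank | Predictor | Category | Importance score |
|---|---|---|---|
| 1 | WBC / Leukocytes | Inflammation | 88 |
| 2 | AST | Hepatic | 81 |
| 3 | Creatinine | Renal function | 79 |
| 4 | Platelet Count | Coagulation | 73 |
| 5 | INR Plasma | Coagulation | 68 |
| 6 | Total Bilirubin | Hepatic | 65 |
| 7 | APTT Plasma | Coagulation | 60 |
| 8 | eGFR CKD-EPI | Renal function | 58 |
| 9 | Age | Demographic | 53 |
| 10 | Glucose | Metabolic | 48 |
| 11 | Sodium | Electrolyte | 44 |
| 12 | PT Plasma | Coagulation | 41 |
| 13 | Calcium | Electrolyte | 38 |
| 14 | Chloride | Electrolyte | 35 |
| 15 | BUN | Renal function | 32 |

Key Findings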
1. Gradient Boosting achieved the highest AUC (91.0%), ahead of Random Forest (89.2%), the neural network (87.8%), and Logistic Regression (86.3%).
2. WBC (leukocytes) and creatinine emerged as consistently strong mortality predictors across all models; in the averaged importance ranking they place first and third, with AST second.
3. MICE imputation with 5 iterations handled missing lab values and improved model quality.
4. All four models achieved an AUC above 0.86 (lowest: Logistic Regression at 0.863), supporting the predictive value of the selected feature set.
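The Top 15 Clinical Predictors ranking is described as an average of permutation and Gini importance. The source does not say how the two measures were put on a common scale, so the sketch below shows one plausible approach: rescale each measure so its largest value is 100, then average per feature. The function names and all input numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def scale_to_100(scores):
    """Rescale so the largest score becomes 100 (an assumed normalisation)."""
    top = max(scores.values())
    return {name: 100 * value / top for name, value in scores.items()}


def average_importance(permutation, gini):
    """Average two importance measures per feature and rank descending."""
    scaled_perm = scale_to_100(permutation)
    scaled_gini = scale_to_100(gini)
    combined = {
        name: (scaled_perm[name] + scaled_gini[name]) / 2
        for name in scaled_perm
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical raw importances, NOT values from the source.
permutation = {"WBC": 0.10, "AST": 0.08, "Creatinine": 0.05}
gini = {"WBC": 0.20, "AST": 0.10, "Creatinine": 0.15}

for rank, (name, score) in enumerate(average_importance(permutation, gini), 1):
    print(rank, name, round(score, 1))
```

Averaging after rescaling prevents the measure with the larger raw magnitude from dominating the combined ranking.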