Model Performance

ML Model Comparison

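All four models were evaluated on a stratified 20% hold-out test set. Threshold-dependent metrics (accuracy, F1 score, precision, recall, MCC) are computed at a 0.5 decision threshold; AUC does not depend on the threshold.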
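As a rough illustration of the evaluation setup, the sketch below builds a stratified 20% hold-out split and applies a 0.5 decision threshold using only the standard library. The function names, the seed, the toy labels, and the choice to count a probability of exactly 0.5 as positive are assumptions made for this example, not details taken from the project.

```python
import random
from collections import defaultdict


def stratified_holdout(labels, test_fraction=0.2, seed=42):
    """Return (train_idx, test_idx), sampling test_fraction of each class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


def apply_threshold(probabilities, threshold=0.5):
    """Convert predicted probabilities to 0/1 labels (assumes p >= threshold is positive)."""
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical labels: 30 positive and 70 negative cases.
labels = [1] * 30 + [0] * 70
train_idx, test_idx = stratified_holdout(labels)
predicted = apply_threshold([0.2, 0.5, 0.9])
```

Because the split is done per class, the test set keeps the same 30/70 class ratio as the full data set, which is the point of stratification.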

Gradient Boosting (best model)

200 estimators, learning rate 0.05. Sequential ensemble that corrects prior errors.

AUC: 91.0%
Accuracy: 85.2%
F1 Score: 83.7%
Precision: 86.1%

Random Forest

200 decision trees with Gini importance scores. Robust to outliers.

AUC: 89.2%
Accuracy: 83.4%
F1 Score: 81.2%
Precision: 84.5%

Neural Network (ANN)

128-64-32 MLP with early stopping and 10% validation fraction.

AUC: 87.8%
Accuracy: 82.1%
F1 Score: 79.8%
Precision: 82.3%

Logistic Regression

Interpretable linear model with Wald test p-values for feature significance.

AUC: 86.3%
Accuracy: 81.0%
F1 Score: 78.4%
Precision: 80.9%
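The four model descriptions above could map onto scikit-learn estimators roughly as follows. This is a sketch under assumptions: the source does not name its library, and `random_state`, `max_iter=1000` and all other unlisted hyperparameters are placeholders chosen for this example.

```python
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

# Only the hyperparameters stated in the model descriptions are taken from
# the source; random_state and max_iter are assumptions for reproducibility.
models = {
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=200, learning_rate=0.05, random_state=42
    ),
    "Random Forest": RandomForestClassifier(
        n_estimators=200, criterion="gini", random_state=42
    ),
    "Neural Network (ANN)": MLPClassifier(
        hidden_layer_sizes=(128, 64, 32),
        early_stopping=True,
        validation_fraction=0.1,
        random_state=42,
    ),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}
```

scikit-learn's `LogisticRegression` does not report Wald test p-values, so the feature-significance figures mentioned for that model would have to come from another tool, for example a statsmodels `Logit` fit.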

Full Metrics Table

Threshold = 0.5

Model	AUC	Accuracy	F1 Score	Precision	Recall	MCC
Gradient Boosting (best)	91.0%	85.2%	83.7%	86.1%	81.5%	0.681
Random Forest	89.2%	83.4%	81.2%	84.5%	78.1%	0.649
Neural Network (ANN)	87.8%	82.1%	79.8%	82.3%	77.5%	0.628
Logistic Regression	86.3%	81.0%	78.4%	80.9%	76.0%	0.608
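For readers who want to see how the threshold-dependent columns relate to each other, here is a minimal sketch that derives accuracy, precision, recall, F1 and MCC from confusion-matrix counts. The counts are hypothetical and the helper names are invented for this example. The last line recomputes F1 from the Gradient Boosting precision and recall in the table as a consistency check.

```python
import math


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute threshold-dependent metrics from confusion-matrix counts."""
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": safe_div(tp + tn, tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": safe_div(2 * precision * recall, precision + recall),
        "mcc": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


def f1_from_precision_recall(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return safe_div(2 * precision * recall, precision + recall)


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=80, fp=10, fn=20, tn=90)

# Consistency check against the Gradient Boosting row (precision 86.1%, recall 81.5%).
gb_f1 = round(f1_from_precision_recall(0.861, 0.815), 3)
```

The recomputed `gb_f1` rounds to 0.837, which matches the 83.7% F1 score reported for Gradient Boosting.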

Top 15 Clinical Predictors

Avg. permutation + Gini importance

WBC / Leukocytes (Inflammation)

AST (Hepatic)

Creatinine (Renal Function)

Platelet Count (Coagulation)

INR Plasma (Coagulation)

Total Bilirubin (Hepatic)

APTT Plasma (Coagulation)

eGFR CKD-EPI (Renal Function)

Age (Demographic)

Glucose (Metabolic)

Sodium (Electrolyte)

PT Plasma (Coagulation)

Calcium (Electrolyte)

Chloride (Electrolyte)

BUN (Renal Function)
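One plausible way to combine permutation and Gini importance into a single score is sketched below. The source does not say how the two were averaged, so the normalization step (each set of scores rescaled to sum to 1 before averaging) is an assumption, and the numeric values are made up for illustration only.

```python
def normalize(scores):
    """Rescale non-negative importance scores so they sum to 1."""
    total = sum(scores.values())
    return {name: value / total for name, value in scores.items()}


def average_importance(permutation, gini):
    """Average two normalized importance measures and rank features."""
    perm_norm = normalize(permutation)
    gini_norm = normalize(gini)
    combined = {
        name: (perm_norm[name] + gini_norm[name]) / 2 for name in perm_norm
    }
    # Highest combined importance first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical importance values, not results from the study.
permutation = {"WBC": 0.04, "AST": 0.03, "Creatinine": 0.03}
gini = {"WBC": 0.40, "AST": 0.35, "Creatinine": 0.25}
ranking = average_importance(permutation, gini)
```

Normalizing first keeps either measure from dominating only because it is reported on a larger scale. Permutation importance can be slightly negative in practice, in which case the scores would need clipping before this rescaling.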

Key Findings

Gradient Boosting achieved the highest AUC of 91.0%, outperforming all other models.

WBC (Leukocytes) and Creatinine emerged as the strongest mortality predictors across all models.

MICE imputation with 5 iterations improved model quality by handling missing lab values.

All four models achieved an AUC above 86%, supporting the clinical relevance of the selected feature set.
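The MICE step mentioned in the findings could look something like the following sketch, which uses scikit-learn's `IterativeImputer`, a chained-equations imputer in the spirit of MICE. Whether the project used this class is not stated. Mapping "5 iterations" to `max_iter=5` is an assumption, and the small lab-value matrix is invented for the example.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables the import below)
from sklearn.impute import IterativeImputer

# Hypothetical lab values (columns: WBC, creatinine, sodium) with gaps as NaN.
X = np.array([
    [7.2, 1.0, 140.0],
    [np.nan, 1.4, 138.0],
    [11.5, np.nan, 135.0],
    [9.8, 2.1, np.nan],
    [6.4, 0.9, 141.0],
])

# Each feature with missing values is modeled from the others, cycling up to
# max_iter times. max_iter=5 is an assumed reading of "5 iterations".
imputer = IterativeImputer(max_iter=5, random_state=0)
X_imputed = imputer.fit_transform(X)
```

In a train/test workflow the imputer would normally be fitted on the training split only and then applied to the hold-out set, so that no information from the test data leaks into the imputation.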
