Predicting Pediatric Appendicitis

R, Logistic Regression, PCA, Feature Selection, Naïve Bayes, Decision Tree, Random Forest, Boosting

Summary

I worked on predicting appendicitis cases in pediatric patients admitted with abdominal pain. Using clinical, lab, and ultrasound data, I compared classical statistical models with machine learning approaches to identify the most important predictors of appendicitis. Boosting and Random Forest performed best, with features like WBC count, CRP, Neutrophil percentage, Peritonitis, and Length of Stay emerging as the most informative.

About the Data

The dataset comes from the UCI repository and covers pediatric patients in Germany between 2016 and 2021. It has 782 rows and 58 columns (clinical, lab, ultrasound, and demographic features). After cleaning, I reduced it to 554 rows and 26 features for modeling.

Methods

Feature reduction + Logistic Regression

Forward-backward feature selection
PCA
Ridge regression
Lasso regression
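As a rough illustration of the first two approaches in this list, here is a minimal base-R sketch of stepwise selection and PCA followed by logistic regression. The data are simulated and the column names (wbc, crp, neutrophil, noise1, noise2, appendicitis) are hypothetical stand-ins, not the project's actual variables or code. Ridge and lasso are not shown because they need an extra package (commonly glmnet).

```r
# Simulated stand-in data; all column names are hypothetical.
set.seed(42)
n <- 300
dat <- data.frame(
  wbc        = rnorm(n, mean = 12, sd = 4),
  crp        = rexp(n, rate = 1 / 30),
  neutrophil = rnorm(n, mean = 70, sd = 12),
  noise1     = rnorm(n),
  noise2     = rnorm(n)
)
lin <- -6 + 0.25 * dat$wbc + 0.02 * dat$crp + 0.03 * dat$neutrophil
dat$appendicitis <- rbinom(n, size = 1, prob = plogis(lin))

# Forward-backward (stepwise) selection by AIC, starting from the null model.
null_fit <- glm(appendicitis ~ 1, data = dat, family = binomial)
step_fit <- step(
  null_fit,
  scope = list(lower = ~ 1,
               upper = ~ wbc + crp + neutrophil + noise1 + noise2),
  direction = "both",
  trace = 0
)

# PCA on standardized predictors, then logistic regression on the
# leading components (3 here, chosen arbitrarily for the sketch).
predictors <- dat[, setdiff(names(dat), "appendicitis")]
pca <- prcomp(predictors, center = TRUE, scale. = TRUE)
pc_dat <- data.frame(pca$x[, 1:3], appendicitis = dat$appendicitis)
pca_fit <- glm(appendicitis ~ ., data = pc_dat, family = binomial)

print(formula(step_fit))        # predictors kept by stepwise selection
print(summary(pca)$importance)  # variance explained per component
```

The stepwise model keeps named predictors, while the PCA model is fitted on linear combinations of all of them, which is why the latter is harder to interpret.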

Tree-based models and Naïve Bayes

Naïve Bayes
Decision Tree
Random Forest
XGBoost
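For orientation, the simplest model in this list can be sketched with the rpart package, which ships with standard R installations. This is an assumption about tooling, since the write-up does not say which packages were used. The data are simulated and the column names are hypothetical. Naïve Bayes, Random Forest, and XGBoost each need an additional package and are not shown.

```r
library(rpart)

# Simulated stand-in data; column names are hypothetical.
set.seed(3)
n <- 400
dat <- data.frame(
  wbc = rnorm(n, mean = 12, sd = 4),
  crp = rexp(n, rate = 1 / 30)
)
risk <- dat$wbc + 0.05 * dat$crp + rnorm(n, sd = 2)
dat$diagnosis <- factor(ifelse(risk > 14, "appendicitis", "no appendicitis"))

# Hold out 30% of the rows for testing.
train_idx <- sample(n, size = round(0.7 * n))
train <- dat[train_idx, ]
test  <- dat[-train_idx, ]

# Fit a classification tree and score it on the held-out rows.
tree_fit <- rpart(diagnosis ~ wbc + crp, data = train, method = "class")
pred <- predict(tree_fit, newdata = test, type = "class")

conf_mat <- table(predicted = pred, actual = test$diagnosis)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```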

Before building the models, I explored the dataset with some visualizations to better understand its structure.

The dataset is balanced across classes, so no re-sampling was needed. The Weight variable (in kilograms) is approximately normally distributed, with a mean of around 40 kg.

Boxplots of WBC count and neutrophil percentage vs diagnosis

Patients with appendicitis have a higher median WBC Count, and their interquartile range sits higher than that of patients without appendicitis. Neutrophil Percentage shows a similar trend, suggesting both features could be useful predictors.
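A comparison like this can be checked numerically as well as visually. The sketch below uses simulated values and hypothetical column names (diagnosis, wbc), not the real data. It prints the quartiles per group and draws the matching boxplot.

```r
# Simulated stand-in data; values and column names are hypothetical.
set.seed(1)
dat <- data.frame(
  diagnosis = factor(rep(c("no appendicitis", "appendicitis"), each = 100)),
  wbc       = c(rnorm(100, mean = 9, sd = 3), rnorm(100, mean = 14, sd = 4))
)

# Lower quartile, median, and upper quartile of WBC count per group.
q <- sapply(split(dat$wbc, dat$diagnosis), quantile,
            probs = c(0.25, 0.5, 0.75))
print(round(q, 1))

# The same comparison as a boxplot.
boxplot(wbc ~ diagnosis, data = dat,
        xlab = "Diagnosis", ylab = "WBC count")
```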

Evaluation

Feature reduction + Logistic Regression

Forward-backward selection – best balance of accuracy and interpretability, with higher sensitivity than the other logistic regression models at specificity above 0.6.
PCA – good predictive performance with the first 8 components, but lacked interpretability for clinical use.
Ridge regression – did not outperform forward-backward or PCA.
Lasso regression – also underperformed compared to forward-backward and PCA.

ROC curves for logistic regression models with feature reduction
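To make "sensitivity at a given specificity" concrete, here is a small base-R sketch that builds ROC points from predicted scores and reports the best sensitivity available above a specificity floor. The helper names (roc_points, sens_at_spec) and the toy scores are hypothetical and only illustrate the comparison. They are not the project's evaluation code.

```r
# Sensitivity and specificity at every distinct score threshold.
roc_points <- function(score, label) {
  thresholds <- sort(unique(score), decreasing = TRUE)
  t(sapply(thresholds, function(th) {
    pred <- score >= th
    c(threshold   = th,
      sensitivity = sum(pred & label == 1) / sum(label == 1),
      specificity = sum(!pred & label == 0) / sum(label == 0))
  }))
}

# Highest sensitivity among thresholds whose specificity exceeds min_spec.
sens_at_spec <- function(roc, min_spec = 0.6) {
  ok <- roc[, "specificity"] > min_spec
  max(roc[ok, "sensitivity"])
}

# Toy labels (1 = appendicitis) and scores from two hypothetical models.
label   <- c(1, 1, 1, 1, 0, 1, 0, 0, 0, 0)
score_a <- c(0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1)
score_b <- c(0.9, 0.3, 0.7, 0.2, 0.8, 0.5, 0.6, 0.4, 0.1, 0.15)

sens_a <- sens_at_spec(roc_points(score_a, label))
sens_b <- sens_at_spec(roc_points(score_b, label))
print(c(model_a = sens_a, model_b = sens_b))
```

Comparing models this way focuses on the part of the ROC curve that matters for the chosen operating range rather than on a single overall number.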

Tree-based models and Naïve Bayes

XGBoost – best-performing model overall, highest accuracy and sensitivity.
Random Forest – strong second place, consistently high accuracy.
Decision Tree – simple and interpretable, but less accurate than ensemble methods.
Naïve Bayes – similar performance to Decision Tree, but required more data preparation without clear gains.

Takeaways

Boosting and Random Forest gave the best predictive performance.
Key predictors of appendicitis were inflammation and infection markers (WBC, Neutrophil %, CRP, Peritonitis) and hospital stay length.
Simpler models were interpretable but underperformed compared to ensemble methods.
Total length of stay is not known before diagnosis, so a sensible next step is to exclude Length_of_Stay from the models and re-evaluate them.
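Dropping a leaky predictor from an existing logistic regression can be sketched as follows. Only Length_of_Stay is a name taken from this write-up. The other column names and the simulated data are assumptions made for illustration.

```r
# Simulated stand-in data; WBC_Count, CRP, and Diagnosis are hypothetical names.
set.seed(7)
n <- 200
dat <- data.frame(
  WBC_Count      = rnorm(n, mean = 12, sd = 4),
  CRP            = rexp(n, rate = 1 / 30),
  Length_of_Stay = rpois(n, lambda = 3)
)
dat$Diagnosis <- rbinom(n, size = 1, prob = plogis(-3 + 0.25 * dat$WBC_Count))

# Model with every predictor, including the one unknown at diagnosis time.
full_fit <- glm(Diagnosis ~ ., data = dat, family = binomial)

# Refit without Length_of_Stay; update() reuses the original call.
no_los_fit <- update(full_fit, . ~ . - Length_of_Stay)

print(names(coef(full_fit)))
print(names(coef(no_los_fit)))
```

The reduced model should then be scored on the same held-out data as the original, so the cost of removing the variable is visible.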