Predicting Pediatric Appendicitis

Summary
I worked on predicting appendicitis cases in pediatric patients admitted with abdominal pain. Using clinical, lab, and ultrasound data, I compared classical statistical models with machine learning approaches to identify the most important predictors of appendicitis. Boosting and Random Forest performed best, with features like WBC count, CRP, Neutrophil percentage, Peritonitis, and Length of Stay emerging as the most informative.
About the Data
The dataset comes from the UCI Machine Learning Repository and covers pediatric patients in Germany between 2016 and 2021. It includes 782 rows and 58 columns (clinical, lab, ultrasound, and demographic features). After cleaning, I reduced it to 554 rows and 26 features for modeling.
Methods
Feature reduction + Logistic Regression
- Forward-backward feature selection
- PCA
- Ridge regression
- Lasso regression
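A minimal sketch of how the penalized and PCA-based logistic regressions listed above could be compared, assuming scikit-learn and a synthetic stand-in for the cleaned data (554 rows, 26 features). The regularization strengths (C values) and the 5-fold cross-validation are illustrative assumptions, not the settings used in this project. Forward-backward selection is left out because scikit-learn's SequentialFeatureSelector searches in only one direction.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in with the post-cleaning shape (NOT the real patient data).
X, y = make_classification(
    n_samples=554, n_features=26, n_informative=6, random_state=0
)

# Each candidate standardizes the features first, since penalties and PCA
# are both sensitive to feature scale.
candidates = {
    "ridge": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l2", C=1.0, max_iter=1000),
    ),
    "lasso": make_pipeline(
        StandardScaler(),
        LogisticRegression(penalty="l1", C=0.1, solver="liblinear"),
    ),
    "pca8": make_pipeline(
        StandardScaler(),
        PCA(n_components=8),  # first 8 principal components
        LogisticRegression(max_iter=1000),
    ),
}

# Mean 5-fold cross-validated accuracy per candidate.
cv_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in candidates.items()
}

# Lasso doubles as feature selection: count the coefficients it kept non-zero.
lasso = candidates["lasso"].fit(X, y)
coef = lasso.named_steps["logisticregression"].coef_.ravel()
n_kept = int(np.sum(coef != 0))

for name, acc in cv_accuracy.items():
    print(f"{name}: {acc:.3f}")
print(f"lasso kept {n_kept} of {X.shape[1]} features")
```

Wrapping the scaler inside each pipeline keeps the scaling parameters from leaking across cross-validation folds.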
Tree-based models and Naïve Bayes
- Naïve Bayes
- Decision Tree
- Random Forest
- XGBoost
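The second model family can be sketched the same way. This is an illustration on synthetic data, not the project's code: scikit-learn's GradientBoostingClassifier is swapped in for XGBoost to avoid an extra dependency, and all hyperparameters are assumed defaults.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in with the post-cleaning shape (NOT the real patient data).
X, y = make_classification(
    n_samples=554, n_features=26, n_informative=6, random_state=0
)

models = {
    "naive_bayes": GaussianNB(),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    # Stand-in for XGBoost: a different gradient boosting implementation.
    "boosting": GradientBoostingClassifier(random_state=0),
}

# Mean 5-fold cross-validated accuracy per model.
tree_cv_accuracy = {
    name: cross_val_score(model, X, y, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}

for name, acc in tree_cv_accuracy.items():
    print(f"{name}: {acc:.3f}")
```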
Before building the models, I explored the dataset with some visualizations to better understand its structure.

The dataset is balanced across classes, so no resampling was needed. The Weight variable, measured in kilograms, is approximately normally distributed with a mean of around 40 kg.

Positive cases of appendicitis show higher interquartile ranges and medians for WBC Count. A similar trend is observed for Neutrophil Percentage, suggesting these features could be useful predictors.
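The numbers behind a box plot comparison like this one can be computed directly. The sketch below assumes pandas and uses made-up values; the column names Diagnosis and WBC_Count and the distributions are hypothetical placeholders, not the real data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Made-up WBC values for illustration only; the appendicitis group is
# generated with a higher mean on purpose.
df = pd.DataFrame({
    "Diagnosis": ["appendicitis"] * 200 + ["no appendicitis"] * 200,
    "WBC_Count": np.concatenate([
        rng.normal(14, 4, 200),
        rng.normal(9, 3, 200),
    ]),
})

# Quartiles per class: these are the box edges and center line of a box plot.
summary = df.groupby("Diagnosis")["WBC_Count"].agg(
    q1=lambda s: s.quantile(0.25),
    median="median",
    q3=lambda s: s.quantile(0.75),
)
summary["iqr"] = summary["q3"] - summary["q1"]  # box height

print(summary.round(2))
```

The same call works for Neutrophil Percentage by swapping the column name.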
Evaluation
Feature reduction + Logistic Regression
- Forward-backward selection – best balance of accuracy and interpretability; higher sensitivity than the other approaches in this group when specificity is above 0.6.
- PCA – good predictive performance with the first 8 components, but lacked interpretability for clinical use.
- Ridge regression – did not outperform forward-backward or PCA.
- Lasso regression – also underperformed compared to forward-backward and PCA.

Tree-based models and Naïve Bayes
- XGBoost – best-performing model overall, highest accuracy and sensitivity.
- Random Forest – strong second place, consistently high accuracy.
- Decision Tree – simple and interpretable, but less accurate than ensemble methods.
- Naïve Bayes – similar performance to Decision Tree, but required more data preparation without clear gains.
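For reference, sensitivity and specificity come straight from the confusion matrix, and "sensitivity at specificity > 0.6" amounts to scanning decision thresholds. The sketch below is plain Python with toy labels and scores; the function names are hypothetical helpers, and both classes are assumed to be present.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def best_sensitivity_at_specificity(y_true, scores, min_specificity=0.6):
    """Scan thresholds; return (threshold, sensitivity, specificity) with the
    highest sensitivity among thresholds whose specificity exceeds the floor.
    Returns None if no threshold qualifies."""
    best = None
    for thr in sorted(set(scores)):
        y_pred = [1 if s >= thr else 0 for s in scores]
        sens, spec = sensitivity_specificity(y_true, y_pred)
        if spec > min_specificity and (best is None or sens > best[1]):
            best = (thr, sens, spec)
    return best


# Toy example: 4 positive and 6 negative cases with predicted probabilities.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.35, 0.2, 0.1, 0.05]

best = best_sensitivity_at_specificity(y_true, scores)
print(best)  # threshold 0.4: sensitivity 0.75, specificity 4/6
```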

Takeaways
- Boosting and Random Forest gave the best predictive performance.
- Key predictors of appendicitis were inflammation and infection markers (WBC, Neutrophil %, CRP, Peritonitis) and length of hospital stay.
- Simpler models were interpretable but underperformed compared to ensemble methods.
- Total length of stay is not known before diagnosis, so a sensible next step is to exclude Length_of_Stay from the models.
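A rough sketch of that next step, assuming pandas and scikit-learn: fit the same model with and without Length_of_Stay and compare cross-validated accuracy. The data here is synthetic and the column names other than Length_of_Stay are hypothetical, so the printed numbers say nothing about the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 554
y = rng.integers(0, 2, n)  # synthetic labels: 1 = appendicitis

# Synthetic features; Length_of_Stay is generated to depend on the outcome,
# mimicking a variable that is only known after diagnosis.
df = pd.DataFrame({
    "WBC_Count": rng.normal(9, 3, n) + 4 * y,
    "CRP": rng.normal(10, 8, n) + 20 * y,
    "Length_of_Stay": rng.normal(3, 1, n) + 2 * y,
    "Weight": rng.normal(40, 12, n),
})


def cv_accuracy(features):
    """Mean 5-fold cross-validated accuracy of a Random Forest."""
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    return cross_val_score(model, features, y, cv=5).mean()


pre_diagnosis = df.drop(columns=["Length_of_Stay"])

acc_all = cv_accuracy(df)
acc_pre_diagnosis = cv_accuracy(pre_diagnosis)

print(f"all features:           {acc_all:.3f}")
print(f"without Length_of_Stay: {acc_pre_diagnosis:.3f}")
```

A drop in accuracy after removing the column would be expected and is the more honest estimate of how the model would perform at admission time.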