Comparative Evaluation of Pretrained Models for Lung Disease Classification
Benchmarking transfer-learning architectures on chest X-ray classification
The problem
Hospital radiology departments increasingly want to test off-the-shelf deep models on chest X-rays, but published accuracy figures are scattered across papers that use different datasets and metrics, so they cannot be compared directly.
My contribution
Benchmarked four pretrained CNN architectures (ResNet-50, DenseNet-121, EfficientNet-B0, VGG-16) on a unified chest X-ray dataset, controlling for input pipeline, fine-tuning regimen, and evaluation protocol. Measured accuracy, AUC, sensitivity, specificity, and inference time per architecture.
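A minimal sketch of how the per-architecture metrics listed above could be computed, using only the standard library. It assumes a binary task (disease vs. normal) with both classes present in the evaluation set and a 0.5 decision threshold; none of that is stated in the project, and the function names `binary_metrics` and `mean_inference_time` are hypothetical.

```python
import time


def binary_metrics(y_true, y_score, threshold=0.5):
    """Accuracy, sensitivity, specificity, and AUC for 0/1 labels.

    Assumes both classes appear in y_true; otherwise a division by zero occurs.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # AUC as the probability that a random positive outscores a random
    # negative, with ties counted as half a win (Mann-Whitney formulation).
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": wins / (len(pos) * len(neg)),
    }


def mean_inference_time(predict_fn, batch, repeats=10):
    """Average wall-clock seconds per call of predict_fn on one batch."""
    predict_fn(batch)  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        predict_fn(batch)
    return (time.perf_counter() - start) / repeats
```

Running the same two functions on every model's held-out predictions keeps the evaluation protocol identical across architectures; for a Keras model, `predict_fn` could be `model.predict`.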
Outcome
Apples-to-apples comparison showing which architectures actually generalize on lung-disease X-rays under matched conditions. Reproducible notebook with full benchmarking pipeline.
What I learned
Controlled comparisons matter: it’s easy to mistake “I tuned model A more carefully” for “model A is better.” The biggest practical takeaway is that for medical-imaging transfer learning, dataset size matters more than backbone size past a certain point.
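One way to keep the comparison controlled is to route every backbone through a single frozen configuration, so that no model can quietly receive extra tuning. The sketch below is an illustration under assumptions, not the project's actual pipeline: the `BenchmarkConfig` fields and their values are placeholders, and `build_model` is a hypothetical helper built on the standard `tf.keras.applications` constructors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkConfig:
    # Placeholder values, not the settings used in the project.
    input_shape: tuple = (224, 224, 3)
    num_classes: int = 2
    learning_rate: float = 1e-4
    epochs: int = 10
    batch_size: int = 32
    seed: int = 42


# Display name -> constructor name in tf.keras.applications.
BACKBONES = {
    "ResNet-50": "ResNet50",
    "DenseNet-121": "DenseNet121",
    "EfficientNet-B0": "EfficientNetB0",
    "VGG-16": "VGG16",
}


def build_model(name, cfg):
    """Build one backbone with the same head, optimizer, and seed as all others."""
    # Imported lazily so the configuration can be inspected without TensorFlow.
    import tensorflow as tf

    tf.keras.utils.set_random_seed(cfg.seed)
    backbone_cls = getattr(tf.keras.applications, BACKBONES[name])
    backbone = backbone_cls(
        include_top=False,
        weights="imagenet",
        input_shape=cfg.input_shape,
        pooling="avg",
    )
    backbone.trainable = False  # train only the new head in this sketch

    # Note: each Keras application has its own preprocess_input function;
    # that per-backbone step is omitted here for brevity.
    inputs = tf.keras.Input(shape=cfg.input_shape)
    features = backbone(inputs, training=False)
    outputs = tf.keras.layers.Dense(cfg.num_classes, activation="softmax")(features)

    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer=tf.keras.optimizers.Adam(cfg.learning_rate),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Because the configuration is frozen, a loop such as `for name in BACKBONES: build_model(name, cfg)` trains every architecture under the same learning rate, epochs, batch size, and seed; any remaining difference in results is then more plausibly due to the backbone itself.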
- Type: Model
- Role: Solo · model evaluation and benchmarking
- Timeframe: 2024
- Stack: Python, TensorFlow, Keras, Matplotlib
- Tags: Deep Learning, CNN, Medical Imaging, Chest X-rays