Comparative Evaluation of Pretrained Models for Lung Disease Classification

Benchmarking transfer-learning architectures on chest X-ray classification

4 architectures: apples-to-apples benchmark

Unified: single dataset and metrics

Reproducible: open notebook

The problem

Hospital radiology departments increasingly want to test off-the-shelf deep models on chest X-rays, but published accuracy comparisons are scattered across papers that use different datasets and metrics, so the reported numbers cannot be compared directly.

My contribution

Benchmarked four pretrained CNN architectures (ResNet-50, DenseNet-121, EfficientNet-B0, VGG-16) on a unified chest X-ray dataset, controlling for input pipeline, fine-tuning regimen, and evaluation protocol. Measured accuracy, AUC, sensitivity, specificity, and inference time per architecture.
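The four classification metrics listed above can be computed from the same set of predictions, which is what keeps the comparison consistent across architectures. The sketch below is an illustration rather than the notebook's actual code: it assumes a binary task (disease vs. normal), a hypothetical helper name `binary_metrics`, and a 0.5 decision threshold. A multi-class setup would apply the same logic one-vs-rest per class.

```python
def binary_metrics(y_true, y_score, threshold=0.5):
    """Return accuracy, sensitivity, specificity, and ROC AUC.

    y_true  : iterable of 0/1 ground-truth labels (1 = disease present)
    y_score : iterable of predicted probabilities for the positive class
    threshold is an assumed cut-off, not a value taken from the project.
    """
    y_true = list(y_true)
    y_score = list(y_score)
    y_pred = [1 if s >= threshold else 0 for s in y_score]

    # Confusion-matrix counts
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute these metrics")

    # AUC as the probability that a random positive outscores a random
    # negative (ties count as half), which equals the area under the ROC curve.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auc": wins / (len(pos) * len(neg)),
    }
```

Running the same function on every model's held-out predictions means any difference in the numbers comes from the model, not from how the metric was computed. The pairwise AUC loop is quadratic in the number of samples, so for a large test set a library routine would be the practical choice.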

Outcome

An apples-to-apples comparison showing which architectures actually generalize on lung-disease X-rays under matched conditions, delivered as a reproducible notebook containing the full benchmarking pipeline.

What I learned

Controlled comparisons matter: it’s easy to mistake “I tuned model A more carefully” for “model A is better.” The biggest practical takeaway is that for medical-imaging transfer learning, dataset size matters more than backbone size past a certain point.
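One way to guard against the "tuned model A more carefully" trap is to put every shared setting in a single immutable config and pass that same object to each architecture. The sketch below illustrates the idea together with a simple inference timer. All names (`BenchmarkConfig`, `time_inference`, `run_benchmark`) and all default values are hypothetical, not taken from the project notebook, and each model is represented only by a prediction callable.

```python
import time
from dataclasses import dataclass
from statistics import median


@dataclass(frozen=True)
class BenchmarkConfig:
    """Settings shared by every architecture (values are illustrative)."""
    image_size: int = 224
    batch_size: int = 32
    epochs: int = 10
    learning_rate: float = 1e-4
    seed: int = 42


def time_inference(predict_fn, batch, warmup=2, repeats=5):
    """Median wall-clock seconds for one call of predict_fn on batch."""
    for _ in range(warmup):
        predict_fn(batch)  # warm-up calls are not timed
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        predict_fn(batch)
        timings.append(time.perf_counter() - start)
    return median(timings)


def run_benchmark(models, config, batch):
    """Time each model under the same config.

    models: dict mapping an architecture name to a prediction callable.
    """
    results = {}
    for name, predict_fn in models.items():
        results[name] = {
            "config": config,  # the identical object for every model
            "seconds_per_batch": time_inference(predict_fn, batch),
        }
    return results
```

Because the dataclass is frozen, an attempt to change a setting for one model raises an error instead of silently skewing the comparison. Taking the median over several timed runs after a warm-up makes the timing less sensitive to one-off startup costs.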

Type: Model
Role: Solo · model evaluation and benchmarking
Timeframe: 2024
Stack: Python, TensorFlow, Keras, Matplotlib
Tags: Deep Learning, CNN, Medical Imaging, Chest X-rays
