Attentive Diagnostics: Enhancing Chest X-Ray Analysis Through Integrated Attention Mechanisms and SVM Classification
Juan David Gomez Villalba
Department of Computer Science
University of London
Student Email
Personal Email
Deep Learning On A Public Dataset
Final Project For CM3070 Computer Science Final Project
For a comprehensive understanding of the project, methodologies, results, and conclusions, please refer to our final report:
"Attentive Diagnostics" investigates how artificial intelligence can improve medical diagnostics. The project's centerpiece is the integration of attention mechanisms with convolutional neural networks (CNNs), combined with Support Vector Machine (SVM) classification, to improve diagnostic accuracy when analyzing chest X-rays (CXRs).
Chest X-rays are pivotal in diagnosing various thoracic diseases. However, their interpretation can be complex and error-prone. Leveraging recent strides in AI, this project introduces a methodical approach to refining the diagnosis process. By combining CNNs with attention mechanisms and SVM classification, this project sets out to create an efficient, AI-assisted method for medical image analysis.
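One common way to combine a CNN with an SVM is to use the CNN as a feature extractor and train the SVM on the extracted feature vectors. The sketch below illustrates only the SVM half under that assumption, and it is not the project's code: it trains a linear SVM with hinge-loss subgradient descent on toy 2-D vectors that stand in for CNN embeddings. The function names `train_linear_svm` and `predict`, the hyperparameters, and the toy data are all hypothetical. For a multi-label task such as CXR pathology detection, one binary classifier of this kind per pathology would be a plausible setup.

```python
import random


def train_linear_svm(features, labels, lr=0.1, lam=0.01, epochs=50, seed=0):
    """Fit a linear SVM by subgradient descent on the regularised hinge loss.

    features: list of equal-length float vectors (stand-ins for CNN embeddings)
    labels:   list of +1 / -1 class labels
    """
    rng = random.Random(seed)
    w = [0.0] * len(features[0])
    b = 0.0
    order = list(range(len(features)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = features[i], labels[i]
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # The L2 penalty shrinks the weights on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss only contributes when the margin is violated.
            if margin < 1.0:
                w = [wj + lr * y * xj for wj, xj in zip(w, x)]
                b += lr * y
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0 else -1


# Toy, linearly separable "embeddings" (hypothetical data).
train_x = [[2.0, 1.5], [1.5, 2.5], [2.5, 2.0], [3.0, 1.0],
           [-2.0, -1.5], [-1.5, -2.5], [-2.5, -2.0], [-1.0, -3.0]]
train_y = [1, 1, 1, 1, -1, -1, -1, -1]

w, b = train_linear_svm(train_x, train_y)
train_predictions = [predict(w, b, x) for x in train_x]
```

In practice a library SVM would replace the hand-written training loop; the loop is spelled out here only to show what the classifier optimises.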
This project uses the NIH ChestX-ray14 dataset, a large public collection of frontal-view chest X-rays labeled with 14 thoracic pathologies.
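In the public ChestX-ray14 release, each image's findings are stored as a pipe-separated string, with `No Finding` marking images that carry none of the 14 labels. The sketch below shows one way to turn such a string into a multi-hot target vector. The helper name `encode_labels` and the label ordering are assumptions for illustration, and the notebooks may encode labels differently.

```python
# The 14 pathology labels of ChestX-ray14 (this ordering is an assumption).
PATHOLOGIES = [
    "Atelectasis", "Cardiomegaly", "Effusion", "Infiltration", "Mass",
    "Nodule", "Pneumonia", "Pneumothorax", "Consolidation", "Edema",
    "Emphysema", "Fibrosis", "Pleural_Thickening", "Hernia",
]


def encode_labels(finding_labels):
    """Convert a pipe-separated label string into a 14-element 0/1 vector."""
    present = set(finding_labels.split("|"))
    # "No Finding" matches none of the pathology names, so it maps to all zeros.
    return [1 if name in present else 0 for name in PATHOLOGIES]


example = encode_labels("Cardiomegaly|Effusion")
```

Because an image can show several pathologies at once, the target is a multi-hot vector and not a single class index.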
The project consists of several phases, each dedicated to a different facet of model development, from prototype models using basic CNN architectures to the integration of ResNet-18 with the Convolutional Block Attention Module (CBAM). Evaluation prioritizes AUC-ROC, precision-recall curves, and F1 scores.
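To make the evaluation metrics concrete, here is a minimal, dependency-free sketch of AUC-ROC and F1 for a single pathology. The function names, the toy labels and scores, and the 0.5 decision threshold are assumptions for illustration; in a multi-label setting the same computation would be repeated per pathology.

```python
def roc_auc(y_true, y_score):
    """AUC-ROC as the probability that a positive outscores a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for binary 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy ground truth and predicted probabilities for one pathology.
y_true = [1, 0, 1, 0, 1]
y_score = [0.9, 0.3, 0.6, 0.7, 0.8]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed threshold of 0.5

auc = roc_auc(y_true, y_score)   # 5 of 6 positive/negative pairs are ordered correctly
f1 = f1_score(y_true, y_pred)    # precision 0.75, recall 1.0
```

AUC-ROC is threshold-free, while F1 depends on the chosen threshold, which is one reason to report both.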
The final model is located here: Baseline Model Notebook
For an end-to-end data to model example, see: ResNet-18 CBAM Notebook
- Baseline
- Baseline Final
- Mobile Net
- Resnet-18-CBAM
- Resnet-18 Old
- VGGNet19_transferLearning
- VGG_SVM
- VVGNet-16
The implementation of CBAM has notably boosted model performance, as reflected by the enhanced AUC scores across different pathologies.
The integration of attention mechanisms via CBAM with ResNet-18 leads to the most promising results. It not only improves the model's overall ability to differentiate between pathologies but also shows that focusing the model's attention on specific areas of an image can significantly enhance performance.
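For readers unfamiliar with CBAM, the following NumPy sketch shows the two attention steps applied to a single feature map: channel attention first, then spatial attention. It is a simplified illustration with random weights, not the trained module from the notebooks; the function names, tensor shapes, reduction ratio, and the `(C, H, W)` layout are assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Per-channel weights in (0, 1) for a feature map of shape (C, H, W).

    A shared two-layer MLP (w1: (C//r, C), w2: (C, C//r)) is applied to the
    average-pooled and max-pooled channel descriptors, and the results are summed.
    """
    avg = feat.mean(axis=(1, 2))
    mx = feat.max(axis=(1, 2))

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU between the two layers

    return sigmoid(mlp(avg) + mlp(mx))


def spatial_attention(feat, kernel):
    """Per-location weights in (0, 1), shape (H, W), from a (2, k, k) kernel."""
    # Pool across channels, then stack the two maps as a 2-channel input.
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    h, w = feat.shape[1:]
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return sigmoid(out)


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to one feature map."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]


rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 6, 6))           # C=8, H=W=6 (toy sizes)
w1 = rng.standard_normal((2, 8)) * 0.1          # reduction ratio r=4
w2 = rng.standard_normal((8, 2)) * 0.1
kernel = rng.standard_normal((2, 7, 7)) * 0.1   # 7x7 spatial kernel
refined = cbam(feat, w1, w2, kernel)
```

Both attention maps only rescale the features, so the output keeps the input's shape and can be dropped into a residual block without changing the surrounding layers.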
The prediction function pre-processes a specified image from the dataset, adjusts it to the model's input format, and then uses the model to predict the likelihood of each pathology.
The model outputs probabilities that indicate its confidence in the presence of each condition in the image. Visually, the function presents both the image and the model's predictions side by side. It displays the image on the left with its true labels for reference. On the right, it features a horizontal bar chart that represents the model's predicted probabilities for each class label, allowing for an immediate visual assessment of the model's performance.
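A function of the kind described above could look roughly like the sketch below. Everything here is an assumption for illustration: the names `preprocess` and `show_prediction`, the 224-pixel input size, the channels-last batch layout, and the `predict_fn` callable that stands in for the trained model. Matplotlib is imported inside the plotting function so the preprocessing helper can be used without it.

```python
import numpy as np


def preprocess(image, size=224):
    """Scale an 8-bit grayscale image to [0, 1] and resize it.

    Uses nearest-neighbour sampling and returns a batch of shape
    (1, size, size, 1); the channels-last layout is an assumption.
    """
    image = np.asarray(image, dtype=np.float32) / 255.0
    rows = np.arange(size) * image.shape[0] // size
    cols = np.arange(size) * image.shape[1] // size
    resized = image[rows][:, cols]
    return resized[None, :, :, None]


def show_prediction(image, true_labels, class_names, predict_fn):
    """Plot the image (left) next to the predicted probabilities (right).

    predict_fn is a hypothetical callable that maps a preprocessed batch
    to one probability per entry in class_names.
    """
    import matplotlib.pyplot as plt  # only needed when plotting

    probs = np.asarray(predict_fn(preprocess(image))).ravel()
    fig, (ax_img, ax_bar) = plt.subplots(1, 2, figsize=(12, 5))

    # Left panel: the X-ray with its ground-truth labels as the title.
    ax_img.imshow(image, cmap="gray")
    ax_img.set_title("True: " + ", ".join(true_labels or ["No Finding"]))
    ax_img.axis("off")

    # Right panel: one horizontal bar per class label.
    ax_bar.barh(class_names, probs)
    ax_bar.set_xlim(0, 1)
    ax_bar.set_xlabel("Predicted probability")
    fig.tight_layout()
    return fig, probs
```

Fixing the x-axis to the range 0 to 1 keeps bar lengths comparable across images, which makes side-by-side inspection of several predictions easier.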
- Clone this repository.
- Run the Jupyter notebooks to recreate the models and observe their performances.
This project is licensed under the MIT License - see the LICENSE.md file for details.
A heartfelt thanks to the mentors, peers, and academic facilitators who guided and supported this project's fruition.
For queries, please reach out to Juan David Gomez Villalba at [email protected].