This project is focused on the accurate and efficient classification of sepsis cases using the FastAPI framework. Sepsis is a critical medical condition that requires prompt identification and treatment. This project aims to provide a streamlined solution for healthcare professionals to classify sepsis cases quickly and effectively.

aliduabubakari/Sepsis-Classification-with-FastAPI

Sepsis-Classification-with-FastAPI

This project is focused on the accurate and efficient classification of sepsis cases using the FastAPI framework. Sepsis is a critical medical condition that requires prompt identification and treatment.

This project aims to provide a streamlined solution for healthcare professionals to classify sepsis cases quickly and effectively.

Project Overview

The "Sepsis Classification with FastAPI" project aims to develop an accurate and efficient classification system for sepsis cases using the FastAPI framework. Sepsis is a life-threatening condition that requires immediate medical attention. This project addresses the critical need for timely identification and classification of sepsis cases to facilitate prompt treatment and improve patient outcomes.

The objectives of the project are as follows:

  1. Train a machine learning model on a diverse dataset of sepsis cases to accurately predict the likelihood of sepsis in patients.

  2. Utilize the FastAPI framework to create a user-friendly and efficient web interface for healthcare professionals to interact with the sepsis classification model.

  3. Improve diagnostic capabilities by achieving high accuracy, sensitivity, and specificity in sepsis classification.

  4. Provide a comprehensive and scalable solution that can be easily deployed in real-time healthcare environments.

Key challenges in this project include acquiring and preprocessing a reliable sepsis dataset, selecting an appropriate machine learning algorithm, optimizing the model's performance, and deploying the system in a secure and efficient manner.

Summary

| Code | Name | Published Article | Deployed App | Streamlit App |
|------|------|-------------------|--------------|---------------|
| LP6 | Sepsis Prediction App with FastAPI and Streamlit | Medium Article | FastAPI App | Streamlit App |

Project Setup

To set up the project environment, follow these steps:

  1. Clone the repository and move into it:

    git clone https://github.com/aliduabubakari/Sepsis-Classification-with-FastAPI.git
    cd Sepsis-Classification-with-FastAPI

  2. Create and activate a virtual environment:
  • Windows:

    python -m venv venv
    venv\Scripts\activate
  • Linux & macOS:

    python3 -m venv venv
    source venv/bin/activate

  3. Install the required dependencies inside the activated environment:

    pip install -r requirements.txt

You can copy each command above and run them in your terminal to easily set up the project environment.

Data

The data used in this project consists of a diverse collection of sepsis cases obtained from Sepsis.

Data Fields

| Column Name | Data Features | Description |
|-------------|---------------|-------------|
| ID | N/A | Unique number to represent patient ID |
| PRG | Attribute 1 | Plasma glucose |
| PL | Attribute 2 | Blood Work Result-1 (mu U/ml) |
| PR | Attribute 3 | Blood Pressure (mm Hg) |
| SK | Attribute 4 | Blood Work Result-2 (mm) |
| TS | Attribute 5 | Blood Work Result-3 (mu U/ml) |
| M11 | Attribute 6 | Body mass index (weight in kg/(height in m)^2) |
| BD2 | Attribute 7 | Blood Work Result-4 (mu U/ml) |
| Age | Attribute 8 | Patient's age (years) |
| Insurance | N/A | If a patient holds a valid insurance card |
| Sepsis | Target | Positive: if a patient in ICU will develop sepsis; Negative: otherwise |
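
The sketch below shows one way rows with these columns might be parsed into typed records. It assumes the data arrives as CSV with exactly the column names listed above, that Insurance is stored as 0/1, and that the target is stored as the strings Positive/Negative. The `PatientRecord` class, the `parse_rows` helper, and the sample rows are hypothetical and are not taken from the repository.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class PatientRecord:
    """One dataset row (hypothetical container, not code from the repo)."""
    patient_id: str
    prg: float
    pl: float
    pr: float
    sk: float
    ts: float
    m11: float
    bd2: float
    age: int
    insurance: int  # assumed to be encoded as 0/1
    sepsis: int     # 1 = Positive, 0 = Negative


def parse_rows(csv_text: str) -> list[PatientRecord]:
    """Parse CSV text with the documented column names into typed records."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append(
            PatientRecord(
                patient_id=row["ID"],
                prg=float(row["PRG"]),
                pl=float(row["PL"]),
                pr=float(row["PR"]),
                sk=float(row["SK"]),
                ts=float(row["TS"]),
                m11=float(row["M11"]),
                bd2=float(row["BD2"]),
                age=int(row["Age"]),
                insurance=int(row["Insurance"]),
                sepsis=1 if row["Sepsis"].strip().lower() == "positive" else 0,
            )
        )
    return records


# Made-up example rows, for illustration only.
SAMPLE = """ID,PRG,PL,PR,SK,TS,M11,BD2,Age,Insurance,Sepsis
P001,6,148,72,35,0,33.6,0.627,50,0,Positive
P002,1,85,66,29,0,26.6,0.351,31,0,Negative
"""

records = parse_rows(SAMPLE)
# Share of positive cases, a quick check of class balance.
positive_share = sum(r.sepsis for r in records) / len(records)
print(f"{len(records)} records, {positive_share:.0%} positive")
```

Mapping the target to 1/0 at load time keeps later steps, such as computing class balance or F1, free of string comparisons.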

Exploratory Data Analysis

During the exploratory data analysis (EDA) phase, a comprehensive investigation of the sepsis dataset was conducted to gain insights through various types of analyses.

  • Univariate analysis: A thorough examination of each variable individually was performed. Summary statistics such as mean, median, standard deviation, and quartiles were calculated to understand the central tendency and spread of the data.

Univariate

  • Bivariate analysis: Relationships between pairs of variables were explored to identify patterns and potential predictor variables for sepsis classification.

Bivariate

  • Multivariate analysis: Relationships among multiple variables were examined simultaneously, allowing for a deeper understanding of their interactions and impact on sepsis.

multivariate

In addition to these exploratory analyses, hypotheses were formulated based on prior knowledge and existing research. Statistical tests such as t-tests, chi-square tests, or ANOVA tests were utilized to test these hypotheses, depending on the nature of the variables. The results of these tests validated or refuted the formulated hypotheses and provided further insights into the relationships between variables.
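
As an illustration of the kind of test mentioned above, the following sketch computes Welch's t statistic for a numeric feature split by sepsis outcome, using only the standard library. The values are made up and the `welch_t` helper is hypothetical; it is not the analysis code from the project. In practice, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and also returns a p-value.

```python
import math
import statistics


def welch_t(sample_a: list[float], sample_b: list[float]) -> tuple[float, float]:
    """Return Welch's t statistic and the approximate degrees of freedom."""
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group.
    se_a, se_b = var_a / n_a, var_b / n_b
    t_stat = (mean_a - mean_b) / math.sqrt(se_a + se_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se_a + se_b) ** 2 / (se_a**2 / (n_a - 1) + se_b**2 / (n_b - 1))
    return t_stat, dof


# Made-up PRG values for illustration only; not taken from the dataset.
prg_positive = [8.0, 6.0, 7.0, 9.0, 5.0]
prg_negative = [2.0, 1.0, 3.0, 4.0, 2.0]

t_stat, dof = welch_t(prg_positive, prg_negative)
print(f"t = {t_stat:.2f}, dof = {dof:.1f}")
```

Welch's variant is used here because it does not assume equal variances in the two groups, which is a safer default when group sizes differ.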

Hypotheses:

hypothesis

  • Hypothesis 1: Higher plasma glucose levels (PRG) are associated with an increased risk of developing sepsis.

  • Hypothesis 2: Abnormal blood work results, such as high values of PL, SK, and BD2, are indicative of a higher likelihood of sepsis.

  • Hypothesis 3: Older patients are more likely to develop sepsis compared to younger patients.

  • Hypothesis 4: Patients with higher body mass index (BMI) values (M11) have a lower risk of sepsis.

  • Hypothesis 5: Patients without valid insurance cards are more likely to develop sepsis.

These hypotheses, along with the results of the EDA, contribute to a deeper understanding of the dataset and provide valuable insights for further analysis and model development.

Modeling

During the modeling phase, the evaluation of models took into consideration the imbalanced nature of the data. The main metrics used to assess model performance were the F1 score and AUC score, which provide a balanced assessment for imbalanced datasets.

The following models were evaluated:

  • Decision Tree

  • Logistic Regression

  • Naive Bayes

  • Stochastic Gradient Descent

  • Random Forest

  • XGBoost

These models were compared on their F1 and AUC scores to show how each performs on the imbalanced dataset. The results are shown below:

Model comparison

Evaluation

Given the imbalanced nature of the data, the models' performance was assessed using the F1 score, which considers both precision and recall, providing a balanced measure of accuracy. Additionally, the AUC score was considered to evaluate the models' ability to distinguish between positive and negative cases.
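
To make the two metrics concrete, here is a small worked sketch in plain Python. The labels and scores are made up, and the helper functions are illustrative rather than the project's evaluation code. AUC is computed from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one.

```python
def f1_score(y_true: list[int], y_pred: list[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_score(y_true: list[int], y_score: list[float]) -> float:
    """AUC as the share of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up data: 2 positives among 8 cases (imbalanced).
y_true = [0, 0, 0, 0, 0, 0, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.6, 0.2, 0.7, 0.35]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy={accuracy:.2f}  f1={f1_score(y_true, y_pred):.2f}  "
      f"auc={auc_score(y_true, y_score):.3f}")
```

In this toy case accuracy is 0.75 while F1 is only 0.50, because one of the two positives is missed and one negative is flagged. This gap is the reason accuracy alone can mislead on imbalanced data.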

results

Hyperparameter tuning was also implemented to optimize the performance of the models. By fine-tuning the hyperparameters, it was possible to identify the best combination of parameter values that yielded the highest performance for each model.
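
A minimal sketch of such a search, assuming scikit-learn is installed. It uses synthetic data in place of the sepsis features, and the parameter grid is an illustrative choice; the source does not state which search method or grids the project used.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

# Synthetic, mildly imbalanced data standing in for the 8 sepsis attributes.
X, y = make_classification(
    n_samples=400, n_features=8, weights=[0.65, 0.35], random_state=0
)

# Hypothetical grid: regularization strength and class weighting.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0], "class_weight": [None, "balanced"]}

search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid=param_grid,
    scoring="f1",  # optimize the metric used for model comparison
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated F1:", round(search.best_score_, 3))
```

Stratified folds keep the positive/negative ratio roughly constant across splits, which matters when the positive class is the minority.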

Deployment

Fastapi deployment

FastAPI

  1. Make sure you have FastAPI and an ASGI server such as Uvicorn installed. You can install both using pip:

pip install fastapi uvicorn

  2. Open a terminal or command prompt and navigate to the directory where your main.py file is located.

  3. Run the FastAPI application using the uvicorn command, specifying the module and application name:

uvicorn main:app --reload

  4. After running the command, you should see output indicating that the FastAPI application is running and listening on a specific address (e.g., http://localhost:8000). This address is the base URL of the API.

  5. Open a web browser or use an API testing tool (e.g., Postman) to interact with your deployed FastAPI application. Use the address shown in the terminal to make requests and receive responses.

API Documentation

The API documentation provides details about the available endpoints, request and response formats, and example usage. You can access the documentation by visiting the /docs endpoint after starting the server (http://localhost:8000/docs).
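
Besides the interactive /docs page, the API can be called from a script. The sketch below builds a JSON POST request with the standard library. The `/predict` path and the field names are assumptions; check the /docs page of the running server for the real endpoint and schema.

```python
import json
import urllib.request


def build_predict_request(base_url: str, features: dict) -> urllib.request.Request:
    """Build a JSON POST request for a hypothetical /predict endpoint."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predict",
        data=json.dumps(features).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up feature values, for illustration only.
features = {
    "PRG": 6, "PL": 148, "PR": 72, "SK": 35, "TS": 0,
    "M11": 33.6, "BD2": 0.627, "Age": 50, "Insurance": 0,
}

request = build_predict_request("http://localhost:8000", features)
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(json.load(response))
```

The network call is left commented out so the snippet runs without a live server.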

Containerized deployment

To run the Docker container based on the provided Dockerfile, follow these steps:

  1. Make sure you have Docker installed on your system.

  2. Make sure the Dockerfile is in the root directory of the project. If it is missing, create a file named Dockerfile (without any file extension) there and copy the provided Dockerfile content into it.

  3. Open a terminal or command prompt and navigate to the directory where the Dockerfile is located.

  4. Build the Docker image by running the following command:

docker build -t your-image-name .

Replace your-image-name with the desired name for your Docker image. The . at the end denotes the current directory as the build context.

  5. Once the image is built, run a Docker container based on that image using the following command:

docker run -d -p host-port:container-port your-image-name

Replace host-port with the port number on your host machine that you want to map to the container's port, and replace container-port with the port number specified in the Dockerfile's EXPOSE instruction (in this case, 8000).

For example, to map the container's port 8000 to port 8080 on your host machine, the command would be:

docker run -d -p 8080:8000 your-image-name

  6. After running the command, the Docker container starts, and the FastAPI application runs inside it.

  7. Access the application by visiting http://localhost:host-port in your web browser or by using an API testing tool.

For example, if you mapped the container's port 8000 to your host's port 8080, you would access the application at http://localhost:8080.

Streamlit deployment

Navigate to the cloned repository and run the command:

pip install -r requirements.txt

To run the demo app (being at the repository root), use the following command:

streamlit run streamlit_app.py
App Execution on Huggingface

Here's a step-by-step process on how to use the Streamlit App and API Access on Huggingface:

Screenshots: Streamlit input form and Streamlit results.

Future Work

For future work, incorporating clustering algorithms can be a valuable addition to sepsis identification and classification. Clustering algorithms can help in grouping similar patient data together based on patterns and similarities.

Contact

Alidu Abubakari

Data Analyst Azubi Africa

  • LinkedIn
