The "Sepsis Classification with FastAPI" project builds a classification system for sepsis cases and serves it through the FastAPI framework. Sepsis is a life-threatening condition that requires immediate medical attention, so timely identification matters. The project aims to give healthcare professionals a fast, streamlined way to classify sepsis cases, facilitating prompt treatment and better patient outcomes.
The objectives of the project are as follows:
- Train a machine learning model on a diverse dataset of sepsis cases to accurately predict the likelihood of sepsis in patients.
- Utilize the FastAPI framework to create a user-friendly and efficient web interface for healthcare professionals to interact with the sepsis classification model.
- Improve diagnostic capabilities by achieving high accuracy, sensitivity, and specificity in sepsis classification.
- Provide a comprehensive and scalable solution that can be easily deployed in real-time healthcare environments.
Key challenges in this project include acquiring and preprocessing a reliable sepsis dataset, selecting an appropriate machine learning algorithm, optimizing the model's performance, and deploying the system in a secure and efficient manner.
Code | Name | Published Article | Deployed App | Streamlit App |
---|---|---|---|---|
LP6 | Sepsis Prediction App with FastAPI and Streamlit | Medium Article | FastAPI App | Streamlit App |
To set up the project environment, follow these steps:
- Clone the repository and move into it:
  git clone https://github.com/aliduabubakari/Sepsis-Classification-with-FastAPI.git
  cd Sepsis-Classification-with-FastAPI
- Create and activate a virtual environment:
  - Windows:
    python -m venv venv
    venv\Scripts\activate
  - Linux & macOS:
    python3 -m venv venv
    source venv/bin/activate
- Install the required dependencies inside the activated environment:
  pip install -r requirements.txt
Run the commands in your terminal in the order shown, one per line.
The data used in this project consists of a diverse collection of sepsis cases obtained from Sepsis.
Column Name | Data Features | Description |
---|---|---|
ID | N/A | Unique number to represent patient ID |
PRG | Attribute 1 | Plasma glucose |
PL | Attribute 2 | Blood Work Result-1 (mu U/ml) |
PR | Attribute 3 | Blood Pressure (mm Hg) |
SK | Attribute 4 | Blood Work Result-2 (mm) |
TS | Attribute 5 | Blood Work Result-3 (mu U/ml) |
M11 | Attribute 6 | Body mass index (weight in kg/(height in m)^2) |
BD2 | Attribute 7 | Blood Work Result-4 (mu U/ml) |
Age | Attribute 8 | Patient's age (years) |
Insurance | N/A | If a patient holds a valid insurance card |
Sepsis | Target | Positive: if a patient in ICU will develop sepsis, Negative: otherwise |
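The data dictionary above maps naturally onto a typed record. The sketch below shows one way to parse a row in Python. The names `PatientRecord` and `from_row`, the sample values, and the choice to store every feature as a float are assumptions made for illustration and are not taken from the project code.

```python
from dataclasses import dataclass

# Feature columns, named as in the data dictionary above.
FEATURE_COLUMNS = ["PRG", "PL", "PR", "SK", "TS", "M11", "BD2", "Age", "Insurance"]


@dataclass
class PatientRecord:
    """One row of the dataset (hypothetical helper, not from the project code)."""
    patient_id: str
    features: dict  # column name -> float
    sepsis: bool    # True when the Sepsis column reads "Positive"


def from_row(row: dict) -> PatientRecord:
    """Convert a csv.DictReader-style row of strings into a typed record."""
    missing = [c for c in ["ID", "Sepsis", *FEATURE_COLUMNS] if c not in row]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    label = row["Sepsis"].strip()
    if label not in ("Positive", "Negative"):
        raise ValueError(f"unexpected Sepsis label: {label!r}")
    return PatientRecord(
        patient_id=row["ID"],
        features={c: float(row[c]) for c in FEATURE_COLUMNS},
        sepsis=(label == "Positive"),
    )


# Made-up values, for illustration only.
example = from_row({
    "ID": "demo-001", "PRG": "6", "PL": "148", "PR": "72", "SK": "35", "TS": "0",
    "M11": "33.6", "BD2": "0.627", "Age": "50", "Insurance": "0", "Sepsis": "Positive",
})
print(example.sepsis, example.features["M11"])
```

Rejecting rows with missing columns or unknown labels up front keeps malformed data from reaching the model silently.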
During the exploratory data analysis (EDA) phase, a comprehensive investigation of the sepsis dataset was conducted to gain insights through various types of analyses.
- Univariate analysis: A thorough examination of each variable individually was performed. Summary statistics such as mean, median, standard deviation, and quartiles were calculated to understand the central tendency and spread of the data.
- Bivariate analysis: Relationships between pairs of variables were explored to identify patterns and potential predictor variables for sepsis classification.
- Multivariate analysis: Relationships among multiple variables were examined simultaneously, allowing for a deeper understanding of their interactions and impact on sepsis.
In addition to these exploratory analyses, hypotheses were formulated based on prior knowledge and existing research. Statistical tests such as t-tests, chi-square tests, or ANOVA tests were utilized to test these hypotheses, depending on the nature of the variables. The results of these tests validated or refuted the formulated hypotheses and provided further insights into the relationships between variables.
- Hypothesis 1: Higher plasma glucose levels (PRG) are associated with an increased risk of developing sepsis.
- Hypothesis 2: Abnormal blood work results, such as high values of PL, SK, and BD2, are indicative of a higher likelihood of sepsis.
- Hypothesis 3: Older patients are more likely to develop sepsis compared to younger patients.
- Hypothesis 4: Patients with higher body mass index (BMI) values (M11) have a lower risk of sepsis.
- Hypothesis 5: Patients without valid insurance cards are more likely to develop sepsis.
These hypotheses, along with the results of the EDA, contribute to a deeper understanding of the dataset and provide valuable insights for further analysis and model development.
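As an illustration of how a hypothesis such as Hypothesis 1 could be checked, the sketch below computes Welch's t statistic for two groups using only the standard library. The function name `welch_t` and the sample values are hypothetical, and the project may have used a different test or library.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Made-up PRG values, for illustration only (not taken from the dataset).
prg_positive = [8, 7, 9, 6, 10]
prg_negative = [2, 3, 1, 4, 2]

t_stat = welch_t(prg_positive, prg_negative)
# A large positive t would be consistent with Hypothesis 1.
print(f"t = {t_stat:.2f}")
```

The statistic alone does not give a p-value. In practice, `scipy.stats.ttest_ind(a, b, equal_var=False)` returns the same statistic together with its p-value.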
During the modeling phase, the evaluation of models took into consideration the imbalanced nature of the data. The main metrics used to assess model performance were the F1 score and AUC score, which provide a balanced assessment for imbalanced datasets.
The following models were evaluated:
- Decision Tree
- Logistic Regression
- Naive Bayes
- Stochastic Gradient Descent
- Random Forest
- XGBoost
Each model was scored on F1 and AUC. Because the classes are imbalanced, plain accuracy can be misleading: a model that always predicts the majority class scores well on accuracy while missing every sepsis case. The F1 score is the harmonic mean of precision and recall, so it is high only when the positive class is handled well. The AUC score measures how reliably the model ranks positive cases above negative ones across all decision thresholds.
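To make the two metrics concrete, here is a minimal from-scratch sketch of both. The function names and toy labels are illustrative only. The project itself most likely relied on a library implementation, which the source does not specify.

```python
def f1_score(y_true, y_pred):
    """F1: harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc_score(y_true, scores):
    """AUC: fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy imbalanced labels: 9 negatives, 1 positive, and a model that always says "negative".
y_true = [0] * 9 + [1]
always_negative = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, always_negative)) / len(y_true)
print(accuracy, f1_score(y_true, always_negative))
```

The always-negative model reaches high accuracy on this toy data yet has an F1 of zero, which is the reason accuracy was not used as the main metric.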
Hyperparameter tuning was also implemented to optimize the performance of the models. By fine-tuning the hyperparameters, it was possible to identify the best combination of parameter values that yielded the highest performance for each model.
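The source does not say which tuning method was used. As one common option, the sketch below shows an exhaustive grid search in plain Python. The names `grid_search` and `toy_validation_f1`, the parameter names, and the scoring formula are all hypothetical stand-ins for training a model and measuring its validation score.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Try every combination in param_grid and return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Stand-in for "train a model with these settings and return its validation F1".
# The formula is arbitrary and only makes the example runnable.
def toy_validation_f1(max_depth, n_estimators):
    return 1.0 - abs(max_depth - 5) * 0.05 - abs(n_estimators - 100) * 0.001


grid = {"max_depth": [3, 5, 7], "n_estimators": [50, 100, 200]}
best_params, best_score = grid_search(toy_validation_f1, grid)
print(best_params, best_score)
```

With scikit-learn, `GridSearchCV` performs the same search and adds cross-validation, which is usually preferable to a single validation split.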
- Make sure FastAPI and an ASGI server such as Uvicorn are installed. Both can be installed using pip:
  pip install fastapi uvicorn
- Open a terminal or command prompt and navigate to the directory where your main.py file is located.
- Run the FastAPI application using the uvicorn command, specifying the module and application name:
  uvicorn main:app --reload
- After running the command, you should see output indicating that the application is running and listening on a specific address (by default http://127.0.0.1:8000, also reachable as http://localhost:8000). This address is the base URL of the API.
- Open a web browser or use an API testing tool (e.g., Postman) to interact with the running application. Send requests to the address shown in the terminal and inspect the responses.
The API documentation provides details about the available endpoints, request and response formats, and example usage. You can access the documentation by visiting the /docs endpoint after starting the server (http://localhost:8000/docs).
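Besides a browser or Postman, the API can be called from a script. The sketch below builds a JSON POST request with the standard library. The `/predict` route and the payload field names (assumed here to match the dataset columns) are hypothetical, so check the /docs page for the actual route and schema before using it.

```python
import json
from urllib import request

BASE_URL = "http://localhost:8000"  # address printed by uvicorn
PREDICT_PATH = "/predict"           # hypothetical route; check /docs for the real one


def build_predict_request(features: dict) -> request.Request:
    """Build a JSON POST request carrying one patient's feature values."""
    body = json.dumps(features).encode("utf-8")
    return request.Request(
        BASE_URL + PREDICT_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(req: request.Request) -> dict:
    """Send the request and decode the JSON reply (requires a running server)."""
    with request.urlopen(req) as resp:
        return json.load(resp)


# Made-up feature values, for illustration only.
req = build_predict_request({
    "PRG": 6, "PL": 148, "PR": 72, "SK": 35, "TS": 0,
    "M11": 33.6, "BD2": 0.627, "Age": 50, "Insurance": 0,
})
print(req.get_method(), req.full_url)
```

With the server running, `send(req)` returns whatever JSON the endpoint produces.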
To run the Docker container based on the provided Dockerfile, follow these steps:
- Make sure you have Docker installed on your system.
- Check that a file named Dockerfile (without any file extension) exists in the root directory of your project. If it does not, create it there and copy the project's Dockerfile content into it.
- Open a terminal or command prompt and navigate to the directory where the Dockerfile is located.
- Build the Docker image by running the following command:
  docker build -t your-image-name .
  Replace your-image-name with the desired name for your Docker image. The . at the end denotes the current directory as the build context.
- Once the image is built, run a Docker container based on that image:
  docker run -d -p host-port:container-port your-image-name
  Replace host-port with the port number on your host machine that you want to map to the container's port, and replace container-port with the port the application listens on inside the container (8000 here, matching the Dockerfile's EXPOSE instruction). For example, to map the container's port 8000 to port 8080 on your host machine:
  docker run -d -p 8080:8000 your-image-name
- After running the command, the Docker container starts with your FastAPI application running inside it.
- Access the application at http://localhost:host-port in a web browser or an API testing tool. With the mapping above, that is http://localhost:8080.
Navigate to the cloned repository and run the command:
pip install -r requirements.txt
To run the demo app (being at the repository root), use the following command:
streamlit run streamlit_app.py
Here's a step-by-step process on how to use the Streamlit App and API Access on Huggingface:
For future work, incorporating clustering algorithms can be a valuable addition to sepsis identification and classification. Clustering algorithms can help in grouping similar patient data together based on patterns and similarities.
Alidu Abubakari
Data Analyst
Azubi Africa