AI-Powered Cholecystectomy Surgery Video Analysis and Explanation System
Revolutionizing surgical education and analysis through advanced AI technology
EndoInsight AI combines cutting-edge computer vision and natural language processing to provide detailed, expert-level analysis of cholecystectomy (gallbladder removal) surgery videos. Our system offers valuable insights for medical students, surgeons, and healthcare professionals, enhancing understanding of surgical techniques and anatomical structures.
- Python 3.10
- FastAPI: Web framework for building APIs
- MongoDB: NoSQL DB to store the processed output
- OpenCV: For video processing and frame extraction
- YOLOv8 Segmentation: For object segmentation
- Amazon Bedrock Claude 3 Haiku: For textual explanation generation
- Google Cloud Storage (GCS): For storing processed videos
- Vertex AI: For YOLOv8 object segmentation model training
- CholecSeg8k Dataset: The endoscopic images dataset used for model training
- React Vite: For building the user interface
- Zustand: For State Management
- Axios: For making HTTP requests to the backend
- DaisyUI: For UI components and styling
- Tailwind CSS: For utility-first CSS
- DynaUI: For Text Animation
- React Player: For video embedding
- Description: Load endoscopic video files and extract key frames for further analysis.
- Description: Analyze each frame using YOLOv8 to perform object segmentation, identifying the classes of objects in the endoscopic images.
- Description: Extract relevant features from the YOLOv8 segmentation results, including object locations, sizes, and classifications.
- Description: Generate detailed textual explanations of the medical images based on the YOLOv8 segmentation results and extracted features.
- Description: Combine visual segmentation data and generated textual explanations to produce a contextually rich output that reflects both the visual content and the corresponding text.
- Description: Generate a final output that presents the original image, YOLOv8 segmentation results, and textual explanations in an integrated format.
- Description: Develop a simple web interface for uploading images, running analyses, and viewing the integrated image-text output. Users should be able to interact with both the image and the generated text.
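The pipeline steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's implementation: `sample_frame_indices`, `mask_features`, and `build_prompt` are hypothetical helpers, fixed-interval sampling is an assumed key-frame strategy, and the nested list stands in for a binary mask that a segmentation model such as YOLOv8 would produce.

```python
from typing import Dict, List, Optional


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float = 1.0) -> List[int]:
    """Pick one frame index every `every_seconds` (assumed key-frame strategy)."""
    if total_frames <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


def mask_features(mask: List[List[int]], label: str) -> Optional[Dict]:
    """Compute area, bounding box, and centroid of a binary mask (1 = object pixel)."""
    pixels = [(x, y) for y, row in enumerate(mask) for x, value in enumerate(row) if value]
    if not pixels:
        return None  # nothing segmented for this class
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    return {
        "label": label,
        "area": len(pixels),
        "bbox": (min(xs), min(ys), max(xs), max(ys)),  # x_min, y_min, x_max, y_max
        "centroid": (sum(xs) / len(xs), sum(ys) / len(ys)),
    }


def build_prompt(features: List[Dict]) -> str:
    """Turn per-object features into text that a language model could be asked to explain."""
    lines = [f"- {f['label']}: area={f['area']} px, bbox={f['bbox']}" for f in features]
    return "Detected objects in this frame:\n" + "\n".join(lines)


# Toy 3x4 mask; the label is illustrative only.
toy_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
toy_features = mask_features(toy_mask, "gallbladder")
print(sample_frame_indices(total_frames=100, fps=25.0))
print(build_prompt([toy_features]))
```

In a real run the masks would come from the segmentation model and the prompt would be sent to the text-generation model; both calls are left out here because their exact usage in this project is not shown.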
All types of contributions are welcome. You can contribute by reporting bugs, suggesting new features, translating the project, or submitting a pull request.
git clone https://github.com/shaunliew/endoinsight-ai.git
cd <path_to_cloned_repo>/endoinsight-ai/frontend
npm install
npm run dev
VITE v5.4.3 ready in 277 ms
➜ Local: http://localhost:5173/
➜ Network: use --host to expose
➜ press h + enter to show help
Open http://localhost:5173/ in your browser.
Get a service account JSON key from your GCP project to use GCP services. Store it in the repo directory and run the command below before running the backend Python code.
export GOOGLE_APPLICATION_CREDENTIALS="credential.json"
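A quick way to catch a missing or misplaced key file before starting the backend is a small check like the one below. It is a sketch only: `resolve_credentials` is a hypothetical helper, not part of this repo, and it relies solely on the `GOOGLE_APPLICATION_CREDENTIALS` variable set above.

```python
import json
import os
from pathlib import Path
from typing import Mapping


def resolve_credentials(env: Mapping[str, str] = os.environ) -> Path:
    """Return the service-account key path, or raise RuntimeError with a readable message."""
    value = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not value:
        raise RuntimeError("GOOGLE_APPLICATION_CREDENTIALS is not set")
    path = Path(value).expanduser()
    if not path.is_file():
        raise RuntimeError(f"Credentials file not found: {path}")
    # The key file should be a JSON object; fail early if it cannot be parsed.
    try:
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
    except ValueError as exc:
        raise RuntimeError(f"Credentials file is not valid JSON: {path}") from exc
    if not isinstance(data, dict):
        raise RuntimeError(f"Credentials file is not a JSON object: {path}")
    return path


if __name__ == "__main__":
    try:
        print(f"Using credentials at {resolve_credentials()}")
    except RuntimeError as error:
        print(f"Credential check failed: {error}")
```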
cd <path_to_cloned_repo>/endoinsight-ai
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python main.py
INFO: Started server process [61941]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
Access the API at http://localhost:8000 (the server listens on 0.0.0.0:8000).
MY_BUCKET=<own_bucket_name>
cd ~/
gcsfuse --implicit-dirs --rename-dir-limit=100 --max-conns-per-host=100 $MY_BUCKET "/home/jupyter/<cloned_repo_name>/gcs"
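If you prefer to run the mount from a Python script, the same command can be assembled as an argument list. This is a sketch: `gcsfuse_command` is a hypothetical helper, the bucket name and mount path are placeholders, and the flags are copied unchanged from the command above.

```python
import shlex
from typing import List


def gcsfuse_command(bucket: str, mount_dir: str) -> List[str]:
    """Build the gcsfuse invocation with the same flags as the shell command above."""
    if not bucket:
        raise ValueError("bucket name must not be empty")
    return [
        "gcsfuse",
        "--implicit-dirs",
        "--rename-dir-limit=100",
        "--max-conns-per-host=100",
        bucket,
        mount_dir,
    ]


# Placeholder values; replace them with your own bucket and repo path.
command = gcsfuse_command("my-bucket", "/home/jupyter/my-repo/gcs")
print(shlex.join(command))
# To actually mount: subprocess.run(command, check=True)
```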
- Method: POST
- Route: /api/process_video/
- Params:
  - Video File: File
- Query Structure example:
{
"url": "http://0.0.0.0:8000/api/process_video/",
"method": "POST",
"headers": {
"accept": "application/json",
"Content-Type": "multipart/form-data"
},
"body": {
"file": {
"type": "file",
"content": "@<uploaded_video>.mp4",
"media_type": "video/mp4"
}
}
}
- Curl
curl -X 'POST' \
'http://0.0.0.0:8000/api/process_video/' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@<uploaded_video>.mp4;type=video/mp4'
- Return:
  - JSON Object
{
"success": true,
"message": "Video <uuid>.mp4 processed successfully.",
"content": {
"_id": "xxx",
"analysis_result": {
"procedure_overview": "xxx",
"observations": [
"xxx",
"xxx",
"xxx",
"xxx"
],
"identification": {
"structures": [
"xxx",
"xxx",
"xxx",
"xxx"
],
"instruments": [
"xxx",
"xxx",
"xxx"
],
"uncertainties": [
"xxx"
]
},
"procedural_steps": [
"1. xxx",
"2. xxx",
"3. xxx",
"4. xxx",
"5. xxx",
"6. xxx"
],
"surgical_technique": [
"xxx",
"xxx",
"xxx"
],
"critical_moments": [
"xxx",
"xxx"
],
"clinical_significance": [
"xxx",
"xxx"
],
"educational_summary": [
"xxx",
"xxx",
"xxx"
]
},
"output_video_url": "xxx.mp4"
}
}
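The same upload can be made from Python with only the standard library. This is a sketch, not a client shipped with the project: `encode_multipart` and `process_video` are hypothetical names, the `file` field and `video/mp4` media type mirror the curl call above, and the `http://localhost:8000` base URL assumes the backend runs locally.

```python
import json
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def encode_multipart(field: str, filename: str, payload: bytes, media_type: str) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {media_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def process_video(path: str, base_url: str = "http://localhost:8000") -> dict:
    """POST a video to /api/process_video/ and return the decoded JSON response."""
    video = Path(path)
    body, content_type = encode_multipart("file", video.name, video.read_bytes(), "video/mp4")
    request = urllib.request.Request(
        f"{base_url}/api/process_video/",
        data=body,
        method="POST",
        headers={"accept": "application/json", "Content-Type": content_type},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With the backend running, `process_video("example.mp4")["content"]["_id"]` would give the ID to pass to the retrieval endpoint, assuming the response has the shape shown above.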
- Method: GET
- Route: /api/processed_video/{object_id}
- Params:
  - Object ID: MongoDB _id for the processed result
- Query Structure example:
{
"method": "GET",
"url": "http://0.0.0.0:8000/api/processed_video/<object_id>",
"headers": {
"accept": "application/json"
}
}
- Curl
curl -X 'GET' \
'http://0.0.0.0:8000/api/processed_video/<object_id>' \
-H 'accept: application/json'
- Return:
  - JSON Object
{
"success": true,
"message": "Video data <uuid> retrieved successfully.",
"content": {
"_id": "xxx",
"analysis_result": {
"procedure_overview": "xxx",
"observations": [
"xxx",
"xxx",
"xxx",
"xxx"
],
"identification": {
"structures": [
"xxx",
"xxx",
"xxx",
"xxx"
],
"instruments": [
"xxx",
"xxx",
"xxx"
],
"uncertainties": [
"xxx"
]
},
"procedural_steps": [
"1. xxx",
"2. xxx",
"3. xxx",
"4. xxx",
"5. xxx",
"6. xxx"
],
"surgical_technique": [
"xxx",
"xxx",
"xxx"
],
"critical_moments": [
"xxx",
"xxx"
],
"clinical_significance": [
"xxx",
"xxx"
],
"educational_summary": [
"xxx",
"xxx",
"xxx"
]
},
"output_video_url": "xxx.mp4"
}
}
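A stored result can be fetched and condensed in a few lines of Python. This is a sketch under assumptions: `processed_video_url`, `fetch_processed_video`, and `summarize` are hypothetical helpers, the base URL assumes a local backend, and the field names follow the response example above.

```python
import json
import urllib.request
from urllib.parse import quote


def processed_video_url(object_id: str, base_url: str = "http://localhost:8000") -> str:
    """Build the retrieval URL for a MongoDB _id, escaping it for use in a path."""
    return f"{base_url}/api/processed_video/{quote(object_id, safe='')}"


def fetch_processed_video(object_id: str, base_url: str = "http://localhost:8000") -> dict:
    """GET the processed result and return the decoded JSON response."""
    request = urllib.request.Request(
        processed_video_url(object_id, base_url),
        headers={"accept": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def summarize(result: dict) -> dict:
    """Pull the most commonly needed fields out of a response, tolerating missing keys."""
    content = result.get("content") or {}
    analysis = content.get("analysis_result") or {}
    identification = analysis.get("identification") or {}
    return {
        "id": content.get("_id"),
        "video_url": content.get("output_video_url"),
        "overview": analysis.get("procedure_overview"),
        "structures": identification.get("structures", []),
        "instruments": identification.get("instruments", []),
        "steps": analysis.get("procedural_steps", []),
    }
```

For example, `summarize(fetch_processed_video("<object_id>"))` returns a flat dictionary that is easier to render in a UI than the nested response.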