AI-Powered Cholecystectomy Surgery Video Analysis and Explanation System
Revolutionizing surgical education and analysis through advanced AI technology
EndoInsight AI combines cutting-edge computer vision and natural language processing to provide detailed, expert-level analysis of cholecystectomy (gallbladder removal) surgery videos. Our system offers valuable insights for medical students, surgeons, and healthcare professionals, enhancing understanding of surgical techniques and anatomical structures.
Python 3.10
FastAPI: Web framework for building APIs
MongoDB: NoSQL database for storing the processed output
OpenCV: For video processing and frame extraction
YOLOv8 Segmentation: For object segmentation
Amazon Bedrock Claude 3 Haiku: For textual explanation generation
Google Cloud Storage (GCS): For storing processed videos
Vertex AI: For YOLOv8 object segmentation model training
CholecSeg8k Dataset: The endoscopic images dataset used for model training
React Vite: For building the user interface
Zustand: For state management
Axios: For making HTTP requests to the backend
DaisyUI: For UI components and styling
Tailwind CSS: For utility-first CSS
DynaUI: For text animation
React Player: For video embedding
- Description: Load endoscopic video files and extract key frames for further analysis.
- Description: Analyze each frame using YOLOv8 to perform object segmentation, identifying the classes of objects in the endoscopic images.
- Description: Extract relevant features from the YOLOv8 segmentation results, including object locations, sizes, and classifications.
- Description: Generate detailed textual explanations of the medical images based on the YOLOv8 segmentation results and extracted features.
- Description: Combine visual segmentation data and generated textual explanations to produce a contextually rich output that reflects both the visual content and the corresponding text.
- Description: Generate a final output that presents the original image, YOLOv8 segmentation results, and textual explanations in an integrated format.
- Description: Develop a simple web interface for uploading images, running analyses, and viewing the integrated image-text output. Users should be able to interact with both the image and the generated text.
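The frame-selection and feature-extraction steps described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's actual implementation: the frame count and frame rate would come from the video reader (for example OpenCV) and the masks from the segmentation model, but neither library is called here. Masks are represented as nested lists of 0/1, and the names `key_frame_indices`, `MaskFeatures`, `mask_features`, and `describe` are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


def key_frame_indices(total_frames: int, fps: float, every_seconds: float = 1.0) -> list[int]:
    """Uniformly sample one frame index per `every_seconds` of video."""
    if total_frames <= 0 or fps <= 0 or every_seconds <= 0:
        return []
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


@dataclass
class MaskFeatures:
    label: str                        # class name reported by the segmentation model
    area_px: int                      # number of foreground pixels
    bbox: tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive
    centroid: tuple[float, float]     # (x, y) mean of foreground pixel coordinates


def mask_features(label: str, mask: list[list[int]]) -> Optional[MaskFeatures]:
    """Summarise a binary mask as area, bounding box, and centroid."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # empty mask: nothing was segmented for this label
    return MaskFeatures(
        label=label,
        area_px=len(xs),
        bbox=(min(xs), min(ys), max(xs), max(ys)),
        centroid=(sum(xs) / len(xs), sum(ys) / len(ys)),
    )


def describe(features: list[MaskFeatures]) -> str:
    """Render features as text lines that could be placed in a language-model prompt."""
    lines = []
    for f in features:
        cx, cy = f.centroid
        lines.append(
            f"{f.label}: area={f.area_px}px, bbox={f.bbox}, centroid=({cx:.1f}, {cy:.1f})"
        )
    return "\n".join(lines)
```

Keeping the feature summary as plain text makes it easy to pass to the explanation step alongside the frame it was computed from.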
All types of contributions are welcome. You may contribute by reporting bugs, suggesting new features, translating the project, or submitting a pull request.
git clone
cd <path_to_cloned_repo>/endoinsight-ai/frontend
npm install
npm run dev
VITE v5.4.3 ready in 277 ms
➜ Local: http://localhost:5173/
➜ Network: use --host to expose
➜ press h + enter to show help
Open http://localhost:5173/ in a browser.
Obtain a service account JSON key from your GCP project in order to use GCP services. Store it in the repository directory and run the command below before running the backend Python code.
cd <path_to_cloned_repo>/endoinsight-ai
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
INFO: Started server process [61941]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on (Press CTRL+C to quit)
access the
cd ~/
gcsfuse --implicit-dirs --rename-dir-limit=100 --max-conns-per-host=100 $MY_BUCKET "/home/jupyter/<cloned_repo_name>/gcs"
Method: POST
- Video File: File
Query Structure example:
{
  "url": "",
  "method": "POST",
  "headers": {
    "accept": "application/json",
    "Content-Type": "multipart/form-data"
  },
  "body": {
    "file": {
      "type": "file",
      "content": "@<uploaded_video>.mp4",
      "media_type": "video/mp4"
    }
  }
}
- Curl
curl -X 'POST' \
'' \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@<uploaded_video>.mp4;type=video/mp4'
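The same upload can also be prepared from Python. The sketch below uses only the standard library and mirrors the curl call: a single `file` form field with media type `video/mp4`. The helper names `build_multipart` and `build_upload_request` are hypothetical, and the endpoint URL is passed in by the caller because it is not shown in this document.

```python
import urllib.request
import uuid


def build_multipart(field: str, filename: str, payload: bytes, media_type: str) -> tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, Content-Type value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {media_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(url: str, filename: str, payload: bytes) -> urllib.request.Request:
    """Prepare (but do not send) the POST request for a video upload."""
    body, content_type = build_multipart("file", filename, payload, "video/mp4")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"accept": "application/json", "Content-Type": content_type},
    )


# Sending the request (assumes the backend is running and `upload_url` is its upload endpoint):
# with open("<uploaded_video>.mp4", "rb") as fh:
#     request = build_upload_request(upload_url, "<uploaded_video>.mp4", fh.read())
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the encoding logic easy to check without a running server.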
- JSON Object
"success": true,
"message": "Video <uuid>.mp4 processed successfully.",
"content": {
"_id": "xxx",
"analysis_result": {
"procedure_overview": "xxx",
"observations": [
"identification": {
"structures": [
"instruments": [
"uncertainties": [
"procedural_steps": [
"1. xxx",
"2. xxx",
"3. xxx",
"4. xxx",
"5. xxx",
"6. xxx"
"surgical_technique": [
"critical_moments": [
"clinical_significance": [
"educational_summary": [
"output_video_url": "xxx.mp4"
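A client might read this response as follows. Because the exact nesting of the response is not fully shown here, this sketch searches for keys recursively instead of assuming a fixed path; `find_key` and `summarise` are hypothetical helper names, and only field names that appear in the example response are used.

```python
import json
from typing import Any, Optional


def find_key(node: Any, key: str) -> Optional[Any]:
    """Depth-first search for the first occurrence of `key` in nested dicts and lists."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = list(node.values())
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def summarise(response_text: str) -> dict:
    """Extract the record id, procedural steps, and output video URL from a response body."""
    data = json.loads(response_text)
    if not data.get("success"):
        # Surface the server's message when the request did not succeed.
        raise ValueError(data.get("message", "request failed"))
    content = data.get("content", {})
    return {
        "id": content.get("_id"),
        "steps": find_key(content, "procedural_steps") or [],
        "video_url": find_key(content, "output_video_url"),
    }
```

The returned `id` is the value that the retrieval endpoint expects as its object ID.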
Method: GET
- Object ID: MongoDB _id for the processed result
Query Structure example:
{
  "method": "GET",
  "url": "<object_id>",
  "headers": {
    "accept": "application/json"
  }
}
- Curl
curl -X 'GET' \
'<object_id>' \
-H 'accept: application/json'
- JSON Object
"success": true,
"message": "Video data <uuid> retrieved successfully.",
"content": {
"_id": "xxx",
"analysis_result": {
"procedure_overview": "xxx",
"observations": [
"identification": {
"structures": [
"instruments": [
"uncertainties": [
"procedural_steps": [
"1. xxx",
"2. xxx",
"3. xxx",
"4. xxx",
"5. xxx",
"6. xxx"
"surgical_technique": [
"critical_moments": [
"clinical_significance": [
"educational_summary": [
"output_video_url": "xxx.mp4"