A Python script for automatic mask generation from video frames, leveraging Meta AI's Segment Anything model (SAM). This notebook enables:

- 🎥 Frame extraction from videos.
- 🤖 Loading the Segment Anything model and its pre-trained weights.
- 🖼️ Automatic mask generation for each video frame.
- 🗄️ Export of masks in Run-Length Encoding (RLE) format.
- 📊 Visualization of segmented masks directly in the notebook.
Python libraries:

- segment-anything
- opencv-python
- torch
- matplotlib
- supervision
- jupyter_bbox_widget
- dataclasses-json
Hardware: 🖥️ A GPU is recommended for optimal performance.

Model weights: ⬇️ Download the SAM checkpoint (see the installation steps below).
Clone the repository:

```bash
git clone https://github.com/your-repo/maskGenerator.git
cd maskGenerator
```
Install the required dependencies:

```bash
pip install -r requirements.txt
```
Download the model weights:

```bash
mkdir weights
wget https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth -P weights/
```
Prepare your video: 📹 Place your video file in the `data/videoTest` directory.
Run the notebook:

- 📔 Open the `maskGenerator.ipynb` file.
- ✨ Execute the cells in sequence.
Results:

- The generated masks are saved in RLE format in the `encodingRLE.txt` file.
- Segmented frames can be visualized directly within the notebook.
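For reference, a run-length encoding like the one written to `encodingRLE.txt` can be sketched in pure NumPy. This is an illustrative COCO-style encoder (column-major order, counts starting with the number of zeros), not necessarily the exact format the notebook emits:

```python
import numpy as np

def mask_to_rle(mask: np.ndarray) -> dict:
    """Encode a binary mask as COCO-style uncompressed RLE."""
    # Flatten in column-major (Fortran) order, as COCO RLE does
    pixels = mask.flatten(order="F").astype(np.uint8)
    counts = []
    last, run = 0, 0  # RLE starts with the count of zeros (possibly 0)
    for p in pixels:
        if p == last:
            run += 1
        else:
            counts.append(run)
            last, run = p, 1
    counts.append(run)
    return {"size": list(mask.shape), "counts": counts}
```

Each mask produced by SAM (`sam_result[i]["segmentation"]`) is a boolean array that can be passed to such an encoder before being written to the text file.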
You can choose a specific video frame (based on a timestamp) and run the segmentation. For example:

```python
import cv2

# Open the video placed in data/videoTest (replace with your file name)
cap = cv2.VideoCapture("data/videoTest/your_video.mp4")
fps = cap.get(cv2.CAP_PROP_FPS)

# Seek to the frame at 0:15.25
minutes = 0
seconds = 15.25
frame_id = int(fps * (minutes * 60 + seconds))
cap.set(cv2.CAP_PROP_POS_FRAMES, frame_id)
ret, frame = cap.read()

# mask_generator is the SamAutomaticMaskGenerator built earlier in the notebook
sam_result = mask_generator.generate(frame)
```
The notebook includes a function to display the mask contours and their IDs directly on the frame:

```python
show_masks_with_ids(frame, sam_result)
```
Contributions and suggestions are welcome! Feel free to open an Issue or submit a Pull Request.
This project is distributed under the MIT License. See the LICENSE file for more details.