MiVOS (CVPR 2021) - Scribble To Mask

Ho Kei Cheng, Yu-Wing Tai, Chi-Keung Tang

A simple network that turns scribbles into masks. It supports multi-object segmentation using soft aggregation. Don't expect state-of-the-art results from this model!
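As a rough illustration of soft aggregation, the sketch below merges independent per-object probabilities at a single pixel into one distribution over background and objects. It follows the common formulation popularized by STM (background as the product of the complements, then a softmax over logits). The function name `soft_aggregate` and the `eps` clamp are assumptions for this example, and the code in this repo may differ in detail (it would operate on whole tensors rather than one pixel at a time).

```python
import math


def soft_aggregate(object_probs, eps=1e-7):
    """Merge per-object probabilities for ONE pixel into a distribution
    over [background, object_1, ..., object_K] (illustrative sketch)."""
    # Background probability: the pixel belongs to none of the objects.
    background = math.prod(1.0 - p for p in object_probs)
    probs = [background] + list(object_probs)
    # Clamp so the odds below stay finite.
    probs = [min(max(p, eps), 1.0 - eps) for p in probs]
    # Odds p / (1 - p) equal exp(logit(p)); normalizing them is a softmax over logits.
    odds = [p / (1.0 - p) for p in probs]
    total = sum(odds)
    return [o / total for o in odds]


# Two objects competing for the same pixel; the first is far more confident.
merged = soft_aggregate([0.9, 0.2])
print(merged)
```

The result always sums to one, so overlapping object masks are resolved into a single label distribution per pixel.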

Overall structure and capabilities

Capability	MiVOS	Mask-Propagation	Scribble-to-Mask
DAVIS/YouTube semi-supervised evaluation	❌	✔️	❌
DAVIS interactive evaluation	✔️	❌	❌
User interaction GUI tool	✔️	❌	❌
Dense Correspondences	❌	✔️	❌
Train propagation module	❌	✔️	❌
Train S2M (interaction) module	❌	❌	✔️
Train fusion module	✔️	❌	❌
Generate more synthetic data	✔️	❌	❌

Requirements

The package versions shown here are the ones that I used. You might not need the exact versions.

PyTorch 1.6.0
torchvision 0.7.0
opencv-contrib 4.2.0
davis-interactive (https://github.com/albertomontesg/davis-interactive)
gitpython for training
gdown for downloading pretrained models

Refer to the official PyTorch guide for installing PyTorch/torchvision. The rest can be installed by:

pip install opencv-contrib-python gitpython gdown
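To see how an environment compares with the versions listed above, a small check like the following may help. It is only a sketch: the `LISTED` table and the `installed_versions` helper are names made up for this example, and it assumes Python 3.8+ for `importlib.metadata`. davis-interactive is left out because no version is stated for it.

```python
from importlib import metadata

# Versions stated in the requirements list; None means no version was given.
LISTED = {
    "torch": "1.6.0",
    "torchvision": "0.7.0",
    "opencv-contrib-python": "4.2.0",
    "gitpython": None,
    "gdown": None,
}


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(LISTED)
for name, version in report.items():
    listed = LISTED[name] or "any"
    print(f"{name}: installed={version or 'missing'}, listed={listed}")
```

A mismatch is not necessarily a problem, since the exact versions might not be required.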

Pretrained model

Download the model and put it in ./saves/. Alternatively, use the provided download_model.py.
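For a manual download, a quick check that a checkpoint actually landed in ./saves/ might look like the sketch below. The helper name `find_checkpoints` and the `*.pth` pattern are assumptions for this example; the real checkpoint filename is whatever the download provides.

```python
from pathlib import Path


def find_checkpoints(save_dir="saves", pattern="*.pth"):
    """Create save_dir if needed and return the checkpoint files inside it."""
    directory = Path(save_dir)
    directory.mkdir(parents=True, exist_ok=True)
    return sorted(directory.glob(pattern))


if __name__ == "__main__":
    checkpoints = find_checkpoints()
    if checkpoints:
        for path in checkpoints:
            print("found", path)
    else:
        print("no checkpoint in ./saves/ yet")
```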

[OneDrive Mirror]

Interactive GUI

python interactive.py --image <image>

Controls:

Mouse left - Draw scribbles
Mouse middle - Switch positive/negative
Key f - Commit changes, clear scribbles
Key r - Clear everything
Key d - Switch between overlay/mask view
Key s - Save masks into a temporary output folder (./output/)
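The controls above amount to a small state machine. The sketch below models only that bookkeeping, with no GUI and no network. The class `InteractionState` and all of its fields are hypothetical and not taken from interactive.py. In the real tool, committing would also involve the predicted mask, and saving would write files to ./output/.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionState:
    positive: bool = True        # current scribble polarity
    show_overlay: bool = True    # overlay view vs. mask view
    scribbles: list = field(default_factory=list)   # pending strokes
    committed: list = field(default_factory=list)   # strokes already committed
    save_requests: int = 0       # stand-in for writing masks to disk

    def draw(self, point):
        """Mouse left: record a scribble point with the current polarity."""
        self.scribbles.append((point, self.positive))

    def toggle_polarity(self):
        """Mouse middle: switch between positive and negative scribbles."""
        self.positive = not self.positive

    def handle_key(self, key):
        if key == "f":    # commit changes, clear scribbles
            self.committed.extend(self.scribbles)
            self.scribbles.clear()
        elif key == "r":  # clear everything
            self.scribbles.clear()
            self.committed.clear()
        elif key == "d":  # switch between overlay and mask view
            self.show_overlay = not self.show_overlay
        elif key == "s":  # save masks (only counted here)
            self.save_requests += 1


state = InteractionState()
state.draw((10, 20))
state.toggle_polarity()
state.draw((30, 40))
state.handle_key("f")
print(len(state.committed), "strokes committed;", len(state.scribbles), "pending")
```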

Known issues

The model almost always needs to focus on at least one object. It is very difficult to erase all existing masks from an image using scribbles.

Training

Datasets

Download and extract the LVIS training set.
Download and extract a set of static image segmentation datasets. These are already downloaded for you if you used the download_datasets.py in Mask-Propagation.

├── lvis
│   ├── lvis_v1_train.json
│   └── train2017
├── Scribble-to-Mask
└── static
    ├── BIG_small
    └── ...
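Before training, it can be worth confirming that the layout above is in place. The following sketch checks only the entries that the tree names explicitly. The `EXPECTED` list and the `missing_paths` helper are names chosen for this example, and the `..` root assumes the script is run from inside the Scribble-to-Mask folder.

```python
from pathlib import Path

# Entries named in the layout above, relative to the shared parent directory.
EXPECTED = [
    "lvis/lvis_v1_train.json",
    "lvis/train2017",
    "Scribble-to-Mask",
    "static/BIG_small",
]


def missing_paths(root, expected=EXPECTED):
    """Return the expected entries that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumption: the current directory is Scribble-to-Mask, so the parent is "..".
    missing = missing_paths("..")
    if missing:
        for rel in missing:
            print("missing:", rel)
    else:
        print("dataset layout looks complete")
```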

Commands

Use the deeplabv3plus_resnet50 pretrained model provided here.

CUDA_VISIBLE_DEVICES=0,1 OMP_NUM_THREADS=4 python -m torch.distributed.launch --master_port 9842 --nproc_per_node=2 train.py --id s2m --load_deeplab <path_to_deeplab.pth>

Credit

Deeplab implementation and pretrained model: https://github.com/VainF/DeepLabV3Plus-Pytorch.

Citation

Please cite our paper if you find this repo useful!

@inproceedings{cheng2021mivos,
  title={Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion},
  author={Cheng, Ho Kei and Tai, Yu-Wing and Tang, Chi-Keung},
  booktitle={CVPR},
  year={2021}
}

Contact: [email protected]
