
TSD-CAM: Transformer-based Self Distillation with CAM Similarity for Weakly Supervised Semantic Segmentation



Data Preparations

VOC dataset

1. Download

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
tar -xvf VOCtrainval_11-May-2012.tar

2. Download the augmented annotations

The augmented annotations come from the SBD dataset. A download link for the augmented annotations is available at DropBox. After downloading SegmentationClassAug.zip, unzip it and move it to VOCdevkit/VOC2012. The directory structure should then be

VOCdevkit/
└── VOC2012
    ├── Annotations
    ├── ImageSets
    ├── JPEGImages
    ├── SegmentationClass
    ├── SegmentationClassAug
    └── SegmentationObject
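
A quick way to confirm the layout (a minimal sketch; the root path is an assumption, adjust it to wherever you extracted the tar):

import os

root = "VOCdevkit/VOC2012"  # adjust to your extraction path
subdirs = ["Annotations", "ImageSets", "JPEGImages",
           "SegmentationClass", "SegmentationClassAug", "SegmentationObject"]
for d in subdirs:
    path = os.path.join(root, d)
    print(path, "ok" if os.path.isdir(path) else "MISSING")

aug = os.path.join(root, "SegmentationClassAug")
if os.path.isdir(aug):
    print("augmented masks:", len(os.listdir(aug)))
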
COCO dataset

1. Download

wget http://images.cocodataset.org/zips/train2014.zip
wget http://images.cocodataset.org/zips/val2014.zip

After unzipping the downloaded files, we recommend organizing them in VOC style for convenience:

MSCOCO/
├── JPEGImages
│    ├── train
│    └── val
└── SegmentationClass
     ├── train
     └── val
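
A sketch of this reorganization (it assumes train2014/ and val2014/ were unzipped into the current directory, and moves rather than copies the images):

import shutil
from pathlib import Path

# Assumes train2014/ and val2014/ sit in the current directory.
for split, src in [("train", "train2014"), ("val", "val2014")]:
    dst = Path("MSCOCO/JPEGImages") / split
    dst.mkdir(parents=True, exist_ok=True)
    for img in Path(src).glob("*.jpg"):
        shutil.move(str(img), str(dst / img.name))
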

2. Generating VOC style segmentation labels for COCO

To generate VOC-style segmentation labels for the COCO dataset, you can use the scripts provided at this repo, or simply download the generated masks from Google Drive.
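
For reference, generating such masks typically looks like the following. This is a hedged sketch using pycocotools, not the repo's exact script; the annotation path is an assumption, and it writes one PNG per image with pixel values equal to COCO category ids:

import numpy as np
from pathlib import Path
from PIL import Image
from pycocotools.coco import COCO

split = "val"
coco = COCO("annotations/instances_val2014.json")  # assumed annotation path
out_dir = Path(f"MSCOCO/SegmentationClass/{split}")
out_dir.mkdir(parents=True, exist_ok=True)

for img_id in coco.getImgIds():
    info = coco.loadImgs(img_id)[0]
    mask = np.zeros((info["height"], info["width"]), dtype=np.uint8)
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id)):
        mask[coco.annToMask(ann) == 1] = ann["category_id"]
    Image.fromarray(mask).save(out_dir / (Path(info["file_name"]).stem + ".png"))

Note that raw COCO category ids are non-contiguous (1-90); the downloadable masks may remap them to a contiguous 1-80 range, so check which convention your labels use before training.
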

Create environment

Clone this repo

git clone https://github.com/rulixiang/toco.git
cd toco

Build Reg Loss

To use the regularized loss, download and compile the Python extension; see Here.
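
After compiling, a quick import check (this assumes the extension exposes a module named bilateralfilter, as in the rloss repository; adjust the name if your build differs):

# Assumed module name from the rloss repository.
try:
    import bilateralfilter  # noqa: F401
    print("regularized-loss extension is available")
except ImportError as err:
    print("extension not built yet:", err)
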

Train

To start training, just run:

## for VOC
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 --master_port=29501 scripts/dist_train_voc_seg_neg.py --work_dir work_dir_voc
## for COCO
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port=29501 scripts/dist_train_coco_seg_neg.py --work_dir work_dir_coco
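
If port 29501 is already in use on your machine, any free port works for --master_port; a small helper to find one (a hypothetical convenience, not part of the repo):

import socket

# Let the OS pick an unused port; pass the printed value to --master_port.
with socket.socket() as s:
    s.bind(("", 0))
    print(s.getsockname()[1])
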

Evaluation

To evaluate, run:

## for VOC
python tools/infer_seg_voc.py --model_path $model_path --backbone vit_base_patch16_224 --infer val
## for COCO
CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port=29501 tools/infer_seg_coco.py --model_path $model_path --backbone vit_base_patch16_224 --infer val
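
For reference, VOC-style mIoU is computed from a class confusion matrix roughly as follows (an illustrative sketch, not the repo's evaluation code; 21 classes, ignore label 255):

import numpy as np

NUM_CLASSES = 21  # 20 VOC foreground classes + background
IGNORE = 255      # boundary/ignore label in VOC masks

def update_hist(hist, pred, gt):
    # Accumulate a NUM_CLASSES x NUM_CLASSES confusion matrix.
    valid = gt != IGNORE
    idx = NUM_CLASSES * gt[valid].astype(int) + pred[valid].astype(int)
    return hist + np.bincount(idx, minlength=NUM_CLASSES ** 2).reshape(NUM_CLASSES, NUM_CLASSES)

def mean_iou(hist):
    inter = np.diag(hist)
    union = hist.sum(0) + hist.sum(1) - inter
    return np.nanmean(inter / np.maximum(union, 1))
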

We mainly use ViT-B and DeiT-B as the backbone, with implementations based on timm. We also use the Regularized Loss. Many thanks for their brilliant works!
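
For example, the backbones can be instantiated directly from timm (vit_base_patch16_224 matches the --backbone flag above; deit_base_patch16_224 is the DeiT-B counterpart):

import timm
import torch

backbone = timm.create_model("vit_base_patch16_224", pretrained=True)
tokens = backbone.forward_features(torch.randn(1, 3, 224, 224))
print(tokens.shape)  # token features; exact layout depends on your timm version
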
