xiaoyi728/RWVM

Enhancing Open-Vocabulary Object Detection through Region-Word and Region-Vision Matching

[Figure: overview]

Installation

This project is based on MMDetection 3.x.

It requires the following OpenMMLab packages:

  • MMEngine >= 0.6.0
  • MMCV-full >= v2.0.0rc4
  • MMDetection >= v3.0.0rc6
  • lvis-api
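A quick way to confirm that an environment satisfies these minimums is sketched below. It is only an illustration: the pip distribution names (`mmengine`, `mmcv`, `mmdet`) and the helper names `version_key` and `check_requirements` are assumptions, not part of this repository, and the version parser only understands plain releases and `rcN` release candidates.

```python
import re
from importlib import metadata

# Minimum versions copied from the list above. The pip distribution names
# ("mmengine", "mmcv", "mmdet") are assumptions; adjust them to your install.
REQUIRED = {"mmengine": "0.6.0", "mmcv": "2.0.0rc4", "mmdet": "3.0.0rc6"}

_VERSION_RE = re.compile(r"v?(\d+)\.(\d+)(?:\.(\d+))?(?:rc(\d+))?")


def version_key(version):
    """Map a string such as '2.0.0rc4' to a tuple that sorts in release order."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    major, minor, patch, rc = match.groups()
    # A final release must sort after its own release candidates.
    stage = (1, 0) if rc is None else (0, int(rc))
    return (int(major), int(minor), int(patch or 0)) + stage


def check_requirements(required=REQUIRED):
    """Return {name: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed under this distribution name.
            report[name] = (None, False)
            continue
        ok = version_key(installed) >= version_key(minimum)
        report[name] = (installed, ok)
    return report
```

Calling `check_requirements()` in the target environment returns one entry per package, so a missing or outdated dependency shows up as `False` before any training job is launched.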

Usage

Obtain CLIP Checkpoints

Our method is implemented with CLIP's ViT-B-32 model. Download the model's state_dict from Google Drive and place it under checkpoints/ as clip_vitb32.pth.

Training and Testing

Data preparation

Prepare the data following MMDetection. Obtain the json files for OV-COCO from Google Drive and put them under data/coco/yichen. The data structure looks like:

checkpoints/
├── clip_vitb32.pth
data/
├── coco
│   ├── annotations
│   │   ├── instances_{train,val}2017.json
│   ├── yichen
│   │   ├── instances_train2017_base.json
│   │   ├── instances_val2017_base.json
│   │   ├── instances_val2017_novel.json
│   │   ├── captions_train2017_tags_allcaps.json
│   ├── train2017
│   ├── val2017
│   ├── test2017
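Before launching a job it can help to verify that this layout is in place. The following is a minimal sketch, assuming it is run from the repository root; `EXPECTED` and `missing_paths` are illustrative names, and the list simply expands the tree above.

```python
from pathlib import Path

# Relative paths taken from the tree above (the {train,val} entry is expanded).
EXPECTED = [
    "checkpoints/clip_vitb32.pth",
    "data/coco/annotations/instances_train2017.json",
    "data/coco/annotations/instances_val2017.json",
    "data/coco/yichen/instances_train2017_base.json",
    "data/coco/yichen/instances_val2017_base.json",
    "data/coco/yichen/instances_val2017_novel.json",
    "data/coco/yichen/captions_train2017_tags_allcaps.json",
    "data/coco/train2017",
    "data/coco/val2017",
    "data/coco/test2017",
]


def missing_paths(root=".", expected=EXPECTED):
    """Return the expected files or directories that do not exist under root."""
    root = Path(root)
    return [rel for rel in expected if not (root / rel).exists()]
```

An empty list from `missing_paths()` means every file and directory named in the tree was found; anything else is the list of entries still to be downloaded or generated.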

Alternatively, generate the json files with the following scripts:

python tools/pre_processors/keep_coco_base.py \
      --json_path data/coco/annotations/instances_train2017.json \
      --out_path data/coco/yichen/instances_train2017_base.json
python tools/pre_processors/keep_coco_base.py \
      --json_path data/coco/annotations/instances_val2017.json \
      --out_path data/coco/yichen/instances_val2017_base.json
python tools/pre_processors/keep_coco_novel.py \
      --json_path data/coco/annotations/instances_val2017.json \
      --out_path data/coco/yichen/instances_val2017_novel.json
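The scripts themselves are not reproduced here. As a rough illustration of what a base/novel split of a COCO annotation file involves, the sketch below keeps only the categories whose names are passed in, together with their annotations. The function names, the `keep_names` argument, and the decision to keep every image entry are assumptions made for this example; they are not necessarily how `keep_coco_base.py` or `keep_coco_novel.py` behave.

```python
import json


def filter_coco_categories(coco, keep_names):
    """Return a copy of a COCO-style dict restricted to the named categories."""
    keep_names = set(keep_names)
    categories = [c for c in coco["categories"] if c["name"] in keep_names]
    keep_ids = {c["id"] for c in categories}
    annotations = [a for a in coco["annotations"] if a["category_id"] in keep_ids]
    # Shallow copy: "images", "info" and other top-level keys are kept as-is
    # (an assumption; a real script may also drop images left without boxes).
    filtered = dict(coco)
    filtered["categories"] = categories
    filtered["annotations"] = annotations
    return filtered


def filter_coco_file(json_path, out_path, keep_names):
    """Read a COCO json file, filter it, and write the result to out_path."""
    with open(json_path) as f:
        coco = json.load(f)
    with open(out_path, "w") as f:
        json.dump(filter_coco_categories(coco, keep_names), f)
```

The category names for the base and novel sets are deliberately left as an input, since the authoritative lists live in the repository's pre-processing scripts.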

The json file for caption supervision, captions_train2017_tags_allcaps.json, is obtained following Detic. Put it under data/coco/yichen.

Training

RWM training

Train the detector based on FasterRCNN+ResNet50C4.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 \
./tools/train.py configs/rwvm/ov_coco/rwvm_kd_faster_rcnn_r50_caffe_c4_90k.py --launcher pytorch

RVM training

Train the detector based on FasterRCNN+ResNet50C4.

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 \
./tools/train.py configs/rwvm/ov_coco/rwvm_kd_faster_rcnn_r50_caffe_c4_90k.py --launcher pytorch
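Both training commands hard-code four GPUs, and `--nproc_per_node` has to match the number of devices listed in `CUDA_VISIBLE_DEVICES`. When the GPU count varies between machines, a small wrapper can keep the two in sync. The helper below is a hypothetical sketch (`build_launch` is not part of this repository); it only assembles the same flags that appear in the commands above.

```python
import os
import sys


def build_launch(gpu_ids, config, train_script="./tools/train.py"):
    """Return (env, argv) for a single-node launch with one process per GPU."""
    if not gpu_ids:
        raise ValueError("at least one GPU id is required")
    # Expose only the requested devices to the launched processes.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=",".join(str(g) for g in gpu_ids))
    argv = [
        sys.executable, "-m", "torch.distributed.launch",
        f"--nproc_per_node={len(gpu_ids)}",  # one worker process per visible GPU
        train_script, config, "--launcher", "pytorch",
    ]
    return env, argv
```

The returned pair can be passed to `subprocess.run(argv, env=env, check=True)` to start training with, for example, `gpu_ids=[0, 1]` on a two-GPU machine.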

Testing

OV-COCO

The implementation based on MMDetection 3.x achieves better results than those reported in the paper. To test the models, run

python ./tools/test.py \
path/to/the/cfg/file path/to/the/checkpoint

Acknowledgment

We thank the authors and contributors of BARON and MMDetection.
