Learning Object Placement via Convolution Scoring Attention

Yibin Wang†, Yuchao Feng, Jianwei Zheng

(†corresponding author)

[Zhejiang University of Technology]

BMVC 2024


⏬ Download Pre-trained Models

We provide models for TERSE (CVPR 2019) [arXiv], PlaceNet (ECCV 2020) [arXiv], GracoNet (ECCV 2022) [arXiv], CA-GAN (ICME 2023, Oral) [paper], and our CSANet:

method     FID    LPIPS  url of model & logs
TERSE      46.88  0      baidu disk (code: zkk8)
PlaceNet   37.01  0.161  baidu disk (code: rap8)
GracoNet   28.10  0.207  baidu disk (code: cayr)
CA-GAN     23.21  0.268  baidu disk (code: 90yf)
CSANet     20.88  0.274  baidu disk (code: l0e6)

🔧 Environment Setup

Install Python 3.6 and PyTorch 1.9.1 (requires CUDA >= 10.2):

conda install pytorch==1.9.1 torchvision==0.10.1 torchaudio==0.9.1 cudatoolkit=10.2 -c pytorch
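
To sanity-check the installation (a minimal check, assuming the conda environment above is active):

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

This should print 1.9.1 and True on a machine with a working CUDA setup.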

🌓 Data preparation

Download and extract the OPA dataset from the official link: Google Drive. We expect the directory structure to be the following:

<PATH_TO_OPA>
  background/       # background images
  foreground/       # foreground images with masks
  composite/        # composite images with masks
  train_set.csv     # train annotation
  test_set.csv      # test annotation

Then run the preprocessing script:

python tool/preprocess.py --data_root <PATH_TO_OPA>

You will see some new files and directories:

<PATH_TO_OPA>
  com_pic_testpos299/          # test set positive composite images (resized to 299)
  train_data.csv               # transformed train annotation
  train_data_pos.csv           # train annotation for positive samples
  test_data.csv                # transformed test annotation
  test_data_pos.csv            # test annotation for positive samples
  test_data_pos_unique.csv     # test annotation for positive samples with different fg/bg pairs 
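
To peek at the transformed annotations, a quick pandas one-liner suffices (a sketch, assuming pandas is installed; the exact columns are whatever tool/preprocess.py emits):

python -c "import pandas as pd; df = pd.read_csv('<PATH_TO_OPA>/train_data.csv'); print(len(df), 'rows'); print(df.columns.tolist())"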

💻 Training

To train CSANet on a single RTX 3090 GPU with batch size 32 for 18 epochs, run:

python main.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME>

To reproduce the baseline models, simply replace main.py with main_terse.py / main_placenet.py / main_graconet.py / main_CA-GAN.py, for example as in the sketch below.
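
A small shell loop can train all four baselines with the same settings (a hypothetical convenience wrapper around the commands above; the --expid names are arbitrary):

# train each baseline under its own experiment name
for script in main_terse.py main_placenet.py main_graconet.py main_CA-GAN.py; do
    python $script --data_root <PATH_TO_OPA> --expid baseline_${script%.py}
done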

To monitor the losses during training, use TensorBoard:

tensorboard --logdir result/<YOUR_EXPERIMENT_NAME>/tblog --port <YOUR_SPECIFIED_PORT>

🔥 Inference

To predict composite images from a trained CSANet model, run:

python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type evaluni --repeat 10
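
To compare several checkpoints of one experiment, the deterministic pass can be looped (a sketch; the epoch numbers below are illustrative):

# evaluate a few intermediate checkpoints
for ep in 6 12 18; do
    python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch $ep --eval_type eval
done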

If you want to run inference with the baseline models, just replace infer.py with infer_terse.py / infer_placenet.py / infer_graconet.py / infer_CA-GAN.py.

You can also use our provided models directly. For example, to run inference with our best CSANet model: 1) download CSANet.zip from the table above, 2) place it under result/ and uncompress it:

mv path/to/your/downloaded/CSANet.zip result/CSANet.zip
cd result
unzip CSANet.zip
cd ..

and 3) run:

python infer.py --data_root <PATH_TO_OPA> --expid CSANet --epoch 18 --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid CSANet --epoch 18 --eval_type evaluni --repeat 10

The procedure for inferring with our provided baseline models is similar. Remember to use --epoch 11 for TERSE and GracoNet, --epoch 9 for PlaceNet, and --epoch 15 for CA-GAN.
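
Assuming each downloaded archive is uncompressed under result/ into a folder named after its method (as CSANet.zip is above), all provided baselines can be inferred in one go (a sketch; adjust the --expid values to your actual folder names):

python infer_terse.py --data_root <PATH_TO_OPA> --expid TERSE --epoch 11 --eval_type eval
python infer_placenet.py --data_root <PATH_TO_OPA> --expid PlaceNet --epoch 9 --eval_type eval
python infer_graconet.py --data_root <PATH_TO_OPA> --expid GracoNet --epoch 11 --eval_type eval
python infer_CA-GAN.py --data_root <PATH_TO_OPA> --expid CA-GAN --epoch 15 --eval_type eval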

🌈 Evaluation

To evaluate FID score, run:

sh script/eval_fid.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE> <PATH_TO_OPA/com_pic_testpos299>

To evaluate LPIPS score, run:

sh script/eval_lpips.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE>
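
To run both metrics for one experiment in a single step (a hypothetical wrapper around the two scripts above):

# evaluate FID and LPIPS back to back
EXP=<YOUR_EXPERIMENT_NAME>
EP=<EPOCH_TO_EVALUATE>
sh script/eval_fid.sh $EXP $EP <PATH_TO_OPA>/com_pic_testpos299
sh script/eval_lpips.sh $EXP $EP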

🙏 Acknowledgements

Some of the evaluation code in this repo is borrowed and modified from Faster-RCNN-VG, OPA, FID-Pytorch, GracoNet, and Perceptual Similarity. We thank the authors for their great work.

🖊️ BibTeX

If you find CSANet useful or relevant to your research, please kindly cite our paper:

@inproceedings{wang2024csanet,
  title={Learning Object Placement via Convolution Scoring Attention},
  author={Wang, Yibin and Feng, Yuchao and Zheng, Jianwei},
  booktitle={BMVC},
  pages={1--13},
  year={2024}
}

📧 Contact

If you have any technical comments or questions, please open a new issue or feel free to contact Yibin Wang.
