Yibin Wang*, Yuchao Feng, Jianwei Zheng†
(†corresponding author)
Zhejiang University of Technology
BMVC 2024
We provide models for TERSE (CVPR 2019) [arXiv], PlaceNet (ECCV 2020) [arXiv], GracoNet (ECCV 2022) [arXiv], CA-GAN (ICME 2023, Oral) [paper], and our CSANet:
| method | FID ↓ | LPIPS ↑ | model & logs |
| --- | --- | --- | --- |
| TERSE | 46.88 | 0 | baidu disk (code: zkk8) |
| PlaceNet | 37.01 | 0.161 | baidu disk (code: rap8) |
| GracoNet | 28.10 | 0.207 | baidu disk (code: cayr) |
| CA-GAN | 23.21 | 0.268 | baidu disk (code: 90yf) |
| CSANet | 20.88 | 0.274 | baidu disk (code: l0e6) |
Install Python 3.6 and PyTorch 1.9.1 (requires CUDA >= 10.2):

```sh
conda install pytorch==1.9.1 torchvision==0.10.1 torchaudio==0.9.1 cudatoolkit=10.2 -c pytorch
```
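As a quick sanity check (a minimal sketch, assuming the environment above is active), verify that PyTorch was installed correctly and can see your GPU:

```sh
# Print the PyTorch version and whether CUDA is available
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```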
Download and extract the OPA dataset from the official link: google drive. We expect the directory structure to be the following:

```
<PATH_TO_OPA>
  background/       # background images
  foreground/       # foreground images with masks
  composite/        # composite images with masks
  train_set.csv     # train annotation
  test_set.csv      # test annotation
```
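Optionally, confirm the layout before preprocessing (the paths below simply mirror the tree above):

```sh
# The three image directories should exist, and the CSV should start with a header row
ls -d <PATH_TO_OPA>/background <PATH_TO_OPA>/foreground <PATH_TO_OPA>/composite
head -n 2 <PATH_TO_OPA>/train_set.csv
```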
Then, run the preprocessing script:

```sh
python tool/preprocess.py --data_root <PATH_TO_OPA>
```
You will see some new files and directories:

```
<PATH_TO_OPA>
  com_pic_testpos299/       # test-set positive composite images (resized to 299)
  train_data.csv            # transformed train annotation
  train_data_pos.csv        # train annotation for positive samples
  test_data.csv             # transformed test annotation
  test_data_pos.csv         # test annotation for positive samples
  test_data_pos_unique.csv  # test annotation for positive samples with different fg/bg pairs
```
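To confirm that preprocessing succeeded, you can count the generated rows and images (this assumes nothing beyond the files listed above):

```sh
# Row counts include the CSV header line
wc -l <PATH_TO_OPA>/train_data.csv <PATH_TO_OPA>/test_data_pos_unique.csv
ls <PATH_TO_OPA>/com_pic_testpos299 | wc -l
```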
To train CSANet on a single 3090 GPU with batch size 32 for 18 epochs, run:

```sh
python main.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME>
```
To reproduce the baseline models, simply replace `main.py` with `main_terse.py` / `main_placenet.py` / `main_graconet.py` / `main_CA-GAN.py` for training (see the sketch after this paragraph).
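For example, a minimal shell loop that trains all four baselines in sequence (assuming the baseline scripts accept the same arguments as `main.py`; the experiment names here are placeholders):

```sh
# Train each baseline with the same data root; ${script#main_} strips the
# "main_" prefix to form a per-baseline experiment name
for script in main_terse main_placenet main_graconet main_CA-GAN; do
    python ${script}.py --data_root <PATH_TO_OPA> --expid ${script#main_}
done
```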
To monitor the losses during training, use TensorBoard:

```sh
tensorboard --logdir result/<YOUR_EXPERIMENT_NAME>/tblog --port <YOUR_SPECIFIED_PORT>
```
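If you train on a remote server, a standard SSH port forward (the hostname below is a placeholder) lets you open the dashboard in a local browser:

```sh
# Forward the remote TensorBoard port to localhost
ssh -L <YOUR_SPECIFIED_PORT>:localhost:<YOUR_SPECIFIED_PORT> user@remote-host
```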
To predict composite images with a trained CSANet model, run:

```sh
python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid <YOUR_EXPERIMENT_NAME> --epoch <EPOCH_TO_EVALUATE> --eval_type evaluni --repeat 10
```

Here `eval` generates one composite per test sample, while `evaluni` samples each unique fg/bg pair `--repeat` times (used for the LPIPS diversity evaluation).
To infer the baseline models, simply replace `infer.py` with `infer_terse.py` / `infer_placenet.py` / `infer_graconet.py` / `infer_CA-GAN.py`.
You can also use our provided models directly. For example, to infer with our best CSANet model, please 1) download the `CSANet.zip` given above, 2) place it under `result` and uncompress it:

```sh
mv path/to/your/downloaded/CSANet.zip result/CSANet.zip
cd result
unzip CSANet.zip
cd ..
```
and 3) run:

```sh
python infer.py --data_root <PATH_TO_OPA> --expid CSANet --epoch 18 --eval_type eval
python infer.py --data_root <PATH_TO_OPA> --expid CSANet --epoch 18 --eval_type evaluni --repeat 10
```
The procedure for inferring our provided baseline models is similar. Remember to use `--epoch 11` for TERSE and GracoNet, `--epoch 9` for PlaceNet, and `--epoch 15` for CA-GAN, as in the loop below.
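For reference, here is a sketch that runs `eval`-type inference for all four provided baselines at their released epochs (it assumes the uncompressed folders under `result` keep the names `TERSE`, `PlaceNet`, `GracoNet`, and `CA-GAN`):

```sh
# Each entry is: script suffix, experiment (folder) name, released epoch
for entry in "terse TERSE 11" "placenet PlaceNet 9" "graconet GracoNet 11" "CA-GAN CA-GAN 15"; do
    set -- $entry
    python infer_$1.py --data_root <PATH_TO_OPA> --expid $2 --epoch $3 --eval_type eval
done
```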
To evaluate the FID score, run:

```sh
sh script/eval_fid.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE> <PATH_TO_OPA/com_pic_testpos299>
```

To evaluate the LPIPS score, run:

```sh
sh script/eval_lpips.sh <YOUR_EXPERIMENT_NAME> <EPOCH_TO_EVALUATE>
```
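If you want to cross-check the FID number independently, the standalone `pytorch-fid` package can compare two image folders. Note that the directory of generated composites below is a hypothetical path (check where `infer.py` writes its outputs in your setup), and preprocessing differences may shift the score slightly:

```sh
pip install pytorch-fid
# result/<YOUR_EXPERIMENT_NAME>/eval is a placeholder output path -- adjust it
python -m pytorch_fid result/<YOUR_EXPERIMENT_NAME>/eval <PATH_TO_OPA>/com_pic_testpos299
```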
Some of the evaluation code in this repo is borrowed and modified from Faster-RCNN-VG, OPA, FID-Pytorch, GracoNet, and Perceptual Similarity. We thank the authors for their great work.
If you find CSANet useful or relevant to your research, please kindly cite our paper:

```
@inproceedings{face-diffuser,
  title={High-fidelity Person-centric Subject-to-Image Synthesis},
  author={Wang, Yibin and Feng, Yuchao and Zheng, Jianwei},
  booktitle={BMVC},
  pages={1--13},
  year={2024}
}
```
If you have any technical comments or questions, please open a new issue or feel free to contact Yibin Wang.