Official implementation of the paper "Spatial Content Alignment For Pose Transfer", ICME 2021 (Oral)
Wing-Yin Yu, Lai-Man Po, Yuzhi Zhao, Jingjing Xiong, Kin-Wai Lau.
Department of Electrical Engineering, City University of Hong Kong
Due to unreliable geometric matching and content misalignment, most conventional pose transfer algorithms fail to generate fine-grained person images. In this paper, we propose a novel framework, Spatial Content Alignment GAN (SCA-GAN), which aims to enhance the content consistency of garment textures and the details of human characteristics. We first alleviate the spatial misalignment by transferring the edge content to the target pose in advance. Second, we introduce a new Content-Style DeBlk that progressively synthesizes photo-realistic person images based on the appearance features of the source image, the target pose heatmap and the prior transferred content in the edge domain. We compare the proposed framework with several state-of-the-art methods to show its superiority in both quantitative and qualitative analysis. Moreover, detailed ablation study results demonstrate the efficacy of our contributions.
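At a glance, the two stages fit together as follows. This is only a conceptual sketch of the pipeline described above; the names `pose_transfer`, `pct_net`, `is_net` and `edge_detector` are illustrative placeholders, not the repository's actual API.

```python
# Conceptual sketch of the two-stage SCA-GAN pipeline (names are illustrative).
def pose_transfer(source_image, source_pose, target_pose,
                  pct_net, is_net, edge_detector):
    # Phase 1 (PCT-Net): transfer the edge content to the target pose in
    # advance, which alleviates spatial misalignment.
    source_edge = edge_detector(source_image)      # e.g. an XDoG edge map
    target_edge = pct_net(source_edge, source_pose, target_pose)

    # Phase 2 (IS-Net): progressively synthesize the person image from the
    # source appearance, the target pose heatmap and the transferred edge.
    return is_net(source_image, target_pose, target_edge)
```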
Our generated images on the test set can be downloaded from Google Drive or OneDrive.
- Python 3.7
- PyTorch 1.4
git clone https://github.com/rocketappslab/SCA-GAN.git
cd SCA-GAN
pip install -r requirements.txt
- Download the resized train/test images from Google Drive or OneDrive. Put these two folders under the `fashion_data` directory.
- Download the train/test pairs and train/test key point annotations from Google Drive or OneDrive, including `fasion-resize-pairs-train.csv`, `fasion-resize-pairs-test.csv`, `fasion-resize-annotation-train.csv` and `fasion-resize-annotation-test.csv`. Put these four files under the `fashion_data` directory.
- Generate the pose heatmaps (a sketch of the heatmap format follows this list). Note that the generated heatmaps are extremely large (~160 GB for DeepFashion). Launch `python tool/generate_pose_map_fashion.py`.
- (For training only) Download the edge maps from Google Drive or OneDrive. Put `trainE` under the `fashion_data` directory. Alternatively, you can generate the edge maps yourself with the XDoG method (see the sketch after the dataset tree below); other edge detection methods are not fully tested.
- (For training only) Download `vgg19-dcbb9e9d.pth` and `vgg_conv.pth` from Google Drive or OneDrive for the perceptual loss and context loss. Put these two files under the `fashion_data` directory.
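For reference, here is a minimal sketch of what the pose heatmaps look like: one Gaussian channel per keypoint, saved as `.npy`. The image size `(256, 176)`, the `sigma` value and the missing-joint convention are illustrative assumptions; consult `tool/generate_pose_map_fashion.py` for the exact parameters.

```python
# Minimal sketch: keypoint annotations -> per-joint Gaussian heatmaps.
# Image size, sigma and the (y, x) / -1 conventions are assumptions.
import numpy as np

def cords_to_map(keypoints, img_size=(256, 176), sigma=6.0):
    """keypoints: (18, 2) array of (y, x); missing joints marked as -1."""
    h, w = img_size
    heatmaps = np.zeros((h, w, len(keypoints)), dtype=np.float32)
    yy, xx = np.mgrid[0:h, 0:w]
    for i, (y, x) in enumerate(keypoints):
        if y < 0 or x < 0:  # skip missing keypoints
            continue
        heatmaps[..., i] = np.exp(-((yy - y) ** 2 + (xx - x) ** 2)
                                  / (2 * sigma ** 2))
    return heatmaps  # one Gaussian channel per keypoint, saved via np.save
```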
The recommended dataset structure is:
+--fashion_data
| +--train
| +-- e.g. fashionMENDenimid0000008001_1front.jpg
| +--test
| +-- e.g. fashionMENDenimid0000056501_1front.jpg
| +--trainK
| +-- e.g. fashionMENDenimid0000008001_1front.jpg.npy
| +--testK
| +-- e.g. fashionMENDenimid0000056501_1front.jpg.npy
| +--trainE
| +-- diff
| +-- e.g. MENDenimid0000008001_1front.png
| +--fasion-resize-pairs-train.csv
| +--fasion-resize-pairs-test.csv
| +--fasion-resize-annotation-train.csv
| +--fasion-resize-annotation-test.csv
| +--train.lst
| +--test.lst
| +--vgg19-dcbb9e9d.pth
| +--vgg_conv.pth
...
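If you prefer to generate the `trainE` edge maps yourself, the following is a minimal XDoG (eXtended Difference-of-Gaussians) sketch. The parameter values are typical XDoG defaults, not necessarily the ones used for the released edge maps, so tune them and verify the output against the downloaded maps; the results would go under `fashion_data/trainE/diff` as `.png` files.

```python
# Minimal XDoG sketch; sigma/k/tau/epsilon/phi are typical defaults, not the
# authors' exact settings.
import numpy as np
from scipy.ndimage import gaussian_filter

def xdog(gray, sigma=0.8, k=1.6, tau=0.98, epsilon=-0.1, phi=200.0):
    """gray: 2-D float array in [0, 1]. Returns a uint8 edge map."""
    g1 = gaussian_filter(gray, sigma)        # narrow Gaussian blur
    g2 = gaussian_filter(gray, sigma * k)    # wide Gaussian blur
    d = g1 - tau * g2                        # scaled difference of Gaussians
    # Soft threshold: white where d >= epsilon, smooth dark edges elsewhere.
    e = np.where(d >= epsilon, 1.0, 1.0 + np.tanh(phi * (d - epsilon)))
    return (np.clip(e, 0.0, 1.0) * 255).astype(np.uint8)
```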
Phase 1 - Prior Content Transfer Network (PCT-Net)
- (Option 1) Train a PCT-Net from scratch. Launch `sh scripts/train_pctnet.sh`.
- (Option 2) Download the pretrained PCT-Net from Google Drive or OneDrive. Extract the folder under the `checkpoints` directory.
The recommended file structure is:
+--checkpoints
| +--scagan_pctnet
| +--latest_netG.pth
...
Phase 2 - Image Synthesis Network (IS-Net)
- Train an IS-Net from scratch. Launch `sh scripts/train_isnet.sh`.
- (Option 1) Once you have finished training the IS-Net, you can run inference by launching `sh scripts/test_isnet.sh`. The generated images are located in the `result` folder.
- (Option 2) Alternatively, download our pretrained model from Google Drive or OneDrive, then launch the command from Option 1.
The recommended file structure is:
+--checkpoints
| +--scagan_isnet
| +--latest_netG.pth
...
We provide a script that evaluates the IS, SSIM, FID and LPIPS metrics. TensorFlow 1.4.1 (Python 3) is required for evaluation.
Note: you need to modify the path `generated_images_dir = YourImagesPath` in the script. Then launch `python tool/getMetrics_fashion.py`.
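If you only want a quick sanity check before setting up the TensorFlow environment, a minimal SSIM computation with scikit-image (>= 0.19) might look like the sketch below. It is not the script used for the reported numbers, and it assumes generated and ground-truth images share filenames; adapt the globbing to your layout.

```python
# Quick SSIM sanity check, independent of tool/getMetrics_fashion.py.
# Directory paths and the shared-filename assumption are placeholders.
import numpy as np
from pathlib import Path
from PIL import Image
from skimage.metrics import structural_similarity

def mean_ssim(generated_dir, target_dir):
    scores = []
    for gen_path in sorted(Path(generated_dir).glob('*.jpg')):
        tgt_path = Path(target_dir) / gen_path.name
        gen = np.asarray(Image.open(gen_path).convert('RGB'))
        tgt = np.asarray(Image.open(tgt_path).convert('RGB'))
        scores.append(structural_similarity(gen, tgt, channel_axis=-1))
    return float(np.mean(scores))
```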
If you use this code for your research, please cite our paper.
@article{yu2021spatial,
title={Spatial Content Alignment For Pose Transfer},
author={Yu, Wing-Yin and Po, Lai-Man and Zhao, Yuzhi and Xiong, Jingjing and Lau, Kin-Wai},
journal={arXiv preprint arXiv:2103.16828},
year={2021}
}
Our code is based on PATN and ADGAN. Thanks for their great work.