PyTorch implementation for reproducing StackGAN_v2 results in the paper StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang*, Tao Xu*, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.
Note: The code has been updated for Python 3. Thank you to David Stap for helping upgrade the original StackGAN-v2 file. Also, my computer occasionally shut down at random during training. I think this was because the GPU was drawing too much power, so be aware of this.
Python 3.6+
PyTorch 1.1.0+
In addition, please add the project folder to PYTHONPATH and pip install
the following packages:
tensorboardX
python-dateutil
easydict
pandas
torchfile
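As a quick sanity check before training, the package list above can be verified with a short script. This is a minimal sketch, not part of the repository: the function name `find_missing` is hypothetical, and the only non-obvious detail is that python-dateutil is imported as `dateutil`.

```python
import importlib.util

# Import names for the packages listed above. python-dateutil is
# imported as "dateutil"; the other import names match the pip names.
REQUIRED_MODULES = {
    "tensorboardX": "tensorboardX",
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in modules.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All listed packages are importable.")
```

The check only looks the modules up without importing them, so it is cheap to run and does not touch the GPU.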
Data
- Download our preprocessed char-CNN-RNN text embeddings for birds and save them to
data/
- [Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
- Download the birds image data. Extract the images to
data/birds/
- Download ImageNet dataset and extract the images to
data/imagenet/
- Download LSUN dataset and save the images to
data/lsun
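The directory layout described above can be checked before launching a run. The following is a small sketch under the assumption that it is run from the project root; the helper name `missing_data_dirs` is hypothetical, and only the datasets you intend to train on actually need to be present.

```python
from pathlib import Path

# Directories named in the list above, relative to the project root.
EXPECTED_DIRS = ["data/birds", "data/imagenet", "data/lsun"]


def missing_data_dirs(project_root=".", expected=EXPECTED_DIRS):
    """Return the expected data directories that do not exist yet."""
    root = Path(project_root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    for d in missing_data_dirs():
        print(f"missing: {d}")
```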
Training
- Train a StackGAN-v2 model on the bird (CUB) dataset using our preprocessed embeddings:
python main.py --cfg cfg/birds_3stages.yml --gpu 0
- Train a StackGAN-v2 model on the ImageNet dog subset:
python main.py --cfg cfg/dog_3stages_color.yml --gpu 0
- Train a StackGAN-v2 model on the ImageNet cat subset:
python main.py --cfg cfg/cat_3stages_color.yml --gpu 0
- Train a StackGAN-v2 model on the LSUN bedroom subset:
python main.py --cfg cfg/bedroom_3stages_color.yml --gpu 0
- Train a StackGAN-v2 model on the LSUN church subset:
python main.py --cfg cfg/church_3stages_color.yml --gpu 0
- The *.yml files are example configuration files for training and evaluating our models.
- If you want to try your own datasets, here are some good tips about how to train GANs. Also, we encourage you to try different hyper-parameters and architectures, especially for more complex datasets.
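The training commands above differ only in the config file, so they can be assembled from a small lookup table. This is an illustrative sketch rather than a script shipped with the repository: `build_command` and `run_training` are hypothetical helpers, and `sys.executable` is used in place of `python` so that the same interpreter launches `main.py`.

```python
import subprocess
import sys

# Config files taken from the training commands listed above.
CONFIGS = {
    "birds": "cfg/birds_3stages.yml",
    "dog": "cfg/dog_3stages_color.yml",
    "cat": "cfg/cat_3stages_color.yml",
    "bedroom": "cfg/bedroom_3stages_color.yml",
    "church": "cfg/church_3stages_color.yml",
}


def build_command(dataset, gpu=0):
    """Build the argument list for training on one of the datasets above."""
    return [sys.executable, "main.py", "--cfg", CONFIGS[dataset], "--gpu", str(gpu)]


def run_training(dataset, gpu=0):
    """Launch training from the project root; raises if main.py exits non-zero."""
    subprocess.run(build_command(dataset, gpu), check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, as a dry run.
    for name in CONFIGS:
        print(" ".join(build_command(name)))
```

Calling `run_training("birds")` from the project root should be equivalent to the first command in the list, assuming the data and dependencies are in place.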
Pretrained Model
- StackGAN-v2 for bird. Download and save it to
models/
(The inception score for this model is 4.04±0.05)
- StackGAN-v2 for dog. Download and save it to
models/
(The inception score for this model is 9.55±0.11)
- StackGAN-v2 for cat. Download and save it to
models/
- StackGAN-v2 for bedroom. Download and save it to
models/
- StackGAN-v2 for church. Download and save it to
models/
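For readers unfamiliar with the inception scores quoted above, the metric is commonly defined as exp(E_x[KL(p(y|x) || p(y))]), where p(y|x) is a classifier's predicted class distribution for a generated image and p(y) is the average of those distributions. The sketch below computes that quantity from already-computed class probabilities in plain Python. It is not the evaluation code used for the reported numbers, and it does not include the classifier itself. Values written as mean±deviation are typically obtained by repeating the computation over several splits of the samples, though the exact protocol is not stated here.

```python
import math


def inception_score(probs):
    """Compute exp(mean_x KL(p(y|x) || p(y))) from per-image class probabilities.

    probs: list of rows, each row a probability distribution over classes.
    """
    n = len(probs)
    num_classes = len(probs[0])
    # Marginal class distribution p(y), averaged over all images.
    marginal = [sum(row[k] for row in probs) / n for k in range(num_classes)]
    # Mean KL divergence between each p(y|x) and p(y).
    # Terms with p = 0 contribute nothing and are skipped.
    mean_kl = sum(
        p * math.log(p / marginal[k])
        for row in probs
        for k, p in enumerate(row)
        if p > 0
    ) / n
    return math.exp(mean_kl)
```

The score is 1 when every image receives the same prediction and rises toward the number of classes as predictions become both confident and diverse.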
Evaluating
- Run
python main.py --cfg cfg/eval_birds.yml --gpu 1
to generate samples from captions in the birds validation set.
- Change the
eval_*.yml
files to generate images from other pre-trained models.
Examples generated by StackGAN-v2
t-SNE visualization of randomly generated birds, dogs, cats, churches and bedrooms
If you find StackGAN useful in your research, please consider citing:
@article{Han17stackgan2,
author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
title = {StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks},
journal = {arXiv: 1710.10916},
year = {2017},
}
@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}
Our follow-up work
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [Supplementary][code]