This is the PyTorch implementation of JoJoGAN: One Shot Face Stylization.
Abstract:
While there have been recent advances in few-shot image stylization, these methods fail to capture stylistic details that are obvious to humans. Details such as the shape of the eyes and the boldness of the lines are difficult for a model to learn, especially under a limited data setting. In this work, we aim to perform one-shot image stylization that gets the details right. Given a reference style image, we approximate paired real data using GAN inversion and finetune a pretrained StyleGAN using that approximate paired data. We then encourage the StyleGAN to generalize so that the learned style can be applied to all other images.
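For intuition only, here is a toy sketch of the finetune-on-approximate-pairs idea from the abstract. Everything in it (TinyGenerator, the random "inverted" latent w_ref, the 64x64 image) is a stand-in of my own; the real pipeline inverts the style reference with an actual encoder and finetunes the pretrained StyleGAN2 generator.

# Toy sketch of the idea above: treat an "inverted" latent for the style reference as
# one approximate paired example, then finetune the generator so it reproduces the
# style while still responding to nearby latents. TinyGenerator, w_ref, and the toy
# style image are illustrative stand-ins, not the models or data used in this repo.
import torch
import torch.nn as nn
import lpips  # pip install lpips

class TinyGenerator(nn.Module):
    """Stand-in for the pretrained StyleGAN2 generator."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, 3 * 64 * 64), nn.Tanh(),
        )
    def forward(self, w):
        return self.net(w).view(-1, 3, 64, 64)

G = TinyGenerator()
percept = lpips.LPIPS(net="vgg")               # perceptual loss, images in [-1, 1]
style_img = torch.rand(1, 3, 64, 64) * 2 - 1   # the single style reference (toy data)
w_ref = torch.randn(1, 64)                     # stand-in for the GAN-inversion latent

opt = torch.optim.Adam(G.parameters(), lr=2e-3)
for step in range(200):
    # Jitter the reference latent so the learned style generalizes beyond one code.
    w = w_ref + 0.1 * torch.randn_like(w_ref)
    loss = percept(G(w), style_img).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()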
Follow this YouTube tutorial to understand the installation process more easily, and if you have any questions feel free to join my Discord and ask there. The code is mostly taken from the official Google Colab and modified for local use.
Step 0: Download Anaconda
Download this repository
Step 1:
conda create -n jojo python=3.7
conda activate jojo
cd <your codes file directory here>
Step 2 option 1: 30-series NVIDIA GPU
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch
Step 2 option 2: non-30-series NVIDIA GPU
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
Step 2 option 3: CPU only (no NVIDIA GPU)
conda install pytorch torchvision torchaudio cpuonly -c pytorch
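If you want to confirm the install picked up the right backend before moving on, a quick one-off check in Python (not part of the repo) looks like this:

# Quick sanity check that PyTorch is installed and sees (or intentionally skips) the GPU.
import torch
print("torch:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())  # True for the GPU installs, False for cpuonly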
Step 3
pip install -r requirements.txt
pip install dlib
conda install -c conda-forge ffmpeg
checkpoints:
- stylegan2-ffhq-config-f.pt
- e4e_ffhq_encode.pt
- restyle_psp_ffhq_encode.pt
- dlibshape_predictor_68_face_landmarks.dat
pretrained style models (optional):
- arcane_caitlyn.pt
- arcane_caitlyn_preserve_color.pt
- arcane_jinx_preserve_color.pt
- arcane_jinx.pt
- arcane_multi_preserve_color.pt
- arcane_multi.pt
- sketch_multi.pt
- disney.pt
- disney_preserve_color.pt
- jojo.pt
- jojo_preserve_color.pt
- jojo_yasuho.pt
- jojo_yasuho_preserve_color.pt
- art.pt
model structure
📂JoJoGAN/ # this is root
├── 📂models/
│ ├── 📜stylegan2-ffhq-config-f.pt
│ ├── 📜e4e_ffhq_encode.pt
│ ├── 📜restyle_psp_ffhq_encode.pt
│ ├── 📜dlibshape_predictor_68_face_landmarks.dat
│ ├── 📜<any pretrained style models>
│ │...
│...
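Optionally, a small helper of my own (not part of the repo) can confirm the checkpoints landed in the right place; the file names below are the required ones from the checkpoints list above.

# check_models.py (convenience sketch, not included in the repo).
# Run it from the JoJoGAN root; it verifies the required checkpoints exist under models/.
from pathlib import Path

required = [
    "stylegan2-ffhq-config-f.pt",
    "e4e_ffhq_encode.pt",
    "restyle_psp_ffhq_encode.pt",
    "dlibshape_predictor_68_face_landmarks.dat",
]
missing = [name for name in required if not (Path("models") / name).is_file()]
print("All required checkpoints found." if not missing else f"Missing from models/: {missing}")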
Download the pretrained style model and put it under the models folder as shown in the diagram above. Put the input image in the test_input folder; for image_name below, you only need to provide the file name, not the file path.
python evaluate.py --input <image_name> --model_name <model_name> --seed <random_seed> --device <cuda/cpu>
e.g.
python evaluate.py --device cuda --input iu.jpeg --model_name jojo --seed 3000
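If evaluation fails because no face is found in your input, you can check what dlib sees with a standalone snippet like the one below. It is illustrative only, not the repo's own alignment code; it assumes the predictor path from the model structure above and the iu.jpeg input from the example.

# Standalone check: does dlib detect a face and its 68 landmarks in the input image?
import dlib

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("models/dlibshape_predictor_68_face_landmarks.dat")

img = dlib.load_rgb_image("test_input/iu.jpeg")  # path taken from the example above
faces = detector(img, 1)
print(f"Detected {len(faces)} face(s)")
for face in faces:
    shape = predictor(img, face)
    print("First landmark:", shape.part(0).x, shape.part(0).y)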
Put the input video in the test_input folder; for video_name below, you only need to provide the file name, not the file path.
python evaluate.py --input <video_name> --model_name <model_name> --seed <random_seed> --device <cuda/cpu>
e.g.
python evaluate.py --device cuda --input elon.mp4 --model_name jojo --seed 3000
Add images with the same style into the style_images folder. See inside the folder for an example.
python train_custom_style.py --model_name <new_name> --alpha <alpha_value> --preserve_color <True/False> --num_iter <number_of_iterations> --device <cuda/cpu>
- model_name: give your new model a name, maybe based on the style images?
- alpha: the alpha value that determines the strength of the style. 0 = strongest, 1 = weakest. Float value between 0 and 1 (a latent-blend sketch follows the example below).
- preserve_color: whether to preserve the color from the style images. This should be a boolean, True or False.
- num_iter: number of iterations for the training. Usually 300~500 iterations would be fine.
- device: if you don't have an NVIDIA GPU with CUDA, use cpu. Otherwise, cuda (basically the default, so you don't need to declare it).
e.g.
python train_custom_style.py --model_name custom --alpha 0.0 --preserve_color False --num_iter 300 --device cuda
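For intuition on alpha: it acts as a style-strength knob, with 0 keeping the full style and 1 backing off toward the original. Below is a hedged sketch of that kind of blend between two latent codes; it only illustrates the idea, and the repo's actual use of alpha happens during training in train_custom_style.py and may differ in detail.

# Illustrative sketch of blending two latent codes with an alpha knob
# (conceptual only; not the repo's exact mixing formula).
import torch

def blend_latents(w_style: torch.Tensor, w_original: torch.Tensor, alpha: float) -> torch.Tensor:
    """alpha = 0 -> fully stylized code, alpha = 1 -> fully original code."""
    return (1.0 - alpha) * w_style + alpha * w_original

w_style = torch.randn(1, 18, 512)     # hypothetical W+ code from a finetuned model
w_original = torch.randn(1, 18, 512)  # hypothetical W+ code of the untouched face
w_mixed = blend_latents(w_style, w_original, alpha=0.3)
print(w_mixed.shape)  # torch.Size([1, 18, 512])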
To evaluate the model, just follow the previous step and change model_name to the one you just created. It'll look like:
python evaluate.py --device cuda --input iu.jpeg --model_name custom --seed 3000
When your style's face cannot be detected, you can try using force_train.py. This is how I trained the colossal model. Save this image, drag it into Photoshop or Photopea, and match the style image you want with the features of this colossal titan: eyes to eyes, nose to nose, ears to ears, jaws to jaws if possible. The more accurate, the better. Drag it into the style_images_aligned folder and do:
python train_custom_style.py --model_name <insert_name_here> --force_name <insert_style_image_here> --num_iter 300 --device cuda
After getting the trained model, you can evaluate it normally like any other model.
My fork edits end here.
- 2022-02-03: Updated the paper. Improved stylization quality using discriminator perceptual loss. Added sketch model.
- 2021-12-26: Added wandb logging. Fixed finetuning bug which begins finetuning from previously loaded checkpoint instead of the base face model. Added art model.
- 2021-12-25: Added arcane_multi model which is trained on 4 arcane faces instead of 1 (if anyone has more clean data, let me know!). Better preserves features.
- 2021-12-23: Paper is uploaded to arxiv.
- 2021-12-22: Integrated into Huggingface Spaces 🤗 using Gradio. Try it out.
- 2021-12-22: Added pydrive authentication to avoid download limits from gdrive! Fixed running on cpu on colab.
Everything to get started is in the colab notebook.
If you use this code or ideas from our paper, please cite our paper:
@article{chong2021jojogan,
title={JoJoGAN: One Shot Face Stylization},
author={Chong, Min Jin and Forsyth, David},
journal={arXiv preprint arXiv:2112.11641},
year={2021}
}
This code borrows from StyleGAN2 by rosinality and e4e. Some snippets of colab code are from StyleGAN-NADA.