Learning Energy-Based Prior Model with Diffusion-Amortized MCMC

The official code repository for the NeurIPS 2023 paper "Learning Energy-Based Prior Model with Diffusion-Amortized MCMC".

Installation

The implementation depends on the following commonly used packages, all of which can be installed via conda.

Package	Version
PyTorch	1.10.0
pytorch-fid	0.2.1
pytorch-fid-wrapper	0.0.4
numpy	1.21.0

Please refer to this repo if you're having trouble installing pytorch-fid-wrapper.

Datasets and Pre-trained Weights

Pre-trained models are available at: https://drive.google.com/drive/folders/18UT4u4vco5TaEJx3HqksXyKP5l_jovUU?usp=sharing

Training

Image Reconstruction and Generation

# Under the root folder
CUDA_VISIBLE_DEVICES=<GPU_ID> python train_gen_recon.py --dataset <DATASET_ALIAS> --seed <RANDOM_SEED> --log_path <PATH_FOR_TRAINED_WEIGHTS_AND_VIS> --data_path <PATH_TO_DATASETS>

Use the log_path argument to set where the trained weights and visualization results are saved. Available dataset aliases are svhn, cifar10, celeba64, and celebaHQ. data_path indicates the dataset location; L48-107 of train_gen_recon.py provide more details about how to set up the data_path argument. Other available arguments are listed at L352-405 of train_gen_recon.py.

Anomaly Detection

# Under the root folder
CUDA_VISIBLE_DEVICES=<GPU_ID> python train_anomaly_det.py --seed <RANDOM_SEED> --label <HELDOUT_DIGIT> --log_path <PATH_FOR_TRAINED_WEIGHTS_AND_VIS> --data_path <PATH_TO_DATASETS>

The label argument indicates the held-out digit in the MNIST dataset used for anomaly detection. Available options are 1, 4, 5, 7, and 9. data_path indicates the dataset location; L58-62 of train_anomaly_det.py provide more details about how to set up the data_path argument.
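To make the role of the held-out digit concrete, here is a minimal sketch of how such a split could look. It assumes the common protocol in which the held-out digit is treated as the anomaly class and all other digits as normal data. The function name split_heldout and the toy label list are hypothetical, and the actual logic in train_anomaly_det.py may differ.

```python
def split_heldout(labels, heldout_digit):
    """Split sample indices into normal and anomalous sets.

    Assumption (not taken from the repository): samples whose label equals
    the held-out digit are anomalies; everything else is normal data.
    """
    normal_idx = [i for i, y in enumerate(labels) if y != heldout_digit]
    anomaly_idx = [i for i, y in enumerate(labels) if y == heldout_digit]
    return normal_idx, anomaly_idx


if __name__ == "__main__":
    # Toy labels standing in for MNIST targets.
    toy_labels = [0, 1, 1, 4, 7, 9, 1, 3]
    normal_idx, anomaly_idx = split_heldout(toy_labels, heldout_digit=1)
    print("normal indices:", normal_idx)
    print("anomaly indices:", anomaly_idx)
```

Under this reading, --label 1 would mean that the digit 1 is the class to be detected as anomalous.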

Running these training scripts will automatically create the folders for the trained weights and other intermediate results in the log_path.

Evaluation

To evaluate the pre-trained weights, one may use the following scripts:

Image Reconstruction and Generation

# Under the root folder
CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_gen_recon.py --dataset svhn --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --e_l_step_size 0.4 --g_llhd_sigma 0.1

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_gen_recon.py --dataset cifar10 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --e_l_step_size 1.6 --g_llhd_sigma 0.1

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_gen_recon.py --dataset celeba64 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --e_l_step_size 0.4 --g_llhd_sigma 0.1

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_gen_recon.py --dataset celebaHQ --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --e_l_step_size 0.4 --g_llhd_sigma 1.0
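Since the four commands above differ only in the dataset alias and two hyperparameters, a small helper can keep those values in one place. The sketch below only assembles the argument list and does not run anything. The names EVAL_HPARAMS and build_eval_command are hypothetical; the flags and values are copied from the commands above.

```python
import shlex

# Per-dataset evaluation hyperparameters, copied from the commands above.
EVAL_HPARAMS = {
    "svhn": {"e_l_step_size": 0.4, "g_llhd_sigma": 0.1},
    "cifar10": {"e_l_step_size": 1.6, "g_llhd_sigma": 0.1},
    "celeba64": {"e_l_step_size": 0.4, "g_llhd_sigma": 0.1},
    "celebaHQ": {"e_l_step_size": 0.4, "g_llhd_sigma": 1.0},
}


def build_eval_command(dataset, resume_path, data_path):
    """Return the argv list for eval_gen_recon.py on one dataset."""
    hp = EVAL_HPARAMS[dataset]  # raises KeyError for an unknown alias
    return [
        "python", "eval_gen_recon.py",
        "--dataset", dataset,
        "--resume_path", resume_path,
        "--data_path", data_path,
        "--e_l_step_size", str(hp["e_l_step_size"]),
        "--g_llhd_sigma", str(hp["g_llhd_sigma"]),
    ]


if __name__ == "__main__":
    for alias in EVAL_HPARAMS:
        cmd = build_eval_command(alias, "weights.pth", "datasets")
        print(shlex.join(cmd))  # shell-quoted command line (Python 3.8+)
```

The returned list can be passed to subprocess.run from the root folder, with CUDA_VISIBLE_DEVICES set in the environment as in the commands above.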

Anomaly Detection

# Under the root folder
CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_anomaly_det.py --label 1 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --g_llhd_sigma .1

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_anomaly_det.py --label 4 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --g_llhd_sigma 1.

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_anomaly_det.py --label 5 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --g_llhd_sigma 1.

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_anomaly_det.py --label 7 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --g_llhd_sigma 1.

CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_anomaly_det.py --label 9 --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --g_llhd_sigma 1.

StyleGAN Inversion

# Under the root folder
CUDA_VISIBLE_DEVICES=<GPU_ID> python eval_stylegan_inv.py --dataset <DATASET_ALIAS> --resume_path <PATH_TO_TRAINED_WEIGHTS> --data_path <PATH_TO_DATASETS> --pretrained_G_path <TO_SPECIFY> --pretrained_E_path <TO_SPECIFY> --pretrained_F_path <TO_SPECIFY>

For StyleGAN inversion, pretrained_G_path is the path to the pre-trained generator weights, and pretrained_E_path is the path to the encoder weights. pretrained_F_path specifies the path to the VGG model used for the perceptual loss. Available dataset aliases are ffhq and lsun_tower.

Toy Example

Run the Code

To train the model on the toy example, run the following command in the toy_example folder:

CUDA_VISIBLE_DEVICES=<DEVICE_ID> python toy_example.py --seed <RANDOM_SEED_TO_SPECIFY>

Here, the --seed argument specifies the random seed, which determines the ground-truth posterior distribution. The script automatically creates a logs/toy/<TIMESTAMP> folder inside the toy_example folder, where <TIMESTAMP> is the time at which the training process started.

Important Tips about Training

For most random seeds, we observed that the learned sampler achieves a decent approximation of the ground-truth posterior distribution, obtained by long-run Langevin dynamics, within 300-3000 training iterations. This takes from several minutes to about an hour on an NVIDIA RTX A6000 GPU, and the training process uses ~2GB of GPU memory. In some extreme cases, more training iterations may be needed to produce decent results.

For some random seeds, the default 1000-step Langevin dynamics used to sample the ground-truth posterior distribution might not converge. One possible sign of this is that g_loss (avg) Q (the reconstruction error obtained with learned posterior samples) is significantly lower than g_loss (avg) L (the reconstruction error obtained with Langevin dynamics samples). In that case, one may consider using 2000 or more steps by modifying the g_l_steps argument of the sample_langevin_post_z function at L277 in toy_example/toy_example.py.
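To illustrate why the number of Langevin steps matters, here is a minimal NumPy sketch of unadjusted Langevin dynamics on a 1-D Gaussian target. It is not the repository's sample_langevin_post_z. The function names, the toy target, and the step size of 0.1 are assumptions chosen for illustration. Chains start far from the target mean, so short runs are visibly biased toward the starting point.

```python
import numpy as np


def langevin_sample(grad_energy, z0, n_steps, step_size, rng):
    """Unadjusted Langevin dynamics.

    Update rule: z <- z - (s^2 / 2) * grad U(z) + s * noise,
    where s is the step size and noise is standard normal.
    """
    z = np.array(z0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(z.shape)
        z = z - 0.5 * step_size ** 2 * grad_energy(z) + step_size * noise
    return z


# Toy target (assumption): Gaussian with mean 3 and unit variance,
# i.e. energy U(z) = (z - 3)^2 / 2 and gradient z - 3.
TARGET_MEAN = 3.0


def grad_energy(z):
    return z - TARGET_MEAN


def chain_mean(n_steps, n_chains=4000, step_size=0.1, seed=0):
    """Average final state over many chains that all start at z = 0."""
    rng = np.random.default_rng(seed)
    z0 = np.zeros(n_chains)
    samples = langevin_sample(grad_energy, z0, n_steps, step_size, rng)
    return float(samples.mean())


if __name__ == "__main__":
    for n_steps in (100, 1000, 2000):
        print(n_steps, "steps -> sample mean", round(chain_mean(n_steps), 3))
```

In this toy setting, the sample mean after 100 steps is still well below the target mean of 3, while 1000 or more steps bring it close. The real posterior in the toy example is more complex, which is why some seeds may need more than the default number of steps.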

Citation

@article{yu2023learning,
  title={Learning Energy-Based Prior Model with Diffusion-Amortized MCMC},
  author={Yu, Peiyu and Zhu, Yaxuan and Xie, Sirui and Ma, Xiaojian and Gao, Ruiqi and Zhu, Song-Chun and Wu, Ying Nian},
  journal={arXiv preprint arXiv:2310.03218},
  year={2023}
}
