This repository contains an implementation of the methods described in *Differentially Private Latent Diffusion Models*. The code is based on a public implementation of Latent Diffusion Models, available here (commit a506df5).
This project uses Conda as its package management tool, which can be downloaded here. Once installed, clone the repository. The remainder of this document assumes the project is stored in a directory called `DP-LDM`.
Important: We strongly recommend using the Mamba solver for Conda as it dramatically speeds up environment creation.
```
cd DP-LDM/
conda env create -f environment.yaml
conda activate ldm
```
Once you have chosen a public/private dataset pair, there are three steps to training your own differentially private latent diffusion model. In each step, you will need to create a configuration file that specifies the hyperparameters of each model. Example config files can be found in `DP-LDM/configs/`.
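These configs follow the Latent Diffusion `target`/`params` layout. The fragment below is an illustrative sketch only (the field values are hypothetical); consult the actual files in `DP-LDM/configs/` for the real schema:

```yaml
# Illustrative only -- see DP-LDM/configs/ for real examples
model:
  base_learning_rate: 4.5e-6
  target: ldm.models.autoencoder.AutoencoderKL   # class to instantiate
  params:
    embed_dim: 4                                 # latent channels
data:
  target: main.DataModuleFromConfig
  params:
    batch_size: 16
```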
Step 1: Autoencoder Pretraining
```
CUDA_VISIBLE_DEVICES=0 python main.py --base <path to autoencoder yaml> -t --gpus 0,
```
Step 2: LDM Pretraining
```
CUDA_VISIBLE_DEVICES=0 python main.py --base <path to dm yaml> -t --gpus 0,
```
Step 3: Private Fine-tuning
Important: Due to implementation constraints, this step can only be run on a single GPU, specified by the `--accelerator gpu` command line argument.
```
CUDA_VISIBLE_DEVICES=0 python main.py \
    --base <path to fine-tune yaml> \
    -t \
    --gpus 0, \
    --accelerator gpu
```
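Conceptually, the private fine-tuning step uses DP-SGD: each example's gradient is clipped to a fixed L2 norm, the clipped gradients are averaged, and Gaussian noise is added before the update. A minimal numpy sketch of one such update (illustrative only; `dp_sgd_step` and its signature are not from this repo):

```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm, noise_mult, lr, rng):
    # Clip each per-example gradient to L2 norm <= clip_norm
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    # Average, then add Gaussian noise calibrated to the clipping norm
    noisy_grad = np.mean(clipped, axis=0) + rng.normal(
        0.0, noise_mult * clip_norm / len(clipped), size=params.shape)
    return params - lr * noisy_grad
```

The clipping bound limits any single example's influence on the update, which is what makes the added noise yield a differential privacy guarantee.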
To sample from class-conditional models (e.g. MNIST, FMNIST, CIFAR10):
```
python sampling/cond_sampling_test.py \
    -y path/to/config.yaml \
    -ckpt path/to/checkpoint.ckpt \
    -c 0 1 2 3 4 5 6 7 8 9
```
To sample from unconditional models (e.g. CelebA):
```
python sampling/unonditional_sampling.py \
    --yaml path/to/config.yaml \
    --ckpt path/to/checkpoint.ckpt
```
We evaluated our models using two metrics. Code for both is available in the repository. For both methods, first follow the section above to generate sufficiently many samples from your model.
For MNIST, to compute the accuracy, the command is:
```
python scripts/dpdm_downstreaming_classifier_mnist.py \
    --train path/to/generated_train_images.pt \
    --test path/to/real_test_images.pt
```
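The downstream metric is ordinary classification accuracy: a classifier is trained on the generated images and evaluated on held-out real test images. The metric itself reduces to the following (hypothetical helper name, stdlib only):

```python
def accuracy(predictions, labels):
    # Fraction of predicted labels that match the ground truth
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```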
We also provide a script that combines sampling and accuracy computation:
```
python scripts/mnist_sampling_and_acc.py \
    --yaml path/to/config.yaml \
    --ckpt path/to/checkpoint.ckpt
```
To sample from text-conditioned models:

```
python txt2img.py \
    --yaml path/to/config.yaml \
    --ckpt path/to/checkpoint.ckpt \
    --n_samples 30000 \
    --outname txt2img_samples.pt
```
First, compute Inception network statistics for the real dataset:
```
python fid/compute_dataset_stats.py \
    --dataset ldm.data.celeba.CelebATrain \
    --args size:32 \
    --output celeba_train_stats.npz
```
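The stats file summarizes the dataset as the mean and covariance of its Inception features. Conceptually, that reduces to the following (the real script first extracts Inception-v3 activations, which this numpy sketch skips):

```python
import numpy as np

def feature_stats(features):
    """Gaussian statistics of an (N, D) feature matrix, as used for FID."""
    mu = features.mean(axis=0)
    sigma = np.cov(features, rowvar=False)  # (D, D) sample covariance
    return mu, sigma
```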
Next, compute the statistics for the generated samples:
```
python fid/compute_samples_stats.py \
    --samples celeba32_samples.pt \
    --output celeba_samples_stats.npz
```
Finally, compute FID:
```
python fid/compute_fid.py \
    --path1 celeba_train_stats.npz \
    --path2 celeba_samples_stats.npz
```
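For reference, FID is the Fréchet distance between the two Gaussians summarized by the stats files: ||μ₁ − μ₂||² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^½). A numpy-only sketch of that formula (illustrative; the repo's script computes the matrix square root more robustly):

```python
import numpy as np

def frechet_distance(mu1, sigma1, mu2, sigma2):
    # Squared distance between the feature means
    diff = mu1 - mu2
    # Matrix square root of sigma1 @ sigma2 via eigendecomposition
    # (valid when the product is diagonalizable with nonnegative eigenvalues)
    eigvals, eigvecs = np.linalg.eig(sigma1 @ sigma2)
    sqrt_prod = (eigvecs * np.sqrt(np.maximum(eigvals.real, 0.0))) @ np.linalg.inv(eigvecs)
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * sqrt_prod).real)
```

Identical statistics give a distance of zero; larger values indicate the generated feature distribution drifts further from the real one.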
We built our code on top of the Latent Diffusion repository. Thanks to the authors for open-sourcing their code! We also borrow techniques from Transferring Pretrained Diffusion Probabilistic Models, and would like to thank the authors for privately sending us their code before making it public.
- Moved the implementation of the `DDPM` class to a new file `ddpm_base.py`
- Moved callbacks from `main.py` to `callbacks/*.py`
- Added `glob.escape` to log folder parsing to support special characters
- Changed name of checkpoint created on exception from `last.ckpt` to `on_exception.ckpt`
- Changed name of checkpoint created on signal from `last.ckpt` to `on_signal.ckpt`