added vq_vae_accelerate notebook and trained vq_vae model model4cells #101

ttunja · 2023-03-12T17:43:09Z

This pull request merges vq_vae into the accelerate notebook so that we can try running diffusion in a latent space on multiple GPUs. Additionally, some functions were updated, and the vq_vae model was added to the data section of the dnadiffusion directory.

review-notebook-app · 2023-03-12T17:43:14Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

cameronraysmith

Thanks @ttunja !

Do we need to keep https://github.com/pinellolab/DNA-Diffusion/blob/a407f04ac8e7fed8efe46ad571f0107e7267b886/dnadiffusion/data/model4cells_train_split_3_50_dims.pkl in the git repo?
Can it be regenerated from code that is currently in the repository?
Does regenerating it take a very long time?

The dnadiffusion folder will be moving to src and contains the code of a Python package. We do not plan to store binary files such as model checkpoints there in the long term. We'll be setting up a system to keep data and model artifacts in S3 soon.

LucasSilvaFerreira · 2023-03-14T03:03:20Z

@ssenan What is the best way to adapt this code to our new codebase (instead of using a notebook)?

ssenan · 2023-03-14T03:22:43Z

@LucasSilvaFerreira I have a couple of PRs coming this week that update the whole codebase to use PyTorch Lightning / hydra-zen. From there we can create a couple of new scripts/configs that capture VectorQuantizer, VectorQuantizerEMA, and the encoder/decoder model.

I think that this will make it easier to compare how this is performing relative to the current model, and also makes it easier to update the architecture if we need to. This is probably good to merge in after the changes @cameronraysmith suggested are implemented, as it's probably one of the last notebooks floating around.
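
For readers unfamiliar with the two quantizer names mentioned above, here is a minimal pure-Python sketch of the general idea, not the code in this PR. It assumes a plain nearest-neighbour codebook lookup and a deliberately simplified EMA-style update (each codebook entry moves toward the mean of the vectors assigned to it). The function names `quantize` and `ema_update` and the `decay` parameter are hypothetical and chosen only for illustration. The repository's VectorQuantizer and VectorQuantizerEMA may differ.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    vectors: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Map each vector to the index and value of its nearest codebook entry."""
    indices: List[int] = []
    quantized: List[List[float]] = []
    for v in vectors:
        # Pick the codebook entry with the smallest squared distance to v.
        idx = min(range(len(codebook)), key=lambda k: squared_distance(v, codebook[k]))
        indices.append(idx)
        quantized.append(list(codebook[idx]))
    return indices, quantized


def ema_update(
    codebook: Sequence[Sequence[float]],
    vectors: Sequence[Sequence[float]],
    indices: Sequence[int],
    decay: float = 0.99,
) -> List[List[float]]:
    """Simplified EMA-style update (an assumption, not a library's exact rule).

    Each entry becomes decay * old + (1 - decay) * mean(assigned vectors).
    Entries with no assigned vectors are returned unchanged.
    """
    updated = [list(entry) for entry in codebook]
    for k, entry in enumerate(codebook):
        assigned = [v for v, i in zip(vectors, indices) if i == k]
        if not assigned:
            continue
        dim = len(entry)
        mean = [sum(v[d] for v in assigned) / len(assigned) for d in range(dim)]
        updated[k] = [decay * entry[d] + (1 - decay) * mean[d] for d in range(dim)]
    return updated
```

In this sketch, `quantize` yields the discrete indices that a latent-space model would operate on, and `ema_update` stands in for the codebook update that an EMA variant performs in place of a gradient step.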

cameronraysmith · 2023-03-21T22:33:39Z

@ttunja we can go ahead and merge this since the model is ultimately a small file.
We should plan to address this in #106. Then we can remove it from the repository.

cameronraysmith

Will follow up in #106.

cameronraysmith · 2023-03-21T22:34:45Z

dnadiffusion/data/model4cells_train_split_3_50_dims.pkl

ttunja · 2023-03-23T17:25:31Z

@cameronraysmith It took around 2-3 h to train vq_vae (if I remember correctly, since it was done by @noahweber1). We will follow up in #106.

@ssenan Are the upcoming changes all notebooks, or Python scripts? When can we start writing scripts rather than notebooks (after the codebase update)?
@LucasSilvaFerreira Why can't we compare everything in this notebook? That was the reason I implemented it in your accelerate notebook; otherwise I would have used my own code.

noahweber1 · 2023-03-23T17:59:45Z

@cameronraysmith I already merged a notebook that trains a VQ_VAE for this, making it a full stable diffusion:

https://github.com/pinellolab/DNA-Diffusion/blob/main/notebooks/experiments/conditional_diffusion/VQ_VAE_LATENT_SPACE_WITH_METRICS.ipynb

@ttunja

cameronraysmith · 2023-03-23T21:17:46Z

Many thanks @ttunja @noahweber1

cameronraysmith added the "model" label (modifies model code in the main package) Mar 12, 2023

cameronraysmith added this to the 0.0.1 milestone Mar 12, 2023

cameronraysmith linked an issue Mar 12, 2023 that may be closed by this pull request

Training a VQ-VAE for DNA-sequences for stable diffusion #16

Closed

cameronraysmith requested changes Mar 12, 2023

cameronraysmith requested a review from LucasSilvaFerreira March 12, 2023 22:36

cameronraysmith force-pushed the latent-representation branch from a407f04 to 71b39ec Compare March 12, 2023 22:38

ssenan self-requested a review March 13, 2023 01:13

added vq_vae_accelerate notebook and trained vq_vae model model4cells…

07ac4e0

…_train...

cameronraysmith force-pushed the latent-representation branch from 71b39ec to 07ac4e0 Compare March 21, 2023 22:21

cameronraysmith mentioned this pull request Mar 21, 2023

document the procedure to regenerate model4cells_train_split_3_50_dims.pkl #106

Closed

cameronraysmith self-requested a review March 21, 2023 22:34

cameronraysmith approved these changes Mar 21, 2023

dnadiffusion/data/model4cells_train_split_3_50_dims.pkl

cameronraysmith Mar 21, 2023

see #106

cameronraysmith assigned cameronraysmith and ttunja Mar 21, 2023

cameronraysmith merged commit 2a5e561 into main Mar 21, 2023

cameronraysmith deleted the latent-representation branch March 21, 2023 22:35
