Add Descript-Audio-Codec model #31494

Merged on Aug 19, 2024 (49 commits)

Conversation

@kamilakesbi kamilakesbi commented Jun 19, 2024

What does this PR do?

This PR adds the Descript-Audio-Codec (DAC) model, a high-fidelity, general-purpose neural audio codec, to the Transformers library.

This model is composed of 3 components:

  • An Encoder model.
  • A ResidualVectorQuantizer model, which is used with the encoder to obtain the audio quantized latent codes.
  • A Decoder model, used to reconstruct the audio after compression.
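
To make the role of the quantizer concrete, here is a minimal pure-Python sketch of residual vector quantization on a single latent vector. The codebook values and the function names (`nearest`, `rvq_encode`, `rvq_decode`) are assumptions made up for illustration; this is not the code in modeling_dac.py, and the encoder and decoder networks are left out.

```python
# Toy residual vector quantization (RVQ) over one latent vector.
# Codebook values and function names are illustrative assumptions.

def nearest(codebook, vector):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def dist(entry):
        return sum((e - v) ** 2 for e, v in zip(entry, vector))
    return min(range(len(codebook)), key=lambda i: dist(codebook[i]))

def rvq_encode(codebooks, latent):
    """Quantize stage by stage; each stage encodes the previous stage's residual."""
    residual = list(latent)
    codes = []
    for codebook in codebooks:
        idx = nearest(codebook, residual)
        codes.append(idx)
        residual = [r - c for r, c in zip(residual, codebook[idx])]
    return codes

def rvq_decode(codebooks, codes):
    """Sum the selected entry of every codebook to rebuild the quantized latent."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, idx in zip(codebooks, codes):
        out = [o + c for o, c in zip(out, codebook[idx])]
    return out

codebooks = [
    [[0.0, 0.0], [1.0, 1.0], [-1.0, 1.0]],    # coarse stage
    [[0.0, 0.0], [0.25, 0.0], [0.0, -0.25]],  # fine stage refines the residual
]
latent = [1.3, 0.9]                        # stand-in for one encoder output frame
codes = rvq_encode(codebooks, latent)      # one integer per codebook
quantized = rvq_decode(codebooks, codes)   # what a decoder would consume
```

The list of integer codes is the compressed representation; the decoder only ever sees the summed codebook entries.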

This is still a draft PR. Here's what I've done for now:

  1. Adapted the model to Transformers format in modeling_dac.py.
  2. Added the checkpoint conversion scripts and pushed the three converted checkpoints (16, 24, and 44 kHz) to the Hub.
  3. Made sure the forward pass gives the same output as the original model.
  4. Added a Feature Extractor (very similar to the Encodec FeatureExtractor).
  5. Started iterating on tests.

Who can review?

cc @sanchit-gandhi and @ArthurZucker
cc @ylacombe for visibility

@kamilakesbi kamilakesbi changed the title Add dac [WIP] - Add Descript-Audio-Codec model Jun 19, 2024

kamilakesbi commented Jun 26, 2024

They do indeed weight the different losses during training (see the original codebase). I'll add the weight attributes to the config file.

Note that in the current code, we only return the commitment_loss and codebook_loss, but there are other losses used to train the model (mel_loss and gan losses).
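
As a rough sketch of what weight attributes could look like, the snippet below combines the two quantizer losses with configurable weights. The class name `LossWeights`, the function `quantizer_loss`, and the weight values are hypothetical placeholders, not values taken from the original training recipe; the mel and GAN losses are deliberately left out.

```python
from dataclasses import dataclass

@dataclass
class LossWeights:
    # Placeholder values chosen for illustration only.
    commitment: float = 0.25
    codebook: float = 1.0

def quantizer_loss(commitment_loss, codebook_loss, weights):
    """Weighted sum of the two losses the model currently returns."""
    return weights.commitment * commitment_loss + weights.codebook * codebook_loss

# Example with made-up loss values.
total = quantizer_loss(commitment_loss=0.4, codebook_loss=0.2, weights=LossWeights())
```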

kamilakesbi commented Jun 27, 2024

I addressed all of Sanchit's review comments and added integration tests. @ylacombe this should be ready for review!

  • Note that there's still one failing test, which reports:

1 failed because AssertionError -> <class 'transformers.models.dac.modeling_dac.DacModel'> is too big for the common tests (74175906)! It should have 1M max.

Should we override this common test in the DAC test file?
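
For reference, overriding an inherited test in a model-specific test class could look like the sketch below. It uses only the standard `unittest` module; `CommonModelTestsMixin`, `param_count`, and `test_model_is_small` are hypothetical stand-ins, not the actual names in the Transformers test suite. Shrinking the test config so the model fits under the limit would be an alternative to skipping.

```python
import io
import unittest

class CommonModelTestsMixin:
    """Stand-in for a shared test mixin; all names here are hypothetical."""
    max_params = 1_000_000

    def param_count(self):
        raise NotImplementedError

    def test_model_is_small(self):
        self.assertLessEqual(self.param_count(), self.max_params)

class DacLikeModelTest(CommonModelTestsMixin, unittest.TestCase):
    def param_count(self):
        return 74_175_906  # parameter count reported in the failing assertion

    # Redefining the method in the subclass replaces the inherited check.
    @unittest.skip("Hypothetical override: model exceeds the common size limit")
    def test_model_is_small(self):
        pass

suite = unittest.defaultTestLoader.loadTestsFromTestCase(DacLikeModelTest)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```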

cc @sanchit-gandhi

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ylacombe ylacombe left a comment

Hey @kamilakesbi,
Thanks for this great PR!

I've left a few comments, the main ones being:

  1. We should definitely have a method to decode from audio codebooks. We could perhaps make the decode method accept either audio codebooks or the quantized representation, WDYT?
  2. I'm not sure the losses should be computed by default, especially since these losses alone are not enough to train the model. If I remember correctly, a few more losses are needed.
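
One way to read the first suggestion is a single decode entry point that accepts exactly one of the two inputs. The sketch below is a toy in pure Python: `decode`, `toy_decoder`, the argument names, and the codebook values are assumptions for illustration, not the signature that ended up in the library.

```python
def toy_decoder(latent):
    """Placeholder for the real decoder network: just scales the latent."""
    return [2.0 * x for x in latent]

def decode(quantized=None, codes=None, codebooks=None):
    """Decode from a quantized latent or from codebook indices (exactly one)."""
    if (quantized is None) == (codes is None):
        raise ValueError("Pass exactly one of `quantized` or `codes`.")
    if codes is not None:
        if codebooks is None:
            raise ValueError("`codebooks` is required when decoding from `codes`.")
        # Rebuild the quantized latent by summing one entry per codebook.
        quantized = [0.0] * len(codebooks[0][0])
        for codebook, idx in zip(codebooks, codes):
            quantized = [q + c for q, c in zip(quantized, codebook[idx])]
    return toy_decoder(quantized)

codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, 0.0]],
]
from_codes = decode(codes=[1, 1], codebooks=codebooks)
from_latent = decode(quantized=[1.25, 1.0])
```

Both calls go through the same decoder, so callers that only stored the integer codes can still reconstruct audio.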

Let me know if you have any further questions, but again, congrats on the PR, it's looking really great!

@kamilakesbi

Thank you for your review @ylacombe!

I have taken your feedback into account and updated the code. I've also added the ability to decode from audio codebooks.

Regarding the loss, I agree that we should probably not return the encoder loss by default. Normally the loss is returned when the labels argument is passed to the model's forward pass, but I'm not sure a labels argument makes sense here, since the loss is computed in an unsupervised manner. Should I add a return_loss argument instead?
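
A minimal sketch of the return_loss idea, assuming a toy scalar quantizer: the loss is computed only when the caller asks for it. `EncodeOutput`, `encode`, and the mean-squared-error loss are hypothetical names and choices for illustration, not the model's actual interface.

```python
from typing import List, NamedTuple, Optional

class EncodeOutput(NamedTuple):
    codes: List[int]
    loss: Optional[float]

def encode(latent, codebook, return_loss=False):
    """Quantize each value to its nearest codebook scalar; loss is opt-in."""
    codes = [
        min(range(len(codebook)), key=lambda i: (codebook[i] - x) ** 2)
        for x in latent
    ]
    loss = None
    if return_loss:
        # Mean squared quantization error, computed without any labels.
        loss = sum((codebook[c] - x) ** 2 for c, x in zip(codes, latent)) / len(latent)
    return EncodeOutput(codes=codes, loss=loss)

default_out = encode([0.9, 0.1], codebook=[0.0, 1.0])
train_out = encode([0.9, 0.1], codebook=[0.0, 1.0], return_loss=True)
```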

Otherwise, I think this is ready for a final review @amyeroberts :) The failing tests are unrelated to this PR, I think.

@sanchit-gandhi sanchit-gandhi left a comment

Looks in great shape - just some minor style nits from me!

@kamilakesbi kamilakesbi changed the title [WIP] - Add Descript-Audio-Codec model Add Descript-Audio-Codec model Jul 4, 2024

@ylacombe ylacombe left a comment

Thanks for iterating @kamilakesbi, LGTM!

Gentle ping to @amyeroberts and @ArthurZucker for a review!

@amyeroberts amyeroberts left a comment

Thanks for adding this model!

There are a few things to address here and there, mainly the weight_norm logic, but overall this looks really good and clean 🤗

@kamilakesbi

Thanks for the reviews @amyeroberts and @ylacombe!

We should be close to merging this model!

The last change is to transfer the weights from my personal Hugging Face page to the Descript organisation. I'm waiting for the members of the organisation to add me.

@kamilakesbi

@amyeroberts the checkpoints have been transferred to the Descript organisation.

We can merge this PR if everything is OK with you :)

@kamilakesbi

Gentle ping @amyeroberts

@amyeroberts

@kamilakesbi There are still failing tests on the CI. These should be resolved before final review and merge. You may need to rebase on main to include upstream changes, or trigger a re-run of the CI if the issues are related to the environment or other libraries.

kamilakesbi commented Jul 15, 2024

@amyeroberts I've rebased, but there are still failing tests which I think are unrelated to this PR. The failing tests report the following error:

1 failed because huggingface_hub.utils._errors.RepositoryNotFoundError: 404 Client Error. (Request ID -> Root=1-6694ef78-6ec395936d885db72ebacd65;2c9bbc0d-6ecd-49c4-b28e-df134be7bd4a)

@kamilakesbi

@amyeroberts after rebasing all tests pass on the CI :)

If I get your approval I can merge!

@amyeroberts amyeroberts left a comment

Thanks for adding and iterating!

Just part of the feature extractor and docs to update

@kamilakesbi kamilakesbi merged commit 8260cb3 into huggingface:main Aug 19, 2024
23 of 25 checks passed
@amyeroberts

@kamilakesbi Why was this merged when there were failing slow tests?

kamilakesbi added a commit that referenced this pull request Aug 19, 2024
dataKim1201 pushed a commit to dataKim1201/transformers that referenced this pull request Oct 7, 2024
* dac model

* original dac works

* add dac model

* dac can be instatiated

* add forward pass

* load weights

* all weights are used

* convert checkpoint script ready

* test

* add feature extractor

* up

* make style

* apply cookicutter

* fix tests

* iterate on FeatureExtractor

* nit

* update dac doc

* replace nn.Sequential with nn.ModuleList

* nit

* apply review suggestions 1/2

* Update src/transformers/models/dac/modeling_dac.py

Co-authored-by: Sanchit Gandhi

* up

* apply review suggestions 2/2

* update padding in FeatureExtractor

* apply review suggestions

* iterate on design and tests

* add integration tests

* feature extractor tests

* make style

* all tests pass

* make style

* fixup

* apply review suggestions

* fix-copies

* apply review suggestions

* apply review suggestions

* Update docs/source/en/model_doc/dac.md

Co-authored-by: Yoach Lacombe

* Update docs/source/en/model_doc/dac.md

Co-authored-by: Yoach Lacombe

* anticipate transfer weights to descript

* up

* make style

* apply review suggestions

* update slow test values

* update slow tests

* update test values

* update with CI values

* update with vorace values

* update test with slice

* make style

---------

Co-authored-by: Sanchit Gandhi
Co-authored-by: Yoach Lacombe