Modify efficient GPU training doc with now-available adamw_bnb_8bit optimizer #25807

Merged

@veezbo (Contributor) commented Aug 29, 2023

What does this PR do?

The documentation for efficient single-GPU training previously stated that the adamw_bnb_8bit optimizer could only be used through a third-party implementation. However, the optimizer is now available directly in Trainer as a result of this issue and the corresponding PR.

I think it's valuable to keep the 8-bit Adam entry in the documentation, as it's a significant improvement over Adafactor. I also think it's valuable to keep the sample integration with a third-party optimizer implementation for reference. I have adjusted the documentation accordingly.

I validated that both approaches (using Trainer directly with the optim flag, and the third-party integration) still appear to work when fine-tuning small LLMs on a single GPU.
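For readers who want to see the two approaches side by side, here is a minimal sketch. It is not the exact snippet from the documentation: the helper names, the output directory, the batch size, the learning rate, and the name-based weight-decay heuristic are all illustrative assumptions. The library imports are done lazily inside the functions, so the helpers can be inspected without `transformers` or `bitsandbytes` installed.

```python
# Sketch only: helper names and argument values below are illustrative assumptions.


def trainer_optim_kwargs(use_8bit_adam=True):
    """Keyword arguments intended for transformers.TrainingArguments."""
    return {
        "output_dir": "tmp_trainer",  # hypothetical output directory
        "per_device_train_batch_size": 4,  # hypothetical batch size
        # Approach 1: select the 8-bit AdamW optimizer by name via the optim flag.
        "optim": "adamw_bnb_8bit" if use_8bit_adam else "adamw_torch",
    }


def build_training_args(use_8bit_adam=True):
    """Approach 1: let Trainer create the optimizer from the optim flag."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(**trainer_optim_kwargs(use_8bit_adam))


def uses_weight_decay(param_name):
    """Assumed heuristic: no weight decay for biases and normalization weights."""
    lowered = param_name.lower()
    return "bias" not in lowered and "norm" not in lowered


def build_bnb_optimizer(model, lr=2e-5, weight_decay=0.01):
    """Approach 2: build a third-party 8-bit Adam optimizer by hand."""
    import bitsandbytes as bnb  # lazy import; needs a supported GPU setup

    decay, no_decay = [], []
    for name, param in model.named_parameters():
        if not param.requires_grad:
            continue
        (decay if uses_weight_decay(name) else no_decay).append(param)

    groups = [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]
    return bnb.optim.Adam8bit(groups, lr=lr, betas=(0.9, 0.999), eps=1e-8)
```

With the first approach, the result of `build_training_args()` is passed to `Trainer` as `args`. With the second, the optimizer returned by `build_bnb_optimizer(model)` is passed as `Trainer(..., optimizers=(optimizer, None))`, which leaves the learning-rate scheduler to Trainer's default.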

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@stevhliu and @MKhalusova

@ydshieh (Collaborator) commented Aug 29, 2023

cc @younesbelkada for BNB related stuff 🙏

@younesbelkada (Contributor) left a comment

This looks good, thanks, I left one question!
cc @SunMarc as well

docs/source/en/perf_train_gpu_one.md (resolved review thread)
@stevhliu (Member) left a comment

Thanks, I left some suggestions to make it more concise!

docs/source/en/perf_train_gpu_one.md (4 outdated, resolved review threads)
@veezbo (Contributor, Author) commented Aug 30, 2023

Thanks @stevhliu for the suggestions!

@younesbelkada (Contributor) left a comment

Thanks a lot!

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@amyeroberts (Collaborator) left a comment

Thanks for adding!

@amyeroberts amyeroberts merged commit 99fc3ac into huggingface:main Aug 31, 2023
@veezbo veezbo deleted the vibhorkumar.8bitadam_documentation_update branch September 4, 2023 18:17
parambharat pushed a commit to parambharat/transformers that referenced this pull request Sep 26, 2023
…ptimizer (huggingface#25807)

* Modify single-GPU efficient training doc with now-available adamw_bnb_8bit optimizer

* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>

---------

Co-authored-by: Steven Liu <[email protected]>
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
…ptimizer (huggingface#25807)

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 18, 2023
…ptimizer (huggingface#25807)

6 participants