Modify efficient GPU training doc with now-available adamw_bnb_8bit optimizer #25807
Conversation
cc @younesbelkada for BNB related stuff 🙏
This looks good, thanks, I left one question!
cc @SunMarc as well
Thanks, I left some suggestions to make it more concise!
Thanks @stevhliu for the suggestions!
Thanks a lot!
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.
Thanks for adding!
Modify efficient GPU training doc with now-available adamw_bnb_8bit optimizer (huggingface#25807)

* Modify single-GPU efficient training doc with now-available adamw_bnb_8bit optimizer
* Apply suggestions from code review

Co-authored-by: Steven Liu <[email protected]>
What does this PR do?
The documentation for efficient single-GPU training previously stated that the `adamw_bnb_8bit` optimizer could only be integrated via a third-party implementation. However, it is now available in `Trainer` directly, as a result of this issue and the corresponding PR.

I think it's valuable to keep the 8-bit Adam entry in the documentation, as it's a significant improvement over Adafactor. I also think it's valuable to keep the sample integration of a third-party optimizer implementation for reference purposes. I have adjusted the documentation accordingly.
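For context, switching to the natively supported optimizer is just a flag on `TrainingArguments`. Below is a minimal sketch (not copied from the doc), assuming `bitsandbytes` is installed and the other argument values are placeholders:

```python
from transformers import TrainingArguments

# Selecting the 8-bit AdamW optimizer is now a one-line change; the
# "adamw_bnb_8bit" value requires the bitsandbytes package to be installed.
training_args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=4,
    optim="adamw_bnb_8bit",
)
# training_args is then passed to Trainer as usual.
```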
I was able to validate myself that both approaches, using `Trainer` directly with the `optim` flag and doing the third-party integration, still appear to work when fine-tuning small LLMs on a single GPU.
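For reference, the third-party route I tested roughly follows the pattern kept in the doc: instantiate a `bitsandbytes` 8-bit optimizer yourself and pass it to `Trainer` via the `optimizers` argument. This is only a sketch, assuming `model`, `train_dataset`, and `training_args` are already defined elsewhere:

```python
import bitsandbytes as bnb
from torch import nn
from transformers import Trainer
from transformers.trainer_pt_utils import get_parameter_names

# Apply weight decay to everything except bias and LayerNorm parameters,
# mirroring Trainer's default parameter grouping.
decay_parameters = [
    name
    for name in get_parameter_names(model, [nn.LayerNorm])  # model assumed to exist
    if "bias" not in name
]
optimizer_grouped_parameters = [
    {
        "params": [p for n, p in model.named_parameters() if n in decay_parameters],
        "weight_decay": training_args.weight_decay,
    },
    {
        "params": [p for n, p in model.named_parameters() if n not in decay_parameters],
        "weight_decay": 0.0,
    },
]

adam_bnb_optim = bnb.optim.Adam8bit(
    optimizer_grouped_parameters,
    betas=(training_args.adam_beta1, training_args.adam_beta2),
    eps=training_args.adam_epsilon,
    lr=training_args.learning_rate,
)

# optimizers takes (optimizer, lr_scheduler); None lets Trainer build the scheduler.
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    optimizers=(adam_bnb_optim, None),
)
trainer.train()
```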
Before submitting

- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@stevhliu and @MKhalusova