
Request for AdamW8bit support on CPU (would help TorchTune) #1226

Open
sanchitintel opened this issue May 28, 2024 · 5 comments
Comments

sanchitintel commented May 28, 2024

Feature request

Port AdamW8bit support for CPU from multi-backend-refactor branch to the main branch

Motivation

Public cloud providers' machines with GPUs are usually expensive, while datacenter-grade CPUs are more readily available at lower prices. Toward the goal of making deep learning more accessible to developers and learners, the ability to fine-tune with AdamW8bit on CPU seems like a good milestone. TorchTune currently cannot support full fine-tuning on CPU with AdamW8bit because it uses bitsandbytes' AdamW8bit optimizer.
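For context, AdamW8bit saves memory by storing the Adam moment state in 8 bits, rescaling per block of values. The sketch below illustrates that blockwise absmax quantization idea in plain Python; it is a simplified illustration, not bitsandbytes' actual implementation (which uses a nonlinear dynamic quantization map and larger block sizes):

```python
def quantize_blockwise(values, block_size=4):
    """Quantize floats to signed 8-bit codes per block,
    keeping one absmax scale per block (simplified sketch)."""
    codes, scales = [], []
    for i in range(0, len(values), block_size):
        block = values[i:i + block_size]
        absmax = max(abs(v) for v in block) or 1.0  # avoid div-by-zero
        scales.append(absmax)
        codes.extend(round(v / absmax * 127) for v in block)
    return codes, scales

def dequantize_blockwise(codes, scales, block_size=4):
    """Recover approximate floats from codes and per-block scales."""
    return [c / 127 * scales[i // block_size] for i, c in enumerate(codes)]

# Example optimizer-state slice (hypothetical values)
state = [0.02, -0.5, 0.13, 0.004, 1.5, -0.75, 0.0, 0.3]
codes, scales = quantize_blockwise(state)
restored = dequantize_blockwise(codes, scales)
max_err = max(abs(a - b) for a, b in zip(state, restored))
```

Each 32-bit float becomes a 1-byte code plus a shared per-block scale, roughly a 4x reduction in optimizer state memory at a small precision cost.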

#898 enabled AdamW8bit for CPU in multi-backend-refactor branch, but the main branch doesn't have it.

It'd be great if we could enable AdamW8bit for CPU in the bitsandbytes main branch before TorchTune's next release (assuming there is a bitsandbytes release before then), so that users installing TorchTune would automatically get a bitsandbytes version that supports AdamW8bit on CPU.

Thanks!

Your contribution

@jianan-gu could port over his code from multi-backend-refactor branch to the main branch.

cc @mingfeima @ashokei @TimDettmers

@sanchitintel sanchitintel changed the title Request for PagedAdamW8bit support for CPU Request for PagedAdamW8bit support on CPU May 28, 2024
sanchitintel commented May 28, 2024

#1220 will fix this issue.

@sanchitintel sanchitintel changed the title Request for PagedAdamW8bit support on CPU Request for PagedAdamW8bit support on CPU (would help TorchTune) May 28, 2024
matthewdouglas (Member) commented May 29, 2024

> #1220 will fix this issue.

I don't recall seeing any optimizers implemented yet for CPU, but may be mistaken.

Paged optimizer doesn't make sense to me for CPU, but I can understand the request for AdamW8bit.

@sanchitintel sanchitintel changed the title Request for PagedAdamW8bit support on CPU (would help TorchTune) Request for AdamW8bit support on CPU (would help TorchTune) May 29, 2024
sanchitintel (Author) commented
Thanks for pointing that out, @matthewdouglas! I've revised the description.

@jianan-gu @Xia-Weiwen, please clarify whether you have added an AdamW8bit implementation for CPU to bitsandbytes. If not, do you have plans to add it? Thanks!

Xia-Weiwen commented
@sanchitintel Yes, we are going to do it. cc @jianan-gu @jiqing-feng

Titus-von-Koeller (Collaborator) commented
@sanchitintel Thanks for raising this. When is the next TorchTune release expected?

Hmm, the problem is that the device abstraction / dispatcher situation is still not stable; things will change fundamentally in the next three weeks. I'm not sure this can be done as an isolated PR to main. @Xia-Weiwen, could you sketch out a bit more how you think this would work?
