Memory optimization for gpt_bitcode (#4) #1513

Open · wants to merge 1 commit into main
Conversation

astachowiczhabana
Contributor

Use torch.matmul instead of torch.baddbmm in GPTBigCodeAttention._attn for devices other than CPU. This allows significantly larger batch sizes in text generation with BigCode-related models.

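The idea behind the change can be sketched as follows. This is a minimal, illustrative version, not the actual `GPTBigCodeAttention._attn` implementation: the function name, the `device_type` parameter, and the tensor shapes are assumptions for the example. The point is that `torch.baddbmm` requires a bias tensor of the full attention-weight shape, whereas `torch.matmul` followed by a scale avoids allocating it, which reduces peak memory on non-CPU devices:

```python
import torch

def attn_weights(query, key, scale, device_type="cpu"):
    # query: (batch, q_len, head_dim), key: (batch, head_dim, k_len)
    if device_type == "cpu":
        # baddbmm computes beta*bias + alpha*(query @ key); the bias tensor
        # has the full (batch, q_len, k_len) attention-weight shape.
        bias = torch.zeros(
            query.size(0), query.size(1), key.size(2), dtype=query.dtype
        )
        return torch.baddbmm(bias, query, key, beta=1, alpha=scale)
    # On other devices, a plain matmul plus scaling gives the same result
    # without allocating the extra bias tensor.
    return torch.matmul(query, key) * scale
```

Both branches produce the same scaled attention weights; only the memory behavior differs, which is what enables larger batch sizes in generation.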
@astachowiczhabana
Contributor Author

Hi @libinta, this commit is also required with the next OH release.

@jiminha jiminha changed the title [SW-204998] Memory optimization for gpt_bitcode (#4) Memory optimization for gpt_bitcode (#4) Nov 25, 2024
@jiminha jiminha added the run-test Run CI for PRs from external contributors label Nov 26, 2024
Labels
run-test Run CI for PRs from external contributors
Projects
None yet

4 participants