Memory optimization for gpt_bitcode (#4) #1513

Open · wants to merge 1 commit into main
Conversation

astachowiczhabana
Contributor

Use torch.matmul instead of torch.baddbmm in GPTBigCodeAttention._attn for devices other than CPU. This allows significantly larger batch sizes in text generation with BigCode-related models.

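The idea behind the change can be sketched as follows. This is a minimal, illustrative version, not the actual `GPTBigCodeAttention._attn` implementation: the function name, the `device_type` parameter, and the tensor shapes are assumptions for the example. The point is that `torch.baddbmm` requires a bias tensor of the full attention-weight shape, whereas `torch.matmul` followed by a scale avoids allocating it, which reduces peak memory on non-CPU devices:

```python
import torch

def attn_weights(query, key, scale, device_type="cpu"):
    # query: (batch, q_len, head_dim), key: (batch, head_dim, k_len)
    if device_type == "cpu":
        # baddbmm computes beta*bias + alpha*(query @ key); the bias tensor
        # has the full (batch, q_len, k_len) attention-weight shape.
        bias = torch.zeros(
            query.size(0), query.size(1), key.size(2), dtype=query.dtype
        )
        return torch.baddbmm(bias, query, key, beta=1, alpha=scale)
    # On other devices, a plain matmul plus scaling gives the same result
    # without allocating the extra bias tensor.
    return torch.matmul(query, key) * scale
```

Both branches produce the same scaled attention weights; only the memory behavior differs, which is what enables larger batch sizes in generation.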
@astachowiczhabana
Contributor Author

Hi @libinta, this commit is also required with the next OH release.

@jiminha jiminha changed the title [SW-204998] Memory optimization for gpt_bitcode (#4) Memory optimization for gpt_bitcode (#4) Nov 25, 2024
@jiminha jiminha added the run-test Run CI for PRs from external contributors label Nov 26, 2024
Labels
run-test Run CI for PRs from external contributors
Projects
None yet

4 participants