
Increasing the token-length based on available memory for GPT models #2280

Conversation

RezaYazdaniAminabadi (Contributor)

No description provided.
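No description was provided with the PR, so as context for the title, here is a rough, purely illustrative sketch of the general idea of deriving a token budget from free GPU memory. The formula, function name, and constants below are assumptions, not the PR's actual implementation:

```python
import torch

def estimate_max_tokens(num_layers, hidden_size, dtype_bytes=2, batch_size=1, safety=0.8):
    """Hypothetical helper (not from this PR): estimate how many tokens of
    KV cache fit in the GPU memory that is currently free."""
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    # K and V each store a [batch, seq_len, hidden_size] tensor per transformer layer.
    bytes_per_token = 2 * num_layers * hidden_size * dtype_bytes * batch_size
    return int(free_bytes * safety) // bytes_per_token

# Example: a GPT-style model with 24 layers and hidden size 2048 in fp16.
print(estimate_max_tokens(num_layers=24, hidden_size=2048))
```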

@mayank31398 (Contributor)

Running the new PR with `queries = ["cat " * 2000] * 4`:
for max_new_tokens = 10, generated_tokens = [10, 10, 10, 10]
for max_new_tokens = 100, generated_tokens = [100, 100, 100, 99]
for max_new_tokens = 300, generated_tokens = [299, 300, 299, 297]
for max_new_tokens = 1500, generated_tokens = [1500, 1276, 1500, 369]
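For anyone trying to reproduce numbers like the ones above, a minimal sketch of this kind of harness, assuming a Hugging Face causal LM wrapped with `deepspeed.init_inference`; the model name, dtype, and padding setup are assumptions rather than details taken from this PR:

```python
# Repro sketch (assumptions: HF transformers + DeepSpeed kernel injection;
# the model name below is a placeholder, not the model used in this PR).
import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; long prompts need a model whose context window allows them
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token      # GPT-style tokenizers have no pad token
tokenizer.padding_side = "left"                # left-pad for decoder-only generation
model = AutoModelForCausalLM.from_pretrained(model_name)

# Inject DeepSpeed's fused inference kernels and move the model to GPU.
ds_engine = deepspeed.init_inference(model, dtype=torch.float16, replace_with_kernel_inject=True)
model = ds_engine.module

queries = ["cat " * 2000] * 4  # long prompts, as in the comment above

for max_new_tokens in (10, 100, 300, 1500):
    inputs = tokenizer(queries, return_tensors="pt", padding=True).to("cuda")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens,
                             pad_token_id=tokenizer.eos_token_id)
    # Count tokens actually produced after the prompt, ignoring padding.
    new_tokens = outputs[:, inputs["input_ids"].shape[1]:]
    generated = [(row != tokenizer.pad_token_id).sum().item() for row in new_tokens]
    print(f"max_new_tokens = {max_new_tokens}, generated_tokens = {generated}")
```

With this kind of setup the per-sequence counts should all equal max_new_tokens unless a sequence stops early or runs out of room, which is what the drop to 1276 and 369 above appears to show.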

@mayank31398 (Contributor)

Hi, any update on the above @RezaYazdaniAminabadi ^^?
Were you able to find the error?

@RezaYazdaniAminabadi (Contributor, Author)

Hi @mayank31398,

I'm looking into it right now; let me first merge this into another PR. I will let you know.
Thanks,
Reza

RezaYazdaniAminabadi changed the base branch from master to cholmes/fix-long-seq-len-inference on September 12, 2022.
RezaYazdaniAminabadi merged commit d8f5203 into cholmes/fix-long-seq-len-inference on September 22, 2022.