Fix issue with corrupted output on long generation for GPT #2344
Fixes the issue described in #2300.

Only updating `MAX_OUT_TOKES` does not help; the main issue is that `temp_buf` was allocated immediately after the `output` tensor. Then, in `attention_unfused`:

```cpp
T* workspace = (T*)output + bsz * seq_len * heads * k;
```

where `output` is the `temp_buf` passed in from `ds_softmax_context`. As a result, the workspace derived from `temp_buf` overlaps `query_cont`, which is also allocated in `ds_softmax_context`. I've moved `temp_buf` so it is allocated just after `kv_cache` instead, which fixes the overlap (see the sketch below).
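To make the overlap easier to see, here is a minimal, self-contained sketch of the failure mode, assuming a single arena that buffers are carved from in allocation order. The sizes and the `carve` helper are hypothetical stand-ins for illustration; this is not DeepSpeed's actual allocation code.

```cpp
#include <cstddef>
#include <cstdio>

using T = float;

int main() {
    // Hypothetical dimensions, chosen only for illustration.
    const size_t bsz = 2, seq_len = 128, heads = 16, k = 64;
    const size_t attn_elems = bsz * seq_len * heads * k;

    // Stand-in for the shared inference workspace that buffers are carved from.
    static T arena[1 << 22];
    T* cursor = arena;
    auto carve = [&](size_t elems) { T* p = cursor; cursor += elems; return p; };

    // Buggy layout: temp_buf is carved immediately after the output tensor,
    // and query_cont is carved right after that.
    T* output     = carve(attn_elems);
    T* temp_buf   = carve(attn_elems);
    T* query_cont = carve(attn_elems);
    (void)output;

    // attention_unfused derives its scratch space by offsetting past the
    // buffer it was handed (temp_buf), assuming free memory follows it:
    T* workspace = temp_buf + bsz * seq_len * heads * k;

    // With this layout the derived workspace lands exactly on query_cont,
    // so writes to workspace clobber query_cont.
    printf("workspace aliases query_cont: %s\n",
           workspace == query_cont ? "yes (overlap)" : "no");
    return 0;
}
```

Placing `temp_buf` after `kv_cache` instead means the bytes past `temp_buf` are no longer claimed by `query_cont`, so the derived workspace stops aliasing a live buffer.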
Minimal steps to reproduce: