
Genai/optimum support streaming output #1290

Merged
merged 7 commits Dec 5, 2024

Conversation

zhaohb (Contributor) commented Dec 4, 2024

Support chunk streaming mode, mainly to reduce the number of decode calls and thereby improve performance.

@github-actions github-actions bot added the category: llm_bench Label for tool/llm_bench folder label Dec 4, 2024
@zhaohb zhaohb changed the title Genai/optimum supports streaming output Genai/optimum support streaming output Dec 4, 2024
zhaohb (Contributor, Author) commented Dec 5, 2024

@eaidova @as-suvorov
Could you please review this PR quickly? Thank you very much.

  • First, this PR adds support for streaming mode; second, it adds chunk streaming within streaming mode.
  • Chunk streaming reduces the number of decode calls.
  • We tested the glm4v-nano model on iGPU; in streaming mode, chunk streaming yields roughly a 30% improvement in token rate.
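The idea behind chunk streaming can be sketched as follows. This is a hypothetical illustration, not the PR's actual API: the class name `ChunkStreamer`, the `chunk_size` parameter, and the toy decode function are all invented for the example. The point is simply that buffering token ids and decoding every N tokens replaces N separate decode calls with one.

```python
# Hypothetical sketch of chunk streaming (illustrative names, not the PR's API):
# instead of calling the tokenizer's decode once per generated token, buffer
# token ids and decode a whole chunk at a time.

class ChunkStreamer:
    def __init__(self, decode_fn, chunk_size=4):
        self.decode_fn = decode_fn    # e.g. a tokenizer's decode function
        self.chunk_size = chunk_size  # how many token ids to buffer per decode call
        self.buffer = []              # pending token ids
        self.text = ""                # accumulated decoded output

    def put(self, token_id):
        """Receive one generated token id; decode only when a chunk is full."""
        self.buffer.append(token_id)
        if len(self.buffer) >= self.chunk_size:
            self.flush()

    def flush(self):
        """Decode and drain any buffered token ids (call once at end of generation)."""
        if self.buffer:
            self.text += self.decode_fn(self.buffer)
            self.buffer = []

# Toy decode function standing in for a real tokenizer:
decode = lambda ids: "".join(chr(ord("a") + i) for i in ids)

s = ChunkStreamer(decode, chunk_size=3)
for t in [0, 1, 2, 3]:
    s.put(t)   # decode is called once after the third token, not per token
s.flush()      # decode the remaining buffered token
print(s.text)  # abcd
```

With `chunk_size=3`, four tokens trigger only two decode calls instead of four, which is the source of the reported token-rate improvement.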

eaidova (Collaborator) commented Dec 5, 2024

@zhaohb looks like it breaks some other supported use cases, could you please fix:

[ INFO ] Traceback (most recent call last):
  File "/home/runner/work/openvino.genai/openvino.genai/./tools/llm_bench/benchmark.py", line 226, in main
    iter_data_list, pretrain_time, iter_timestamp = CASE_TO_BENCH[model_args['use_case']](
TypeError: run_image_generation_benchmark() takes 6 positional arguments but 8 were given
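The failure mode here can be illustrated in isolation. This is a hedged sketch with hypothetical parameter names, not the actual `llm_bench` signatures: a `CASE_TO_BENCH`-style dispatch table starts passing extra streaming-related arguments to every handler, but one handler keeps its old 6-parameter signature.

```python
# Hypothetical illustration of the reported TypeError (parameter names are
# invented, not the real run_image_generation_benchmark signature).

def run_benchmark_old(model, device, args, iters, ts, mem):
    # Old handler: accepts exactly 6 positional arguments.
    return "ok"

try:
    # The dispatch site now passes 8 arguments (two new streaming-related ones):
    run_benchmark_old(1, 2, 3, 4, 5, 6, "streamer", "chunk")
except TypeError as err:
    print(type(err).__name__)  # TypeError

# One possible fix: give the handler defaults for the new arguments so every
# entry in the dispatch table can be called uniformly.
def run_benchmark_fixed(model, device, args, iters, ts, mem,
                        streamer=None, chunk_size=None):
    return "ok"

print(run_benchmark_fixed(1, 2, 3, 4, 5, 6, "streamer", "chunk"))  # ok
```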

zhaohb (Contributor, Author) commented Dec 5, 2024

@zhaohb looks like it breaks some other supported use cases, could you please fix:

[ INFO ] Traceback (most recent call last):
  File "/home/runner/work/openvino.genai/openvino.genai/./tools/llm_bench/benchmark.py", line 226, in main
    iter_data_list, pretrain_time, iter_timestamp = CASE_TO_BENCH[model_args['use_case']](
TypeError: run_image_generation_benchmark() takes 6 positional arguments but 8 were given

@eaidova Thank you very much, I have fixed it.

@eaidova eaidova enabled auto-merge December 5, 2024 06:41
sammysun0711 (Collaborator) commented

build_jenkins

@eaidova eaidova added this pull request to the merge queue Dec 5, 2024
@andrei-kochin andrei-kochin added the port to LTS PR needs to be ported to LTS label Dec 5, 2024
@andrei-kochin andrei-kochin added this to the 2025.0 milestone Dec 5, 2024
Merged via the queue into openvinotoolkit:master with commit d294db9 Dec 5, 2024
55 checks passed
@ilya-lavrenov ilya-lavrenov removed the port to LTS PR needs to be ported to LTS label Dec 12, 2024
sungeunk pushed a commit to sungeunk/openvino.genai that referenced this pull request Dec 16, 2024
Support chunk streaming mode, mainly to reduce the number of decode
calls, thereby improving performance