[Prompt lookup] #1245

Merged
merged 42 commits into from
Dec 18, 2024

@iefode iefode commented Nov 21, 2024

Description:

  • Implement prompt lookup decoding on top of the continuous batching pipeline (cb_promp_lookup_impl + prompt_lookup_impl)
  • Update prompt_lookup_sample to use the new API
  • Update the statistics to make the printed output more usable
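
For readers unfamiliar with the technique, here is a minimal Python sketch of the general prompt lookup idea: instead of running a draft model, candidate tokens are copied from an earlier place in the context where the most recent n-gram already occurred, and the main model then verifies them. This is not the code from this PR; the function names and the `max_ngram_size` / `num_candidates` parameters are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def propose_candidates(
    token_ids: Sequence[int],
    max_ngram_size: int = 3,   # hypothetical parameter name
    num_candidates: int = 5,   # hypothetical parameter name
) -> List[int]:
    """Propose draft tokens by matching the trailing n-gram against earlier context."""
    n_tokens = len(token_ids)
    # Try longer n-grams first: a longer match is a stronger hint.
    for ngram_size in range(min(max_ngram_size, n_tokens - 1), 0, -1):
        suffix = list(token_ids[n_tokens - ngram_size:])
        # Scan backwards so the most recent earlier occurrence wins.
        for start in range(n_tokens - ngram_size - 1, -1, -1):
            if list(token_ids[start:start + ngram_size]) == suffix:
                follow = start + ngram_size
                return list(token_ids[follow:follow + num_candidates])
    # No match: the caller falls back to ordinary one-token decoding.
    return []


def count_accepted(candidates: Sequence[int], verified: Sequence[int]) -> int:
    """Count how many leading candidates agree with the main model's tokens."""
    accepted = 0
    for drafted, actual in zip(candidates, verified):
        if drafted != actual:
            break
        accepted += 1
    return accepted


context = [1, 2, 3, 4, 5, 1, 2, 3]
print(propose_candidates(context, num_candidates=2))  # [4, 5]
```

Because candidates come from a plain list scan, proposing them costs almost nothing next to a forward pass of the main model; the benefit depends entirely on how often the generated text repeats parts of the context.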

Ticket:

  • CVS-137987

Example of usage:

  • Input: return 0;
  • Result Prompt lookup:
=============================== 
Total duration, ms: 3.02267
Draft model duration, ms: 0.000724718
Main model duration, ms: 3.02195
Draft model duration, %: 0.0239761
Main model duration, %: 99.976
AVG acceptance rate, %: 10.8333
=============================== 
Request_id: 0 ||| 0 0 0 0 0 0 0 0 20 20 0 0 0 0 20 100 80 0 0 0 0 0 0 60 0 0 20 0 0 0 0 0 20 0 0 50
  • Result Greedy:
=============================== 
Total duration, ms: 3.18111
Draft model duration, ms: 1.538e-06
Main model duration, ms: 3.18111
Draft model duration, %: 4.83479e-05
Main model duration, %: 100
AVG acceptance rate, %: -nan
===============================
  • Speedup: 5.24% at 100 generated tokens; 81% at 300 generated tokens (9.42 vs 5.19)
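
The figures above can be cross-checked with a short Python sketch. It assumes that the per-request list holds per-iteration acceptance rates in percent, that the average is their plain mean, and that the speedup is the ratio of the two total durations minus one. Both helper names are hypothetical. Treating the `-nan` in the greedy run as an average over zero drafted iterations is also an assumption.

```python
import math
from typing import Sequence


def average_acceptance_rate(rates: Sequence[float]) -> float:
    """Mean of per-iteration acceptance rates; NaN when nothing was drafted."""
    return sum(rates) / len(rates) if rates else math.nan


def speedup_percent(baseline: float, candidate: float) -> float:
    """Relative gain of `candidate` over `baseline`, in percent."""
    return (baseline / candidate - 1.0) * 100.0


# Per-iteration values copied from the "Request_id: 0" line above.
RATES = [0, 0, 0, 0, 0, 0, 0, 0, 20, 20, 0, 0, 0, 0, 20, 100, 80, 0,
         0, 0, 0, 0, 0, 60, 0, 0, 20, 0, 0, 0, 0, 0, 20, 0, 0, 50]

print(round(average_acceptance_rate(RATES), 4))     # 10.8333
print(average_acceptance_rate([]))                  # nan
print(round(speedup_percent(3.18111, 3.02267), 2))  # 5.24
print(round(speedup_percent(9.42, 5.19), 1))        # 81.5
```

The first and third values reproduce the reported average acceptance rate and the 100-token speedup. The last line gives the same ratio whether 9.42 and 5.19 are durations or throughputs, and it is consistent with the reported 81%.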

@iefode iefode marked this pull request as draft November 21, 2024 10:53
@github-actions github-actions bot added the labels category: continuous batching, category: GenAI C++ API, no-match-files, category: LLM, category: speculative decoding, category: cmake / build, category: samples Nov 21, 2024
@ilya-lavrenov ilya-lavrenov self-assigned this Nov 26, 2024
@iefode iefode marked this pull request as ready for review December 4, 2024 10:22
@github-actions github-actions bot added the category: GHA label Dec 4, 2024
@github-actions github-actions bot added category: GHA and removed category: GHA labels Dec 4, 2024

iefode commented Dec 5, 2024

@ilya-lavrenov Please take a look

@ilya-lavrenov ilya-lavrenov added this to the 2025.0 milestone Dec 5, 2024

@ilya-lavrenov ilya-lavrenov left a comment

Are the perf benefits the same between the SDPA and PA impls?

@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Dec 17, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 17, 2024
@iefode iefode added this pull request to the merge queue Dec 17, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 17, 2024
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Dec 17, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 17, 2024
@iefode iefode added this pull request to the merge queue Dec 17, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 17, 2024
@iefode iefode enabled auto-merge December 17, 2024 13:13
@iefode iefode added this pull request to the merge queue Dec 18, 2024
Merged via the queue into openvinotoolkit:master with commit 9bcadf7 Dec 18, 2024
59 checks passed
@iefode iefode deleted the prompt_lookup branch December 18, 2024 06:53
@Wovchena Wovchena mentioned this pull request Jan 6, 2025
Labels
category: cmake / build, category: continuous batching, category: GenAI C++ API, category: GHA, category: LLM, category: Python API, category: samples, category: sampling, category: speculative decoding, no-match-files