[Continuous batching] Late token vector initialization in sampling #649

Merged: 2 commits, Aug 2, 2024

Conversation

mzegla
Collaborator

@mzegla mzegla commented Jul 19, 2024

Changes:

  • Further splits the greedy and multinomial paths: greedy sampling reads the original logits buffer directly, and multinomial sampling does so whenever possible. The sorted vector is created only when the top_p or top_k filter needs to be applied.
  • Fixes the top_k filter always being applied in multinomial sampling unless top_k was explicitly set to 0. The default value (the maximum of size_t) no longer triggers the filter, and the filter is also skipped when top_k is larger than the logits vector size.
  • Skips the multinomial tests.
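
The changes listed above can be illustrated with a minimal sketch. This is not the code from this PR: `Token`, `LazySortedLogits`, `needs_top_k`, `needs_top_p`, and `greedy_argmax` are hypothetical names, and the exact filter conditions (including how `top_k` equal to the logits size and the `top_p` bounds are treated) are assumptions. Only the name `is_vector_initialized` is borrowed from the commit messages, and its semantics here are a guess.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical token record: a logit value plus its vocabulary index.
struct Token {
    float logit;
    std::size_t index;
};

// Assumption: top_k filters only when it would actually drop tokens.
// 0 and the size_t maximum (the default) both mean "disabled".
inline bool needs_top_k(std::size_t top_k, std::size_t logits_size) {
    return top_k > 0 && top_k < logits_size;
}

// Assumption: top_p filters only for values strictly between 0 and 1.
inline bool needs_top_p(float top_p) {
    return top_p > 0.0f && top_p < 1.0f;
}

// Greedy path: scan the original logits buffer, no copy and no sort.
inline std::size_t greedy_argmax(const float* logits, std::size_t size) {
    assert(logits != nullptr && size > 0);
    return static_cast<std::size_t>(std::max_element(logits, logits + size) - logits);
}

// Late initialization: the sorted token vector is built on first use only.
class LazySortedLogits {
public:
    LazySortedLogits(const float* data, std::size_t size) : m_data(data), m_size(size) {}

    bool is_vector_initialized() const { return m_initialized; }

    // Builds the vector (sorted by descending logit) the first time it is requested.
    const std::vector<Token>& sorted() {
        if (!m_initialized) {
            m_sorted.reserve(m_size);
            for (std::size_t i = 0; i < m_size; ++i) {
                m_sorted.push_back(Token{m_data[i], i});
            }
            std::sort(m_sorted.begin(), m_sorted.end(),
                      [](const Token& a, const Token& b) { return a.logit > b.logit; });
            m_initialized = true;
        }
        return m_sorted;
    }

private:
    const float* m_data;          // original logits buffer, not owned
    std::size_t m_size;
    std::vector<Token> m_sorted;  // stays empty until a filter needs it
    bool m_initialized = false;
};
```

In this sketch a caller would check `needs_top_k(...) || needs_top_p(...)` first and call `sorted()` only when one of them is true, so the greedy path and unfiltered multinomial sampling never pay for the copy and sort.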

@iefode iefode self-assigned this Jul 19, 2024
@mzegla mzegla requested review from Wovchena, iefode and popovaan July 22, 2024 09:12
src/cpp/src/logit_processor.hpp (3 resolved review threads)
Contributor

@iefode iefode left a comment

In general LGTM

" position of the Z-shaped groove?\n0.41\nWhat is the current position of the Z-shaped groove?\n0.11\n",
" status of all of this? I can't stop thinking about it.\nIt's been a while since I've seen it. I found it a",
" status of your blog? Do you accept feedback?\nYes, I’m happy to accept feedback at this time (I’m a"
" condition of the leg?\nIt's been quite a while since I've seen it, so I didn't really know if it was good or bad",
Contributor

These reference strings look a bit strange.

Collaborator Author

The multinomial tests for preemption already fail on master, so something may be wrong there, which would explain the strange outputs.

src/cpp/src/sampler.hpp (1 outdated, resolved review thread)
@mzegla mzegla marked this pull request as ready for review July 23, 2024 13:55
src/cpp/src/logit_processor.hpp (3 resolved review threads, 2 of them outdated)
@mzegla mzegla added this pull request to the merge queue Jul 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 24, 2024
@mzegla mzegla added this pull request to the merge queue Jul 24, 2024
@mzegla mzegla removed this pull request from the merge queue due to a manual request Jul 24, 2024
@mzegla mzegla added this pull request to the merge queue Jul 24, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jul 24, 2024
@ilya-lavrenov ilya-lavrenov self-assigned this Jul 31, 2024
@mzegla mzegla added this pull request to the merge queue Aug 1, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 1, 2024
@ilya-lavrenov ilya-lavrenov added this to the 2024.4 milestone Aug 1, 2024
@ilya-lavrenov ilya-lavrenov enabled auto-merge August 1, 2024 15:11
mzegla added 2 commits August 2, 2024 10:33
gtest adjustment

coverity

revert top_k fix

remove additional timers

is_vector_initialized

uncomment top_k fix and skip multinomial tests
@ilya-lavrenov ilya-lavrenov added this pull request to the merge queue Aug 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 2, 2024
@mzegla mzegla added this pull request to the merge queue Aug 2, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Aug 2, 2024
@mzegla mzegla added this pull request to the merge queue Aug 2, 2024
@mzegla mzegla removed this pull request from the merge queue due to a manual request Aug 2, 2024
@mzegla mzegla added this pull request to the merge queue Aug 2, 2024
Merged via the queue into openvinotoolkit:master with commit 3304798 Aug 2, 2024
27 checks passed
@mzegla mzegla deleted the late_init branch August 19, 2024 10:41
5 participants