[Speculative Decoding 2/2 ] Integrate typical acceptance sampler into Spec Decode Worker #5348
Merged: cadedaniel merged 51 commits into vllm-project:main from sroy745:spec_decode_integrate_accpetance_sampler on Jul 1, 2024
Commits
5650b95 Merge pull request #1 from vllm-project/main (sroy745)
8f36146 Merge branch 'vllm-project:main' into main (sroy745)
9e75057 Merge branch 'vllm-project:main' into main (sroy745)
bbf1484 Integrate Typical Acceptance Sampler into spec decode worker (sroy745)
db2c679 Merge branch 'vllm-project:main' into main (sroy745)
3495673 Fixing tests (sroy745)
26c7c57 adding missing commit (sroy745)
090f0bf reverting changes to conftest (sroy745)
733cc6e reverting changes to conftest (sroy745)
19ca0c9 Merge branch 'main' into spec_decode_integrate_accpetance_sampler (sroy745)
acf8d2c Dummy commit (sroy745)
2d2b02b Merge branch 'spec_decode_integrate_accpetance_sampler' of https://gi… (sroy745)
2010b35 Revert unnecessary commits (sroy745)
8d7512c Merge branch 'vllm-project:main' into main (sroy745)
7fa64b6 Merge remote-tracking branch 'origin/main' into spec_decode_integrate… (sroy745)
dea6fbd Pass only one sampler which can either be the RejectionSampler of the… (sroy745)
c3383db Fix test scripture (sroy745)
b15abba Fix tests (sroy745)
6ca731c Fix tests (sroy745)
483c671 Pass only 1 verification_sampler which can either be rejectionSampler… (sroy745)
2c6d06c Update metrics.py to take the base sampler class (sroy745)
027b485 Fix tests and comments (sroy745)
ded92ac Fix test fixture and default values of args (sroy745)
738871e Small misc fixes (sroy745)
50e8771 Fix spec_decode/test_metrics.py (sroy745)
101611e Merge branch 'main' into spec_decode_integrate_accpetance_sampler (sroy745)
5e6638b Merge branch 'main' into spec_decode_integrate_accpetance_sampler (sroy745)
cc760a0 Make rejection_sampler.py and typical_acceptance_sampler.py implement… (sroy745)
360ce0b Raise exception instead of returning None for invalid sampler name (sroy745)
6572ba4 Adding log about type of sampler (sroy745)
be85f07 Misc comment fixes (sroy745)
6dc9efe Misc fixes (sroy745)
512fad9 Misc fixes (sroy745)
b1d510c Misc fixes (sroy745)
f4b9e4d Misc fixes (sroy745)
0ea9408 Documentation (sroy745)
5772d04 Fix comments (sroy745)
b7254e7 Fix arg name (sroy745)
ef93081 Fixing a test (sroy745)
0165842 Fix comment (sroy745)
510974b Fix formatting (sroy745)
396fa54 Fixing tests and lint failures (sroy745)
f8cc895 Removing e2e test for TypicalAcceptanceSampler from test_ngram_correc… (sroy745)
439117d Fix a comment (sroy745)
75f034f Dummy commit (sroy745)
a0f5ade Merge pull request #2 from vllm-project/main (sroy745)
3082255 Fix format error (sroy745)
4e7f51a Merge pull request #3 from vllm-project/main (sroy745)
d26c624 Dummy fix (sroy745)
98d5f92 Merge branch 'main' into spec_decode_integrate_accpetance_sampler (sroy745)
f186844 Update test_multistep_correctness.py (sroy745)
Comment:
For testing strategy:
I am concerned that we are adding many E2E tests that don't provide much signal beyond what already exists. The tradeoff with more tests is that we can accidentally blow up CI time: we rely on E2E tests for spec decode correctness, so any small regression in model loading or vLLM initialization time can hurt us badly.
So, what I suggest:
Reply:
Added a single test in test_multistep_correctness.py to cover different batch size and speculation_length values with TypicalAcceptanceSampler. Added a similar test to test_ngram_correctness.py.
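For illustration only, the idea of one test sweeping several batch size and speculation_length values could look roughly like the sketch below. It is stdlib-only and does not use the vLLM test fixtures: `build_cases`, `run_all`, the grid values, the dict keys, and the sampler-name string are all hypothetical placeholders, and the check passed to `run_all` is a stand-in for actually running generation.

```python
import itertools


def build_cases(batch_sizes, speculation_lengths, sampler_name):
    """Return one config dict per (batch size, speculation length) pair."""
    return [
        {"batch_size": b, "speculation_length": k, "sampler": sampler_name}
        for b, k in itertools.product(batch_sizes, speculation_lengths)
    ]


def run_all(cases, run_case):
    """Call run_case on every config and return the configs that failed."""
    return [case for case in cases if not run_case(case)]


# Hypothetical grid; the values used in the actual PR tests are not shown here.
cases = build_cases([1, 8], [1, 3], "typical_acceptance_sampler")

# Stand-in check: a real test would run generation for each config and
# validate the output instead of inspecting the config itself.
failures = run_all(cases, lambda case: case["speculation_length"] >= 1)
```

In a pytest suite the same grid would more naturally be expressed with `pytest.mark.parametrize`; the plain-function form is used here only to keep the sketch free of dependencies.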