[Bug fix][Core] fixup ngram not setup correctly #4551
ngram_prompt_lookup_max/ngram_prompt_lookup_min need to be passed through SpecDecodeWorker.create_worker's draft_worker_kwargs. If those two are not passed, an exception is raised because the dict cannot pop those two keys.
cc @comaniac |
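A minimal sketch of the failure mode described above, not the actual vLLM code: `create_worker_sketch` is a hypothetical stand-in for `SpecDecodeWorker.create_worker`, and the `> 0` check used to pick the ngram path is an assumption made for illustration. It only shows why a missing key makes `dict.pop` raise.

```python
from typing import Any, Dict, Tuple


def create_worker_sketch(
        draft_worker_kwargs: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Hypothetical stand-in for SpecDecodeWorker.create_worker."""
    # pop() without a default raises KeyError when the key is absent,
    # which is the exception this PR fixes by always passing both keys.
    ngram_max = draft_worker_kwargs.pop("ngram_prompt_lookup_max")
    ngram_min = draft_worker_kwargs.pop("ngram_prompt_lookup_min")

    # Assumption: a positive max selects the ngram proposer; otherwise the
    # remaining kwargs configure a regular draft-model (multi-step) worker.
    if ngram_max > 0:
        return "ngram", {"min": ngram_min, "max": ngram_max}
    return "multi_step", draft_worker_kwargs


if __name__ == "__main__":
    try:
        create_worker_sketch({"model": "some-model"})
    except KeyError as exc:
        print(f"missing key: {exc}")

    print(create_worker_sketch({
        "model": "some-model",
        "ngram_prompt_lookup_max": 3,
        "ngram_prompt_lookup_min": 1,
    }))
```

The caller has to put both keys into the kwargs dict before handing it over, even when ngram lookup is not in use.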
Oops. We can merge first, but it would be better to add a unit test to cover this case.
+1. Let's get a test covering this path. |
Why was it not covered by existing tests? |
I guess existing tests directly initiated the worker, but this is more like an end-to-end path starting from a higher level? |
It is because ngram currently still uses a draft model set to the target model to get some info like the vocab size. In this failure, the ngram test case was actually turned into a multi-step case with the draft model the same as the target model... I added an assert in conftest to ensure we are really on the ngram running path when the corresponding param is set. |
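A rough sketch of the kind of guard described above, under stated assumptions: `assert_ngram_path`, its arguments, and the `"ngram"` / `"multi_step"` labels are hypothetical names for illustration and are not taken from the real conftest.

```python
from typing import Any, Dict


def assert_ngram_path(test_kwargs: Dict[str, Any], proposer_kind: str) -> None:
    """Hypothetical conftest helper: fail loudly on a silent fallback."""
    # Only enforce the check when the test actually asked for ngram lookup.
    if test_kwargs.get("ngram_prompt_lookup_max", 0) > 0:
        assert proposer_kind == "ngram", (
            "ngram_prompt_lookup_max is set, but the proposer is "
            f"{proposer_kind!r}; the test is not exercising the ngram path")


if __name__ == "__main__":
    # Passes: ngram requested and ngram proposer in use.
    assert_ngram_path({"ngram_prompt_lookup_max": 3}, "ngram")
    # Passes: ngram not requested, so any proposer is acceptable.
    assert_ngram_path({}, "multi_step")
```

With a check like this, a test that requests ngram but ends up on the multi-step path fails immediately instead of passing for the wrong reason.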
Co-authored-by: Cody Yu <[email protected]>
Retrying test infra failure |
@cadedaniel this should be able to merge. |
Co-authored-by: Lei Wen <[email protected]> Co-authored-by: Cade Daniel <[email protected]> Co-authored-by: Cody Yu <[email protected]>
Spec decode tests started failing on the main branch after this PR: https://buildkite.com/vllm/ci/builds/6784#018f551e-d727-491c-be34-9d9fa29f4ea4 |
The fix PR is here: #4672 |