-
-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BUGFIX] Raise an error for no draft token case when draft_tp>1 #6369
[BUGFIX] Raise an error for no draft token case when draft_tp>1 #6369
Conversation
@cadedaniel @zifeitong @comaniac
|
b9af049
to
ad8390c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Can we enable the test and expect to catch this exception?
96ec39e
to
02dc475
Compare
@comaniac Thanks for the review :) But I'm not sure, the CI seems to have stopped. |
@comaniac The CI test passed :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
if not self.allow_no_draft_tokens and sum( | ||
proposals.proposal_lens) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a bit worry about the overhead that sum
brings, but I feel it's fine for now given that it won't be triggered with draft model TP=1. wdyt @cadedaniel?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, is it possible to store this field in proposals
when we create proposals
? that way we don't need an additional CPU-GPU-CPU sync
""" | ||
self.proposer_worker = proposer_worker | ||
self.scorer_worker = scorer_worker | ||
self.disable_by_batch_size = disable_by_batch_size or float("inf") | ||
self.spec_decode_sampler = spec_decode_sampler | ||
self.allow_no_draft_tokens = allow_zero_draft_token_step |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can we mark this private, e.g. _allow_no_draft_tokens
? we should have done this for all properties but we missed it
if not self.allow_no_draft_tokens and sum( | ||
proposals.proposal_lens) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, is it possible to store this field in proposals
when we create proposals
? that way we don't need an additional CPU-GPU-CPU sync
Thanks for the review! I'm gonna handle it right now :) |
2861e99
to
3876858
Compare
@comaniac Is there any problem in CI now? |
3876858
to
4b338d2
Compare
The failure you posted seems random. I'll monitor the current CI run and manually retry failed ones. |
@cadedaniel @comaniac I've updated the code as your suggestion and I think the code passed the test. @comaniac CI has finished. Would you retry the 'documentation-build' test? Thank you! |
@simon-mo can we get a force merge, doc build seems broken |
…-project#6369) Signed-off-by: Alvant <[email protected]>
This PR adds a simple patch to raise an error to prevent users from hitting the hang error stated in #5814.
This error happens only when the skip speculation feature is activated and there are no generated draft tokens for "all" sequences in a step, and draft_tp > 1.
We may revisit this issue later because it's not resolved completely.