[BUGFIX] Raise an error for no draft token case when draft_tp>1 #6369

wooyeonlee0 · 2024-07-12T09:14:08Z

This PR adds a simple patch that raises an error so that users no longer hit the hang described in #5814.
The hang occurs only when the skip speculation feature is active, no draft tokens are generated for any sequence in a step, and draft_tp > 1.

We may revisit this issue later, because this patch does not resolve it completely.

wooyeonlee0 · 2024-07-12T09:22:21Z

@cadedaniel @zifeitong @comaniac
This PR implements the second option (raise an error) from cade's suggestion.
Could one of you review it?
I may come back later to fix the issue completely, following the first option.

We need to either fix it or raise an error so that users don't hit this.
#5414 (comment)

comaniac

LGTM. Can we enable the test and expect to catch this exception?
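
As an illustration of what "expect to catch this exception" could look like, here is a minimal, self-contained sketch. It does not use the real vLLM worker or the actual test in this PR: `StubSpecDecodeWorker`, `execute_step`, and the error message are hypothetical stand-ins that only model the guard shown in the diff, and the use of `RuntimeError` is an assumption.

```python
import unittest
from typing import List


class StubSpecDecodeWorker:
    """Hypothetical stand-in for the real worker; it only models the new guard."""

    def __init__(self, allow_zero_draft_token_step: bool) -> None:
        self._allow_zero_draft_token_step = allow_zero_draft_token_step

    def execute_step(self, proposal_lens: List[int]) -> List[int]:
        # Mirrors the condition in the diff: fail fast when no sequence in
        # the step received a draft token and that case is not allowed.
        if not self._allow_zero_draft_token_step and sum(proposal_lens) == 0:
            raise RuntimeError(
                "No draft tokens were generated in this step (illustrative message)")
        return proposal_lens


class SkipSpeculationTest(unittest.TestCase):

    def test_zero_draft_tokens_raises(self) -> None:
        # Assumed to correspond to the draft_tp > 1 configuration.
        worker = StubSpecDecodeWorker(allow_zero_draft_token_step=False)
        with self.assertRaises(RuntimeError):
            worker.execute_step([0, 0, 0])

    def test_zero_draft_tokens_allowed(self) -> None:
        # Assumed to correspond to the draft_tp == 1 configuration.
        worker = StubSpecDecodeWorker(allow_zero_draft_token_step=True)
        self.assertEqual(worker.execute_step([0, 0, 0]), [0, 0, 0])
```

The second test case documents the intent that the zero-token step stays legal when the flag allows it, so the guard cannot silently regress the single-GPU draft path.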

wooyeonlee0 · 2024-07-15T07:18:12Z

@comaniac Thanks for the review :)
I've added the test code that catches the error.

However, I can't confirm the result yet, because the CI seems to have stopped.

wooyeonlee0 · 2024-07-15T12:01:36Z

@comaniac The CI test passed :)

comaniac

LGTM

comaniac · 2024-07-15T16:17:47Z

vllm/spec_decode/spec_decode_worker.py

+        if not self.allow_no_draft_tokens and sum(
+                proposals.proposal_lens) == 0:


I'm a bit worried about the overhead that sum brings, but I feel it's fine for now given that it won't be triggered with draft model TP=1. wdyt @cadedaniel?

cadedaniel · 2024-07-15T18:29:13Z

vllm/spec_decode/spec_decode_worker.py

        """
        self.proposer_worker = proposer_worker
        self.scorer_worker = scorer_worker
        self.disable_by_batch_size = disable_by_batch_size or float("inf")
        self.spec_decode_sampler = spec_decode_sampler
+        self.allow_no_draft_tokens = allow_zero_draft_token_step


nit: can we mark this private, e.g. _allow_no_draft_tokens? we should have done this for all properties but we missed it

cadedaniel · 2024-07-15T18:33:14Z

vllm/spec_decode/spec_decode_worker.py

+        if not self.allow_no_draft_tokens and sum(
+                proposals.proposal_lens) == 0:


yeah, is it possible to store this field in proposals when we create proposals? that way we don't need an additional CPU-GPU-CPU sync
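
A rough sketch of this idea follows, using plain Python lists so that it runs anywhere. `ProposalsSketch`, `create_proposals`, `speculated_indices`, and `check_step` are hypothetical names, not the vLLM API; only the `no_proposals` flag name comes from the later commit in this PR. The assumption is that the proposer already knows on the host which sequences it speculated for, so the flag can be set at creation time without reading `proposal_lens` back from the device.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProposalsSketch:
    """Simplified stand-in for SpeculativeProposals (not the vLLM class)."""
    proposal_lens: List[int]
    # Set once by the proposer so consumers never have to reduce proposal_lens.
    no_proposals: bool = False


def create_proposals(batch_size: int, speculated_indices: List[int],
                     k: int) -> ProposalsSketch:
    """Build proposals and record whether any sequence got draft tokens."""
    proposal_lens = [0] * batch_size
    for idx in speculated_indices:
        proposal_lens[idx] = k
    # speculated_indices is host-side bookkeeping, so this check is cheap.
    return ProposalsSketch(proposal_lens=proposal_lens,
                           no_proposals=not speculated_indices)


def check_step(proposals: ProposalsSketch,
               allow_zero_draft_token_step: bool) -> None:
    """Guard that reads the precomputed flag instead of calling sum()."""
    if not allow_zero_draft_token_step and proposals.no_proposals:
        raise RuntimeError(
            "No draft tokens were generated in this step (illustrative message)")


# Every sequence was skipped, but the zero-token step is allowed here.
check_step(create_proposals(batch_size=3, speculated_indices=[], k=4),
           allow_zero_draft_token_step=True)
```

The trade-off is one extra boolean on the proposals object in exchange for a guard that does not depend on where `proposal_lens` is stored.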

wooyeonlee0 · 2024-07-18T01:32:46Z

Thanks for the review! I'm gonna handle it right now :)

wooyeonlee0 · 2024-07-19T04:32:33Z

@comaniac
I re-ran the CI multiple times, but each run failed at either the 'build image' or the 'documentation build' step.
Link: https://buildkite.com/vllm/ci-aws/builds/5151#0190c4c4-6cde-4a9f-a587-63f29a3b5dbd
Link: https://buildkite.com/vllm/ci-aws/builds/5218#0190c8b9-3ee9-4548-a691-cbe56abcff24

Is there a problem with CI right now?

comaniac · 2024-07-19T04:43:51Z

The failure you posted seems random. I'll monitor the current CI run and manually retry failed ones.

wooyeonlee0 · 2024-07-19T08:14:27Z

@cadedaniel @comaniac I've updated the code per your suggestions, and I think it passes the tests.
Would you take a look? :)

@comaniac CI has finished. Would you retry the 'documentation-build' test? Thank you!

cadedaniel · 2024-07-19T08:21:54Z

@simon-mo can we get a force merge, doc build seems broken

fix it

dadfa82

yapf

ad8390c

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from b9af049 to ad8390c on July 12, 2024 09:29

wooyeonlee0 mentioned this pull request Jul 12, 2024

[Bug]: Test_skip_speculation fails in distributed execution #5814

Open

wooyeonlee0 added 4 commits July 12, 2024 21:52

fix

a934b12

allow zero token step for other cases

e93781d

update comment

10e6441

yapf

6edf8fc

comaniac reviewed Jul 12, 2024

wooyeonlee0 added 4 commits July 15, 2024 10:10

test_skip_speculation

5398d7a

error on test_skip_spec

a70ccc9

add comment

c4b6f72

yapf

02dc475

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from 96ec39e to 02dc475 on July 15, 2024 02:11

DarkLight1337 added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Jul 15, 2024

mark. need 4 gpus

c2382b5

comaniac approved these changes Jul 15, 2024

cadedaniel reviewed Jul 15, 2024

no_proposals flag in SpeculativeProposals

fa1f463

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch 5 times, most recently from 2861e99 to 3876858 on July 19, 2024 03:54

mypy yapf

4b338d2

wooyeonlee0 force-pushed the temporal-fix-skip-spec branch from 3876858 to 4b338d2 on July 19, 2024 04:33

cadedaniel approved these changes Jul 19, 2024

cadedaniel enabled auto-merge (squash) July 19, 2024 08:21

simon-mo merged commit a921e86 into vllm-project:main Jul 19, 2024
70 of 72 checks passed

wooyeonlee0 deleted the temporal-fix-skip-spec branch July 19, 2024 13:16

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm…

c4ac0f2

…-project#6369)

gnpinkert pushed a commit to gnpinkert/vllm that referenced this pull request Jul 26, 2024

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm…

0ca2cfd

…-project#6369)

cduk pushed a commit to cduk/vllm-pascal that referenced this pull request Aug 6, 2024

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm…

15a84a9

…-project#6369)

Alvant pushed a commit to compressa-ai/vllm that referenced this pull request Oct 26, 2024

[BUGFIX] Raise an error for no draft token case when draft_tp>1 (vllm…

189d288

…-project#6369) Signed-off-by: Alvant <[email protected]>
