[Misc] Include matched stop string/token in responses #2976

njhill · 2024-02-22T01:59:09Z

Currently a finish_reason of "stop" is returned if any of the following are encountered:

- One of the provided stop strings
- One of the provided stop tokens
- The EOS token

It can be useful to know specifically which of these caused the sequence generation to stop, especially since by default the stop strings/tokens are omitted from the output text (and output token_ids?).

This PR adds a stop_reason field to the CompletionOutput class, which will contain the matched stop string or integer token id. It will be None otherwise, including in the EOS token case. In particular, this means that EOS can be inferred iff finish_reason == "stop" and stop_reason is None.
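As a rough illustration of how a caller might read the two fields together, here is a minimal sketch. `FakeCompletionOutput` is a hypothetical stand-in that mirrors only the two fields discussed here (it is not the real vLLM class), and `describe_stop` is an invented helper name.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FakeCompletionOutput:
    """Hypothetical stand-in carrying only the two fields discussed above."""
    finish_reason: Optional[str] = None
    stop_reason: Union[int, str, None] = None


def describe_stop(output: FakeCompletionOutput) -> str:
    """Return a human-readable description of why generation ended."""
    if output.finish_reason != "stop":
        # Any other finish_reason (or None for an unfinished sequence)
        return f"not stopped (finish_reason={output.finish_reason!r})"
    if output.stop_reason is None:
        # finish_reason == "stop" with no matched stop string/token: EOS
        return "EOS token"
    if isinstance(output.stop_reason, int):
        return f"stop token id {output.stop_reason}"
    return f"stop string {output.stop_reason!r}"
```

The order of the checks matters: stop_reason is also None when finish_reason is not "stop", so the EOS inference is only valid after the first check has passed.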

I've also added it to the OpenAI server responses, but I'm not sure whether it should be included there since it isn't part of the official API.

Thanks @sahilsuneja1 for adding a test

njhill · 2024-02-26T22:36:47Z

@simon-mo would be good to get your thoughts on this simple addition. It would be useful for us even if it's not exposed in the openai API responses. Also not sure what the best field name is, perhaps stop_match instead of stop_reason?

simon-mo · 2024-02-27T00:23:54Z

I think this is very simple and useful! Naming it stop_reason is fine, or maybe finish_reason_stop_object: {"kind": "eos" | "stop_string" | "stop_token_id", "value": None | str | int}
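To make the structured alternative concrete, here is a sketch of how a single stop_reason value could be mapped onto that {kind, value} shape. `StopObject` and `to_stop_object` are hypothetical names used only for illustration; the keys follow the suggestion above, and nothing in this PR implements this form.

```python
from typing import Optional, TypedDict, Union


class StopObject(TypedDict):
    kind: str  # "eos", "stop_string" or "stop_token_id"
    value: Union[int, str, None]


def to_stop_object(finish_reason: Optional[str],
                   stop_reason: Union[int, str, None]) -> Optional[StopObject]:
    """Map (finish_reason, stop_reason) to the explicit kind/value form."""
    if finish_reason != "stop":
        # No stop object applies unless generation actually stopped
        return None
    if stop_reason is None:
        return {"kind": "eos", "value": None}
    if isinstance(stop_reason, int):
        return {"kind": "stop_token_id", "value": stop_reason}
    return {"kind": "stop_string", "value": stop_reason}
```

The trade-off is the one raised in this thread: the single field is simpler, while the explicit form removes the need to infer EOS from a None value.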

njhill · 2024-02-27T00:49:58Z

Thanks @simon-mo!

> Naming it as stop_reason is fine, or maybe finish_reason_stop_object: {"kind": "eos" | "stop_string" | "stop_token_id", "value": None | str | int}

Will defer to your judgment on this! I like the simplicity of a single field but agree that more explicit could be better.

simon-mo · 2024-03-25T20:20:40Z

What you have is totally fine as long as it's well documented.

njhill · 2024-03-25T23:41:42Z

Thanks @simon-mo, I just pushed a small update to add descriptions to the new stop_reason openai response fields in protocol.py, and this is already documented in the CompletionOutput docstring. Is there anywhere else you think we should add some doc for this?

njhill force-pushed the return-stop-str branch 2 times, most recently from 0700062 to b9c7685 Compare February 22, 2024 15:51

njhill changed the title ~~Include specific stop reason in responses~~ Include matched stop string/token in responses Feb 22, 2024

njhill force-pushed the return-stop-str branch 2 times, most recently from 84c73da to bb6f831 Compare February 26, 2024 22:29

njhill marked this pull request as ready for review February 26, 2024 22:33

njhill force-pushed the return-stop-str branch from c0ad972 to ab8ba66 Compare February 27, 2024 00:19

joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 11, 2024

Changes pending from vllm-project/vllm#2976

4e936bf

joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 11, 2024

Changes pending from vllm-project/vllm#2976

8c9f28e

Signed-off-by: Joe Runde <[email protected]>

joerunde pushed a commit to IBM/vllm that referenced this pull request Mar 12, 2024

Changes pending from vllm-project/vllm#2976

790513a

Signed-off-by: Joe Runde <[email protected]>

njhill force-pushed the return-stop-str branch 2 times, most recently from 2e129bd to 560f090 Compare March 20, 2024 20:34

njhill changed the title ~~Include matched stop string/token in responses~~ [Misc] Include matched stop string/token in responses Mar 25, 2024

njhill force-pushed the return-stop-str branch from 560f090 to 6e60573 Compare March 25, 2024 16:53

njhill and others added 2 commits March 25, 2024 10:38

test for stop_reason

59073e3

njhill force-pushed the return-stop-str branch from 6e60573 to 59073e3 Compare March 25, 2024 17:38

Add pydantic descriptions to the new openai response fields

00a8f71

simon-mo approved these changes Mar 26, 2024

simon-mo merged commit dfeb2ec into vllm-project:main Mar 26, 2024
32 checks passed

xjpang pushed a commit to xjpang/vllm that referenced this pull request Mar 31, 2024

[Misc] Include matched stop string/token in responses (vllm-project#2976)

bc3ea46

Co-authored-by: Sahil Suneja <[email protected]>

njhill mentioned this pull request Apr 11, 2024

[BugFix] Fix handling of stop strings and stop token ids #3672

Merged

njhill deleted the return-stop-str branch April 25, 2024 01:23

dtrifiro mentioned this pull request May 15, 2024

bump ubi base image tag opendatahub-io/vllm#24

Merged

DarkLight1337 mentioned this pull request May 24, 2024

[BUGFIX] [FRONTEND] Correct chat logprobs #5029

Merged

Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024

[Misc] Include matched stop string/token in responses (vllm-project#2976)

a303776

Co-authored-by: Sahil Suneja <[email protected]>
