Not all stop sequences are created equal #534

zacharyblank · 2023-07-20T15:20:31Z

This PR fixes some stop sequences not being matched. When generating and decoding tokens, a single token sometimes decodes to the stop sequence plus additional characters. In that case the check `if seq.output_text.endswith(stop_str):` does not behave as expected, because the output no longer ends with the stop sequence.

For example, if a stop sequence is defined as ", and the model generates "," as a single token, as is the case with EleutherAI/gpt-neox-20b, then the stop sequence is not detected and generation does not stop.

This is a small PR that, instead of checking only the end of the generated sequence, checks the entire sequence for the stop sequence.
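The difference between the two checks can be illustrated with a minimal, self-contained sketch. It does not use vLLM's actual classes: `generate`, `ends_with_stop`, and `contains_stop` are hypothetical helpers, and the token list is made up, with a final token `",` that carries one character beyond an assumed stop string `"`.

```python
from typing import Callable, List, Tuple


def ends_with_stop(output_text: str, stop_str: str) -> bool:
    # Original style of check: only matches when the text ends exactly on the stop string.
    return output_text.endswith(stop_str)


def contains_stop(output_text: str, stop_str: str) -> bool:
    # Style of check proposed in this PR: matches the stop string anywhere in the text.
    return stop_str in output_text


def generate(
    token_texts: List[str],
    stop_str: str,
    matcher: Callable[[str, str], bool],
) -> Tuple[str, bool]:
    # Toy decoding loop (hypothetical): append one decoded token at a time
    # and stop as soon as the matcher reports the stop string.
    output_text = ""
    for token_text in token_texts:
        output_text += token_text
        if matcher(output_text, stop_str):
            return output_text, True
    return output_text, False


# Made-up token stream: the third token decodes to the stop string plus a comma.
tokens = ["hello", " world", '",', " more", " text"]

missed = generate(tokens, '"', ends_with_stop)
caught = generate(tokens, '"', contains_stop)
```

With the `endswith` check the loop runs through every token, while the containment check stops on the third token.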

zhuohan123

Thank you for your contribution! Can you fix the formatting error with format.sh? In addition, does this change lead to an O(N) string comparison at every iteration, and therefore O(N^2) overall complexity? Will this affect the performance of long sequences? Can we somehow just compare stop_str with the newly generated token, instead of the whole sequence?
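One way the suggestion above could look, as a minimal sketch rather than the fix that was eventually merged: search only the tail of the output that the newly decoded text can affect. The helper name `find_stop_in_new_text` and its parameters are hypothetical, and the sketch assumes `output_text` already includes `new_text` and that `stop_str` is non-empty.

```python
def find_stop_in_new_text(output_text: str, new_text: str, stop_str: str) -> int:
    """Return the index of stop_str in output_text if a match overlaps the
    newly appended text, otherwise -1.

    Assumes output_text already ends with new_text and stop_str is non-empty.
    """
    # A match that involves the new text can start at most len(stop_str) - 1
    # characters before the new text begins, so only that window is scanned.
    window = len(new_text) + len(stop_str) - 1
    start = max(0, len(output_text) - window)
    return output_text.find(stop_str, start)


# Stop string contained inside a single new token.
idx_single = find_stop_in_new_text('hello world",', '",', '"')

# Stop string spanning two tokens: "E" followed by "ND!".
idx_partial = find_stop_in_new_text("xE", "E", "END")
idx_spanning = find_stop_in_new_text("xEND!", "ND!", "END")
```

Each step then costs time proportional to the length of the new text plus the stop string, not the length of the whole sequence, and the returned index could be used to truncate the output at the stop string.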

claudiosv · 2024-03-13T00:49:20Z

I believe #1724 is a dupe of this one. Would like to see this merged though.

hmellor · 2024-03-28T14:10:18Z

@zacharyblank is this PR still necessary? If yes, do you still plan to get it merged?

njhill · 2024-03-28T18:17:10Z

#3672 is a more complete fix for this.

hmellor · 2024-03-28T18:50:09Z

I'll close this in favour of yours @njhill

fix stop sequence detection

fc84baa

zhuohan123 requested changes Jul 25, 2023

zhuohan123 force-pushed the main branch from 3affdce to 0080d83 Compare August 30, 2023 09:26

geeker-smallwhite mentioned this pull request Mar 13, 2024

fix(stop): fix stop when stop words not in end #3366

Closed

hmellor closed this Mar 28, 2024
