fix: function [get_reply_from_output_ids]: #5045

Merged
oobabooga merged 3 commits into oobabooga:dev on Dec 23, 2023


zhangningboo
Contributor

Model: Qwen/Qwen-7B-Chat

Fix Error:

Traceback (most recent call last):
  File "/home/user1/fix_bug/text-generation-webui/modules/text_generation.py", line 370, in generate_reply_HF
    new_content = get_reply_from_output_ids(output, state, starting_from=starting_from)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/user1/fix_bug/text-generation-webui/modules/text_generation.py", line 268, in get_reply_from_output_ids
    if (hasattr(shared.tokenizer, 'convert_ids_to_tokens') and len(output_ids) > starting_from and shared.tokenizer.convert_ids_to_tokens(int(output_ids[starting_from])).startswith('▁')) and not reply.startswith(' '):
                                                                                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: startswith first arg must be bytes or a tuple of bytes, not str
Output generated in 0.90 seconds (1.11 tokens/s, 1 tokens, context 58, seed 731022533)
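
The failure above can be reproduced outside the web UI. The following is a minimal sketch, not the project's code: `starts_with_marker` is a hypothetical helper, and the marker is assumed to be the SentencePiece word-boundary character `▁` (U+2581). It shows that calling `startswith` with a `str` argument works on a `str` token but raises the same `TypeError` when the token is a `bytes` object.

```python
MARKER = "\u2581"  # assumed marker: the SentencePiece word-boundary character


def starts_with_marker(token):
    """Naive check that assumes `token` is always a str (hypothetical helper)."""
    return token.startswith(MARKER)


# A str token behaves as expected.
str_result = starts_with_marker("\u2581Hello")

# A bytes token makes bytes.startswith() reject the str argument.
error_message = None
try:
    starts_with_marker(b"Hello")
except TypeError as exc:
    error_message = str(exc)

print(str_result)
print(error_message)
```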

@zhangningboo
Contributor Author

image

@oobabooga
Owner

Is there a simpler way to do it? That triples the number of lines of an otherwise simple function.

@zhangningboo
Contributor Author

Isn’t readability what you pursue when writing code? As an open source project, is it advisable to sacrifice readability in order to reduce the "number of lines" of code?

@oobabooga
Owner

The question is whether the code can be simplified. You also removed the check for the convert_ids_to_tokens method and named a variable tokens when it holds just 1 token.

@oobabooga
Owner

It should be good now. Thank you for the fix, I wasn't aware that tokenizers could return bytes instead of tokens.
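
For readers following along, one way to tolerate a tokenizer that returns bytes is to normalize the first token to `str` before the prefix check. This is a hedged sketch and not necessarily the merged change: `needs_leading_space` is a hypothetical helper, the `▁` marker is assumed, and `errors="replace"` is an added assumption so that a token holding an incomplete UTF-8 sequence does not raise.

```python
MARKER = "\u2581"  # assumed marker: the SentencePiece word-boundary character


def needs_leading_space(first_token, reply):
    """Return True if `reply` should be prefixed with a space.

    `first_token` may be a str or a bytes object, depending on the tokenizer.
    Hypothetical helper for illustration only.
    """
    if isinstance(first_token, bytes):
        # Assumption: replace undecodable bytes instead of raising.
        first_token = first_token.decode("utf-8", errors="replace")
    return first_token.startswith(MARKER) and not reply.startswith(" ")


print(needs_leading_space("\u2581Hello", "Hello"))                  # str token
print(needs_leading_space("\u2581Hello".encode("utf-8"), "Hello"))  # bytes token
print(needs_leading_space(b"Hello", "Hello"))                       # no marker
```

Keeping the `isinstance` check next to the `startswith` call leaves the function short while covering both return types.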

@oobabooga oobabooga merged commit 1b8b61b into oobabooga:dev Dec 23, 2023
@zhangningboo
Contributor Author

🥳
