Hey there!

When passing in a prompt via `--prompt`, the tokenized word piece ids do not seem to match openai/whisper. This leads to the decoder producing garbage output (probably because it receives combinations of token ids it has never seen before).

I think one way to resolve this would be to port the `openai/tiktoken` tokenizer `encode` implementation to `whisper.cpp`.

whisper.cpp:
whisper.cpp/whisper.cpp, lines 2597 to 2623 in 4774d2f
openai/whisper
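
To illustrate how two tokenizers can share a vocabulary and still emit different ids, here is a minimal toy sketch. It assumes, without having confirmed it against the linked code, that the mismatch comes from one side picking the longest matching vocabulary entry while the other applies BPE merges in rank order. The vocabulary `TOY_RANKS` and both function names are hypothetical; the real tiktoken encoder works on bytes and pre-splits the text with a regex, which is omitted here.

```python
# Hypothetical toy vocabulary: token -> id. As in rank-based BPE, a lower
# id also means the merge is applied earlier. Not real Whisper data.
TOY_RANKS = {"a": 0, "b": 1, "c": 2, "bc": 3, "ab": 4}


def bpe_encode(word, ranks):
    """Rank-ordered BPE: repeatedly merge the adjacent pair with the lowest rank."""
    parts = list(word)
    while len(parts) > 1:
        best = None  # (rank, index) of the best mergeable pair so far
        for i in range(len(parts) - 1):
            rank = ranks.get(parts[i] + parts[i + 1])
            if rank is not None and (best is None or rank < best[0]):
                best = (rank, i)
        if best is None:
            break  # no adjacent pair is in the vocabulary
        _, i = best
        parts[i:i + 2] = [parts[i] + parts[i + 1]]
    return [ranks[p] for p in parts]


def greedy_encode(word, ranks):
    """Greedy alternative: always take the longest vocabulary entry at the current position."""
    ids = []
    i = 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in ranks:
                ids.append(ranks[word[i:j]])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary entry matches at position {i}")
    return ids


print(bpe_encode("abc", TOY_RANKS))     # [0, 3]  i.e. "a" + "bc"
print(greedy_encode("abc", TOY_RANKS))  # [4, 2]  i.e. "ab" + "c"
```

Both outputs decode back to the same text, but a decoder trained only on the first segmentation would see the second as an unfamiliar id sequence, which would be consistent with the garbage output described above.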