You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
This commit was created on GitHub.com and signed with GitHub’s verified signature.
Backend updates
4-bit and 8-bit kv cache options have been added to llama.cpp and llamacpp_HF. They reuse the existing --cache_8bit and --cache_4bit flags. Thanks @GodEmperor785 for figuring out what values to pass to llama-cpp-python.
Transformers:
Add eager attention option to make Gemma-2 work correctly (#6188). Thanks @GralchemOz.
Automatically detect bfloat16/float16 precision when loading models in 16-bit precision.
Automatically apply eager attention to models with Gemma2ForCausalLM architecture.
Gemma-2 support: Automatically detect and apply the optimal settings for this model with the two changes above. No need to set --bf16 --use_eager_attention manually.
Automatically obtain the EOT token from Jinja2 templates and add it to the stopping strings, fixing Llama-3-Instruct not stopping. No need to add <eot> to the custom stopping strings anymore.
UI updates
Whisper STT overhaul: this extension has been rewritten, replacing the Gradio microphone component with a custom microphone element that is much more reliable (#6194). Thanks @RandomInternetPreson, @TimStrauven, and @mamei16.
Make the character dropdown menu coexist in the "Chat" tab and the "Parameters > Character" tab, after some people pointed out that moving it entirely to the Chat tab makes it harder to edit characters.
Colors in the light theme have been improved, making it a bit more aesthetic.
Increase the chat area on mobile devices.
Bug fixes
Fix the API request to AUTOMATIC1111 in the sd-api-pictures extension.
Fix a glitch when switching tabs with "Show controls" unchecked in the chat tab and extensions loaded.
Library updates
llama-cpp-python: bump to 0.2.81 (adds Gemma-2 support).
Transformers: bump to 4.42 (adds Gemma-2 support).