v1.10.1

@oobabooga oobabooga released this 13 Jul 17:56
· 223 commits to main since this release
0315122

Library updates

  • FlashAttention: bump to v2.6.1. Gemma-2 now works in ExLlamaV2 with FlashAttention without any quality loss.

Bug fixes

  • Fix model load errors on macOS with llama.cpp (#6227). Thanks @InvectorGator.