GGUF #3695
Conversation
The conversion script is, unfortunately, not guaranteed to work with every model. I have made renamed packages of llama-cpp-python 0.1.78 and written code to load GGML models through them: jllllll@4a999e3

ctransformers can still load GGML models, but it is slower than llama-cpp-python and has other issues, such as not being able to unload models. It can serve as an alternative solution, but it is far from ideal.

Not sure what TheBloke's plans are as far as converting previous models to GGUF. His latest uploads have been in both GGML and GGUF, so it seems that he still intends to support GGML.
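The loader-side dispatch for this can stay small. A minimal sketch, assuming the renamed 0.1.78 build is importable as `llama_cpp_ggml` (a hypothetical module name; substitute whatever the renamed package actually installs):

```python
# Pick a llama-cpp-python variant based on the model file format.
# `llama_cpp_ggml` is an assumed name for the renamed 0.1.78 build.
from pathlib import Path

def get_llama_cpp_module(model_path):
    """Return the module able to load the given model file."""
    if Path(model_path).suffix.lower() == ".gguf":
        import llama_cpp  # current release, loads GGUF only
        return llama_cpp
    import llama_cpp_ggml  # pinned 0.1.78 build, loads GGML only (assumed name)
    return llama_cpp_ggml

# Both variants expose the same Llama class, so the caller can stay generic:
# Llama = get_llama_cpp_module(path).Llama
# model = Llama(model_path=str(path), n_ctx=2048)
```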
That's a valid solution. The situation is not good at the moment, honestly.

Ideally, I would like to simply convert the 20 GGML models that I have to GGUF and move on, but that may not be possible.
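A rough sketch of what that batch conversion could look like, shelling out to llama.cpp's convert-llama-ggmlv3-to-gguf.py. The paths and the --input/--output flag names are assumptions; the manually supplied per-model values are exactly the part that makes this messy:

```python
# Batch-convert GGML v3 files to GGUF via llama.cpp's conversion script.
# Verify flag names with `python convert-llama-ggmlv3-to-gguf.py --help`.
import subprocess
from pathlib import Path

MODELS_DIR = Path("models")  # hypothetical location of the GGML files
CONVERTER = Path("llama.cpp/convert-llama-ggmlv3-to-gguf.py")  # assumed path

for ggml_path in sorted(MODELS_DIR.glob("*ggmlv3*.bin")):
    gguf_path = ggml_path.with_suffix(".gguf")
    cmd = ["python", str(CONVERTER),
           "--input", str(ggml_path),
           "--output", str(gguf_path)]
    # Values that GGML files never stored must be set by hand per model,
    # e.g. GQA and RMS-norm epsilon for LLaMA-2 70B:
    # cmd += ["--gqa", "8", "--eps", "1e-5"]
    subprocess.run(cmd, check=True)
```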
It seems that
I have made a PR with the commit linked before to be merged into the
Yes, when llama.cpp merged GGUF they removed GGML support.

There are more GGUF models now: https://huggingface.co/models?pipeline_tag=text-generation&sort=trending&search=gguf
Use separate llama-cpp-python packages for GGML support
Yes, I have tested this and confirmed that the hardcoded values are used no matter what value you supply. I think that your idea of moving

@berkut1 thank you. The next step will be to parse the GGUF metadata and use that information in the UI (I haven't found how to read a GGUF file in Python yet).
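For what it's worth, the GGUF header and key-value section are simple enough to parse with nothing but the standard library. A minimal sketch based on the published GGUF layout (assuming the v2 on-disk format with 64-bit counts and string lengths; the initial v1 used 32-bit, so check the version field):

```python
# Minimal pure-Python GGUF metadata reader. Only the header and the
# key-value section are parsed; tensor data is never touched.
import struct

def _read_string(f):
    (length,) = struct.unpack("<Q", f.read(8))
    return f.read(length).decode("utf-8")

def _read_value(f, vtype):
    scalars = {
        0: "<B", 1: "<b", 2: "<H", 3: "<h",   # uint8, int8, uint16, int16
        4: "<I", 5: "<i", 6: "<f", 7: "<?",   # uint32, int32, float32, bool
        10: "<Q", 11: "<q", 12: "<d",         # uint64, int64, float64
    }
    if vtype in scalars:
        fmt = scalars[vtype]
        (value,) = struct.unpack(fmt, f.read(struct.calcsize(fmt)))
        return value
    if vtype == 8:  # string
        return _read_string(f)
    if vtype == 9:  # array: element type, count, then the elements
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        return [_read_value(f, etype) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")

def read_gguf_metadata(path):
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            raise ValueError("not a GGUF file")
        (version,) = struct.unpack("<I", f.read(4))  # read but unused here
        _n_tensors, n_kv = struct.unpack("<QQ", f.read(16))
        metadata = {}
        for _ in range(n_kv):
            key = _read_string(f)
            (vtype,) = struct.unpack("<I", f.read(4))
            metadata[key] = _read_value(f, vtype)
        return metadata

# Usage:
# meta = read_gguf_metadata("model.gguf")
# print(meta.get("general.name"), meta.get("llama.context_length"))
```

Keys such as `general.name` and `llama.context_length` could then feed the UI defaults directly, instead of hardcoding per-model values.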
@oobabooga looks like llama-cpp-python just added GGUF support. Maybe, as a temporary solution, we can just read the console output lines that llama.cpp prints while loading a model. For example, llama.cpp has some methods: https://github.com/ggerganov/llama.cpp/blob/789c8c945a2814e1487e18e68823d9926e3b1454/ggml.h#L1856
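If the console-scraping route is taken, note that llama.cpp writes its load-time log to the C-level stderr, which Python's usual redirection does not catch; a file-descriptor swap does. A rough sketch; the "prefix: key = value" line format matched below is an assumption about what the loader prints, not a documented interface:

```python
# Capture llama.cpp's C-level stderr during model load, then pull
# "key = value" pairs out of it. Adjust the regex to whatever your
# llama.cpp build actually prints.
import os
import re
import tempfile

def capture_load_log(load_fn):
    """Run load_fn() with file descriptor 2 redirected into a temp file."""
    saved_fd = os.dup(2)
    with tempfile.TemporaryFile() as tmp:
        os.dup2(tmp.fileno(), 2)
        try:
            result = load_fn()
        finally:
            os.dup2(saved_fd, 2)
            os.close(saved_fd)
        tmp.seek(0)
        log = tmp.read().decode("utf-8", errors="replace")
    return result, log

def parse_load_log(log):
    # Matches lines like "llama_model_load_internal: n_ctx = 2048"
    pairs = re.findall(r"^\w+:\s+(\w+)\s*=\s*(\S+)\s*$", log, flags=re.MULTILINE)
    return dict(pairs)

# Usage (hypothetical):
# model, log = capture_load_log(lambda: Llama(model_path="model.gguf"))
# print(parse_load_log(log).get("n_ctx"))
```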
Updates llama-cpp-python and deprecates GGML in favor of the new GGUF format.

The conversion from old GGML to GGUF through convert-llama-ggmlv3-to-gguf.py is not automatic (some command-line flags have to be set manually), so this will be quite messy.

Adds GGUF support while keeping GGML support thanks to @jllllll.