Any plans to update models and their quantizations? #44

Open
Calandiel opened this issue Apr 21, 2023 · 0 comments

Comments

@Calandiel

ggml now supports Q1_O quantization, which reportedly offers better inference quality for some models at the cost of slower execution. Meanwhile, Open Assistant has released newer weights for its Pythia-based model than the ones currently being pulled.
Perhaps it would be worth updating the model on Hugging Face using the new quantization method?
I would make a PR for this myself, but I don't have access to a GPU with enough RAM to quantize the 12B model.
