quantize: add imatrix and dataset metadata in GGUF #6658
quantize: factorize KV Overrides parsing between common #6656
…pile on some toolchain
We might also add the number of chunks the imatrix was computed with.
@ggerganov, is this general approach relevant?
common: free kv override if used after model loading
…ntize/imatrix-metadata
@slaren, can you please take a second look and merge it if approved?
I also realized that `llama_model_quantize_params::kv_overrides` is a pointer to a `std::vector` for no reason whatsoever. It would be great if that could be fixed as well.
…ed from a pair of iterators. Co-authored-by: slaren <[email protected]>
We still need to change `llama_model_quantize_params::kv_overrides` to be a pointer to `llama_model_kv_override` rather than a `std::vector`, but that can be done in another PR.
While I appreciate adding this metadata, I think there is a privacy concern here. How about storing only the filename and not the complete path, which might leak sensitive data such as the username?
Good point. Meanwhile, you can use kv overrides.
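To illustrate the filename-only suggestion, here is a minimal sketch of stripping the directory part of a dataset path before it is stored. The helper name `dataset_basename` and the example path are hypothetical and not taken from this PR; the sketch assumes C++17 and `std::filesystem`.

```cpp
#include <filesystem>
#include <string>

// Hypothetical helper, not part of llama.cpp: keep only the final path
// component so that directory names (and any username they contain)
// are not written into the model metadata.
static std::string dataset_basename(const std::string & path) {
    return std::filesystem::path(path).filename().string();
}
```

For a path such as `/home/alice/data/wiki.train.raw`, only `wiki.train.raw` would be kept. A path that ends in a directory separator yields an empty string, so a caller would probably want to handle that case separately.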
* imatrix: save the dataset file used in the output file
* llama: support kv overrides type string string
* common: factorize KV Overrides parsing between common and server
* quantize: add imatrix n entries and dataset KV metadata
  quantize: factorize KV Overrides parsing between common ggerganov#6656
* llama: remove kv override str_value initialization as it does not compile on some toolchain
* quantize: add imatrix m_last_call as `quantize.imatrix.chunks_count`
* quantize: add imatrix filename in KV
* llama: add llama_model_kv_override_free
* common: add llama_model_kv_override_free
  common: free kv override if used after model loading
* llama: finally move the string KV override value to the stack
* llama : minor
* no need to add a NUL to the std::vector, std::string can be initialized from a pair of iterators.
  Co-authored-by: slaren <[email protected]>
* kv override: ensure string termination

---------

Co-authored-by: Georgi Gerganov <[email protected]>
Co-authored-by: slaren <[email protected]>
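The commit note about not needing a NUL in the `std::vector` can be illustrated with a small sketch. It relies only on the standard `std::string` constructor that takes a pair of iterators; the function name `string_from_buffer` is hypothetical and is not the code from this PR.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: build a std::string from a character buffer that
// carries no trailing NUL. The iterator-pair constructor copies exactly
// [begin, end), so no terminator has to be appended to the vector first.
static std::string string_from_buffer(const std::vector<char> & buf) {
    return std::string(buf.begin(), buf.end());
}
```

The resulting string has the same length as the buffer, and `std::string` itself guarantees that `c_str()` is NUL-terminated.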
Context
In the context of quantize: add imatrix and dataset metadata in GGUF #6656.
Add imatrix-related metadata to quantized models.
Changes
Tests
Closes #6656