gguf-py: Refactor and allow reading/modifying existing GGUF files #3981
Merged: monatis merged 33 commits into ggerganov:master from KerfuffleV2:feat-gguf-py-read-refactor on Nov 11, 2023
Changes from 9 commits
Commits (33)
b8c80df gguf-py: Refactor and add file reading support (KerfuffleV2)
8047aa1 Replay changes from #3871 (KerfuffleV2)
d7688dc Various type annotation fixes. (KerfuffleV2)
a6f5742 sort imports with isort (again) (cebtenzzre)
ce865b3 Fix missing return statement in add_tensor (KerfuffleV2)
f364636 style cleanup with flake8 (cebtenzzre)
f2292fc fix NamedTuple and Enum usage (cebtenzzre)
fffdac3 Fix an issue with state init in GGUFReader (KerfuffleV2)
b56ed66 Damagage is not a word. (KerfuffleV2)
4a5cd69 Clean up gguf-py/examples/modify_gguf.py whitespace (KerfuffleV2)
2af29ff Update gguf-py/examples/modify_gguf.py formatting (KerfuffleV2)
855486c Update gguf-py/gguf/gguf_reader.py type hint (KerfuffleV2)
2360aaa Make examples executable, formatting changes (KerfuffleV2)
8e250fe Add more information to GGUFReader and examples comments (KerfuffleV2)
0d0306e Include a gguf Python package version bump (KerfuffleV2)
cc58ad0 Merge branch 'master' into feat-gguf-py-read-refactor (KerfuffleV2)
bca0962 Add convert-gguf-endian.py script (KerfuffleV2)
233cb07 cleanup (cebtenzzre)
5738b2f gguf-py : bump minor version (cebtenzzre)
52bdc7e Reorganize scripts (KerfuffleV2)
a04f048 Make GGUFReader endian detection less arbitrary (KerfuffleV2)
bd241db Add JSON dumping support to gguf-dump.py (KerfuffleV2)
382f975 A few for gguf-dump.py cleanups (KerfuffleV2)
7d3580d Murder accidental tuple in gguf-py/scripts/gguf-dump.py (KerfuffleV2)
5608cd8 cleanup (cebtenzzre)
795dc0f constants : remove unneeded type annotations (cebtenzzre)
a21e9e7 fix python 3.8 compat (cebtenzzre)
eff662d Set up gguf- scripts in pyproject.toml (KerfuffleV2)
0b0e726 And include scripts/__init__.py, derp (KerfuffleV2)
960f912 convert.py: We can't currently support Q8_0 on big endian. (KerfuffleV2)
9ce51b6 gguf-py: SpecialVocab: Always try available sources for special token… (KerfuffleV2)
f22b2f2 cleanup (cebtenzzre)
4814b4b Promote add_X_token to GGUF metadata for BOS and EOS (KerfuffleV2)
@@ -0,0 +1,41 @@
#!/usr/bin/env python3
import sys
from pathlib import Path

# Necessary to load the local gguf package
sys.path.insert(0, str(Path(__file__).parent.parent))

from gguf import GGUFReader, GGUFValueType  # noqa: E402

def dump_gguf(filename: str) -> None:
    print(f'* Loading: {filename}')
    reader = GGUFReader(filename, 'r')
    print(f'\n* Dumping {len(reader.fields)} key/value pair(s)')
    for n, field in enumerate(reader.fields.values(), 1):
        if not field.types:
            pretty_type = 'N/A'
        elif field.types[0] == GGUFValueType.ARRAY:
            nest_count = len(field.types) - 1
            pretty_type = '[' * nest_count + str(field.types[-1].name) + ']' * nest_count
        else:
            pretty_type = str(field.types[-1].name)
        print(f' {n:5}: {pretty_type:10} | {len(field.data):8} | {field.name}', end = '')
        if len(field.types) == 1:
            curr_type = field.types[0]
            if curr_type == GGUFValueType.STRING:
                print(' = {0}'.format(repr(str(bytes(field.parts[-1]), encoding='utf8')[:60])), end = '')
            elif field.types[0] in reader.gguf_scalar_to_np:
                print(' = {0}'.format(field.parts[-1][0]), end = '')
        print()

    print(f'\n* Dumping {len(reader.tensors)} tensor(s)')
    for n, tensor in enumerate(reader.tensors, 1):

        prettydims = ', '.join('{0:5}'.format(d) for d in list(tensor.shape) + [1] * (4 - len(tensor.shape)))
        print(f' {n:5}: {tensor.n_elements:10} | {prettydims} | {tensor.tensor_type.name:7} | {tensor.name}')

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print('dump_gguf: Error: Specify an input file', file = sys.stderr)
        sys.exit(1)
    dump_gguf(sys.argv[1])
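The two formatting steps in the dump loop (bracketed array types and shapes padded to four columns) can be tried without a GGUF file. This is a minimal sketch: `ValueType` is a hypothetical stand-in for `GGUFValueType`, and `pretty_type` and `pad_dims` are illustrative helper names that do not exist in the PR.

```python
from enum import IntEnum
from typing import List, Sequence


class ValueType(IntEnum):
    # Hypothetical stand-in for gguf.GGUFValueType; only the member names matter here.
    UINT32 = 4
    STRING = 8
    ARRAY = 9


def pretty_type(types: Sequence[ValueType]) -> str:
    """Render a field's type list the same way the dump loop does."""
    if not types:
        return 'N/A'
    if types[0] == ValueType.ARRAY:
        # One bracket pair per entry in front of the element type.
        nest_count = len(types) - 1
        return '[' * nest_count + types[-1].name + ']' * nest_count
    return types[-1].name


def pad_dims(shape: Sequence[int], n_dims: int = 4) -> List[int]:
    """Pad a shape with trailing 1s so every tensor row prints n_dims columns."""
    return list(shape) + [1] * (n_dims - len(shape))


print(pretty_type([ValueType.ARRAY, ValueType.STRING]))  # [STRING]
print(pad_dims([4096, 32000]))  # [4096, 32000, 1, 1]
```

Pulling the logic into small functions like this is only for illustration; the script in the diff keeps it inline.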
@@ -0,0 +1,41 @@
#!/usr/bin/env python3
import sys
from pathlib import Path

# Necessary to load the local gguf package
sys.path.insert(0, str(Path(__file__).parent.parent))

from gguf import GGUFReader  # noqa: E402

def change_gguf(reader: GGUFReader, key: str, value: str) -> None:
    field = reader.get_field(key)
    if field is None:
        print(f'! Field {repr(key)} not found', file = sys.stderr)
        sys.exit(1)

    handler = reader.gguf_scalar_to_np.get(field.types[0]) if field.types else None
    if handler is None:
        print(f'! Field {repr(key)} has unsupported type: {field.types}')
        sys.exit(1)
    current_value = field.parts[field.data[0]][0]
    new_value = handler(value)
    print(f'* Preparing to change field {repr(key)} from {current_value} to {new_value}')
    if current_value == new_value:
        print(f'- Key {repr(key)} already set to requested value {current_value}')
        sys.exit(0)
    print('*** Warning *** Warning *** Warning **')
    print('* Changing fields in a GGUF file can damage it. If you are positive then type YES:')
    response = input('YES, I am sure> ')
    if response != 'YES':
        print("You didn't enter YES. Okay then, see ya!")
        sys.exit(0)
    field.parts[field.data[0]][0] = new_value
    print('* Field changed. Successful completion.')

if __name__ == '__main__':
    if len(sys.argv) < 4:
        print('modify_gguf: Error: Missing arguments. Syntax: modify_gguf.py <filename> <key> <value>', file = sys.stderr)
        sys.exit(1)
    print(f'* Loading: {sys.argv[1]}')
    reader = GGUFReader(sys.argv[1], 'r+')
    change_gguf(reader, sys.argv[2], sys.argv[3])
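The script only accepts fields whose type has an entry in `gguf_scalar_to_np`, and it assigns the new value straight into `field.parts`. One plausible reading is that the bytes are overwritten in place, which only works when the new value has the same size as the old one. The sketch below shows that idea with the standard library alone. It assumes the file is memory-mapped for writing (my reading of the `'r+'` mode, not something visible in this diff), and both the file layout and `patch_uint32` are hypothetical.

```python
import mmap
import os
import struct
import tempfile


def patch_uint32(path: str, offset: int, new_value: int) -> int:
    """Overwrite a little-endian uint32 at `offset` in place; return the old value."""
    with open(path, 'r+b') as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            (old_value,) = struct.unpack_from('<I', mm, offset)
            if old_value != new_value:
                # Same width as before, so nothing after this field moves.
                struct.pack_into('<I', mm, offset, new_value)
                mm.flush()
            return old_value


# Hypothetical layout: 4 bytes of filler followed by one uint32 value.
fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'\x00' * 4 + struct.pack('<I', 12))

old = patch_uint32(demo_path, 4, 42)
with open(demo_path, 'rb') as f:
    patched = struct.unpack_from('<I', f.read(), 4)[0]
os.remove(demo_path)
print(old, patched)  # 12 42
```

A string or array field would change the file's length, so it cannot be patched this way; that is consistent with the script rejecting non-scalar types.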
@@ -0,0 +1,37 @@
#!/usr/bin/env python3
import sys
from pathlib import Path

import numpy as np

# Necessary to load the local gguf package
sys.path.insert(0, str(Path(__file__).parent.parent))

from gguf import GGUFWriter  # noqa: E402

# Example usage:
def writer_example() -> None:
    # Example usage with a file
    gguf_writer = GGUFWriter("example.gguf", "llama")

    gguf_writer.add_architecture()
    gguf_writer.add_block_count(12)
    gguf_writer.add_uint32("answer", 42)  # Write a 32-bit integer
    gguf_writer.add_float32("answer_in_float", 42.0)  # Write a 32-bit float
    gguf_writer.add_custom_alignment(64)

    tensor1 = np.ones((32,), dtype=np.float32) * 100.0
    tensor2 = np.ones((64,), dtype=np.float32) * 101.0
    tensor3 = np.ones((96,), dtype=np.float32) * 102.0

    gguf_writer.add_tensor("tensor1", tensor1)
    gguf_writer.add_tensor("tensor2", tensor2)
    gguf_writer.add_tensor("tensor3", tensor3)

    gguf_writer.write_header_to_file()
    gguf_writer.write_kv_data_to_file()
    gguf_writer.write_tensors_to_file()

    gguf_writer.close()

writer_example()
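The writer example sets a custom alignment of 64 and then adds float32 tensors of 32, 64 and 96 elements. Assuming alignment means each tensor's data is padded up to a multiple of the alignment (my understanding of the GGUF layout, not something this diff shows), the sketch below works out how much padding those sizes would need. `padding_for` is a hypothetical helper, not part of `GGUFWriter`.

```python
def padding_for(n_bytes: int, alignment: int) -> int:
    """Bytes of padding needed to round n_bytes up to a multiple of alignment."""
    return (alignment - n_bytes % alignment) % alignment


FLOAT32_SIZE = 4
element_counts = {'tensor1': 32, 'tensor2': 64, 'tensor3': 96}
paddings = {
    name: padding_for(count * FLOAT32_SIZE, 64)
    for name, count in element_counts.items()
}
print(paddings)  # {'tensor1': 0, 'tensor2': 0, 'tensor3': 0}
print(padding_for(100, 64))  # 28
```

Under that assumption the three example tensors (128, 256 and 384 bytes) already sit on 64-byte boundaries, so the example would not exercise any padding path.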
@@ -1 +1,5 @@
-from .gguf import *
+from .constants import *
+from .gguf_reader import *
+from .gguf_writer import *
+from .tensor_mapping import *
+from .vocab import *
There should be two blank lines before and after a top-level function. Same with the other two examples.
Also, the examples should be marked executable - otherwise, the shebang lines don't do anything.
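For reference, a small sketch of what "marked executable" means on a POSIX system: the shebang line is only consulted when the file is run directly, and that requires the execute permission bit. The helper names and the temporary file are illustrative only.

```python
import os
import stat
import tempfile


def is_executable(path: str) -> bool:
    """True if the owner execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def make_executable(path: str) -> None:
    """Add execute permission for user, group and others (like `chmod +x`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Throwaway script with a shebang line, created without the execute bit.
fd, script_path = tempfile.mkstemp(suffix='.py')
with os.fdopen(fd, 'w') as f:
    f.write('#!/usr/bin/env python3\nprint("hi")\n')

before = is_executable(script_path)
make_executable(script_path)
after = is_executable(script_path)
os.remove(script_path)
print(before, after)  # False True
```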
Thanks for all the suggestions (and especially the actual bugs you caught). I really appreciate the time you've spent helping improve this pull!
What are you using for formatting and would you be able to share your configuration? I'd be perfectly happy to turn on Python auto formatting if there's a standard for the Python code in this repo to follow.
The linters I'm using are:
isort **/*.py -l 120 --tc -m VERTICAL_HANGING_INDENT
My flake8 configuration is messy, but I've done
pip install wemake-python-styleguide
and then turned off everything I don't care about. This ridiculous command should reproduce the way I'm using flake8 for llama.cpp (most of this is hidden behind a shell alias):
There is a lot of subjectivity with flake8; even that command leaves some checks enabled that don't really matter IMO. And normally I leave E251 enabled, but the style in this repo seems to use spaces around '=' in keyword arguments.
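To make the E251 point concrete: pycodestyle's E251 flags spaces around `=` in keyword arguments, which is the spelling the example scripts in this PR use (`end = ''`, `file = sys.stderr`). The sketch below shows both spellings side by side; the two calls behave identically, so the check is purely about style. `render` is a hypothetical helper.

```python
import io


def render(use_spaces: bool) -> str:
    """Print the same text with either keyword-argument spelling."""
    buf = io.StringIO()
    if use_spaces:
        # Style used in this PR's examples; E251 would flag this line.
        print('dump', end = '', file = buf)
    else:
        # PEP 8 spelling; passes E251.
        print('dump', end='', file=buf)
    return buf.getvalue()


print(render(True) == render(False))  # True
```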