Added Script To Upgrade llamafile Archives #412
Merged
Conversation
Version Different, repack:

$ llamafile-upgrade-engine mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
== Engine Version Check ==
Engine version from mistral-7b-instruct-v0.1-Q4_K_M-server: llamafile v0.4.1
Engine version from /usr/local/bin/llamafile: llamafile v0.8.4
== Repackaging / Upgrading ==
extracting...
Archive: mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
inflating: /tmp/tmp.FtvmAfSWty/.symtab.amd64
inflating: /tmp/tmp.FtvmAfSWty/.symtab.arm64
inflating: /tmp/tmp.FtvmAfSWty/llamafile/compcap.cu
inflating: /tmp/tmp.FtvmAfSWty/llamafile/llamafile.h
inflating: /tmp/tmp.FtvmAfSWty/llamafile/tinyblas.cu
inflating: /tmp/tmp.FtvmAfSWty/llamafile/tinyblas.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-alloc.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-backend-impl.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-backend.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-cuda.cu
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-cuda.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-impl.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-metal.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-metal.m
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-metal.metal
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml-quants.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/ggml.h
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/completion.js
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/index.html
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/index.js
inflating: /tmp/tmp.FtvmAfSWty/llama.cpp/server/public/json-schema-to-grammar.mjs
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Anchorage
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Beijing
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Berlin
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Boulder
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Chicago
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/GMT
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/GST
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Honolulu
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Israel
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Japan
extracting: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/London
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/Melbourne
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/New_York
inflating: /tmp/tmp.FtvmAfSWty/usr/share/zoneinfo/UTC
extracting: /tmp/tmp.FtvmAfSWty/.cosmo
extracting: /tmp/tmp.FtvmAfSWty/.args
extracting: /tmp/tmp.FtvmAfSWty/mistral-7b-instruct-v0.1.Q4_K_M.gguf
extracting: /tmp/tmp.FtvmAfSWty/ggml-cuda.dll
repackaging...
== Completed ==
Original File: mistral-7b-instruct-v0.1-Q4_K_M-server.llamafile
Upgraded File: mistral-7b-instruct-v0.1-Q4_K_M-server.updated.llamafile

Version same:

$ llamafile-upgrade-engine test.llamafile
== Engine Version Check ==
Engine version from test: llamafile v0.8.4
Engine version from /usr/local/bin/llamafile: llamafile v0.8.4
Upgrade not required. Exiting...

Help Message:

$ llamafile-upgrade-engine --help
Usage: llamafile-upgrade-engine [OPTION]... <old> (new)
Upgrade llamafile archives.
Options:
-h, --help display this help and exit
-f, --force skip version check
-v, --verbose verbose mode
Arguments:
<old> the name of the old llamafile archive to be upgraded
(new) the name of the new llamafile archive to be created
if not defined output will be <old>.updated.llamafile
Example:
llamafile-upgrade-engine old.llamafile new.llamafile
This command will upgrade old.llamafile to a new llamafile named new.llamafile.
When you run this program, it's recommended that you've
downloaded or installed an official llamafile-VERSION.zip
from https://github.com/Mozilla-Ocho/llamafile/releases
because they include prebuilt DLLs for CUDA and ROCm.
You can verify your llamafile has them w/ unzip -vl
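As a quick usage note: per the help text, the output name is optional and -f skips the version check, so a forced upgrade to an explicit output name looks like this (file names are illustrative):

$ llamafile-upgrade-engine -f old.llamafile new.llamafile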
@jart This is ready now. I recall you also mentioned build system hooks, but I'm not sure that's relevant here. (edit: Ah, I see what you mean. Added another commit.)
jart approved these changes on May 13, 2024:
Looks good. Thank you!
Context: #411
Porting https://briankhuu.com/blog/2024/04/06/inplace-upgrading-of-llamafiles-engine-bash-script/ to llamafile.
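For readers who don't want to click through, the ported flow boils down to roughly the following. This is a minimal sketch under stated assumptions (llamafile's zipalign tool and a system unzip on PATH, the usual --version banner, illustrative file names), not the shipped script's verbatim commands:

# Minimal sketch of the upgrade flow; the real script handles more cases.
old=old.llamafile                          # hypothetical input name
new=${old%.llamafile}.updated.llamafile
tmp=$(mktemp -d)

# 1. Version check: compare the archive's embedded engine against
#    the installed engine (both print a version banner).
chmod +x "$old"
"./$old" --version
llamafile --version

# 2. Extract the archive payload (weights, .args, prebuilt DLLs)
#    into a scratch directory, as in the transcript above.
unzip "$old" -d "$tmp"

# 3. Repack: start from a copy of the new engine binary, then append
#    the payload with zipalign (-j0 stores the weights uncompressed).
cp "$(command -v llamafile)" "$new"
(cd "$tmp" && zipalign -j0 "$OLDPWD/$new" ./*.gguf .args)

# 4. Verify the prebuilt CUDA/ROCm DLLs came along (cf. the help text).
unzip -vl "$new" | grep -i dll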