Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target clean now removes legacy binary names #8257

Merged
merged 1 commit into from
Jul 2, 2024

Conversation

HanClinto
Copy link
Collaborator

@HanClinto HanClinto commented Jul 2, 2024

Added step to clean target to remove legacy binary names to reduce confusion reported by @crashr, @mirekphd, @oldmanjk, and others when upgrading after #7809.

I don't think this will take care of everything, but maybe this would help a bit -- what do y'all think?

@HanClinto HanClinto force-pushed the clean-legacy-binaries branch 2 times, most recently from d4b48fb to 76eefa5 Compare July 2, 2024 16:19
@HanClinto HanClinto merged commit 3e2618b into ggerganov:master Jul 2, 2024
49 of 52 checks passed
@HanClinto HanClinto deleted the clean-legacy-binaries branch July 2, 2024 17:20
@HanClinto HanClinto restored the clean-legacy-binaries branch July 2, 2024 17:40
@oldgithubman
Copy link

Added step to clean target to remove legacy binary names to reduce confusion reported by @crashr, @mirekphd, @oldmanjk, and others when upgrading after #7809.

I don't think this will take care of everything, but maybe this would help a bit -- what do y'all think?

Thanks for your quick response! I think this is a good step in the right direction. Eventually, maybe an official install/update script would be wise? Something along the lines of ooba?

@HanClinto
Copy link
Collaborator Author

Thanks for your quick response! I think this is a good step in the right direction.

Good to hear, thank you!

Would you be willing to share more details about how you deploy llama.cpp (do you use published Docker images, do you build locally, etc), and the particulars of how the binary name change threw you for a loop (and were difficult to catch / correct)?

Thank you!

Eventually, maybe an official install/update script would be wise? Something along the lines of ooba?

Just to be clear, you're talking about something like the install script that oobabooga has?

@oldgithubman
Copy link

Good to hear, thank you!

You're welcome and thank you!

Would you be willing to share more details about how you deploy llama.cpp (do you use published Docker images, do you build locally, etc), and the particulars of how the binary name change threw you for a loop (and were difficult to catch / correct)?

Happily. Also, please don't hesitate to ask me for any other help you might need. Contrary to popular belief, I'm here to help.
I build llama.cpp locally. I made a script that I run every day before I start working (more of a command, I suppose):

#! /usr/bin/bash

cd ~/llama.cpp ; conda activate llama.cpp ; export GGML_CUDA=1 ; git stash ; git pull ; make -j 32 ; conda update --all ; python convert-hf-to-gguf-update.py [API KEY]

I don't really know what I'm doing, so criticism is appreciated.
The binary name change threw me for a loop because I didn't know about it. I make high-quality quants of bleeding-edge models, for example, the full-sized deepseek chat/coder v2 at f32 (since bf16 cuda isn't supported yet). I start by making high-quality imatrices. For these size models (almost a terabyte), it takes about four days to create an imatrix the way I do it. I try to stay as up-to-date as I can on llama.cpp development (very difficult and the readme is rarely a good source) so I know when I can start making quants that work. So when I read everything's working, I get to work. The problem with this update is I've unknowingly been doing all this work using outdated binaries that don't have the fixes needed and wondering why results are bad, deleting work (many days worth), and writing models off. I only found out when I got a gemma2 not supported error and I knew that wasn't right.

Just to be clear, you're talking about something like the install script that oobabooga has?

That's the initial install script and launcher. An install script might be useful, but I'm referring to the update scripts, like update_wizard_linux.sh, which, presumably, could be designed to avoid a situation like this. There are many possible ways to handle these changes better that wouldn't screw people over. The devs just need to start taking these things more seriously (which it seems like you are and I appreciate)

@HanClinto
Copy link
Collaborator Author

HanClinto commented Jul 3, 2024

Happily. Also, please don't hesitate to ask me for any other help you might need. Contrary to popular belief, I'm here to help.

Wonderful, thank you!

I build llama.cpp locally. I made a script that I run every day before I start working (more of a command, I suppose):

#! /usr/bin/bash

cd ~/llama.cpp ; conda activate llama.cpp ; export GGML_CUDA=1 ; git stash ; git pull ; make -j 32 ; conda update --all ; python convert-hf-to-gguf-update.py [API KEY]

I don't really know what I'm doing, so criticism is appreciated.

This is INCREDIBLY helpful to have included, thank you!

I don't have too many suggestions. git stash is a nice way to clear local changes, and it keeps any changes in your history so that you can recover any unsaved changes you may have.

You could potentially add a step to make clean right before you call git stash, but that's going to cut two ways -- it will help ensure that you always have a clean build, but it will also reduce any benefit you might gain from caching, so your binaries will take a bit longer to generate. Not sure which way is best for you.

The binary name change threw me for a loop because I didn't know about it. I make high-quality quants of bleeding-edge models, for example, the full-sized deepseek chat/coder v2 at f32 (since bf16 cuda isn't supported yet). I start by making high-quality imatrices. For these size models (almost a terabyte), it takes about four days to create an imatrix the way I do it. I try to stay as up-to-date as I can on llama.cpp development (very difficult and the readme is rarely a good source) so I know when I can start making quants that work. So when I read everything's working, I get to work. The problem with this update is I've unknowingly been doing all this work using outdated binaries that don't have the fixes needed and wondering why results are bad, deleting work (many days worth), and writing models off. I only found out when I got a gemma2 not supported error and I knew that wasn't right.

Thanks for explaining this. This was very painful, and I agree with you -- Llama.cpp development moves super quickly, and it's incredibly difficult to keep on top of things. I used to subscribe to all PRs, but I stopped that a while ago, and I don't even try to keep up with everything anymore. It's too much. I don't know how Georgi does it as well as he does.

That said, the speed and willingness to break things is a philosophical choice that cuts both ways. On the one hand, it has freed up the project to accomplish a great many things and remain at the bleeding edge far longer than many other projects. But it's a barrier to adoption in many circles (as you've noted), and I don't know where the best balance to strike is.

I think that there would be space for a volunteer to write a blog / newsletter cataloging the latest updates every X weeks or so - that could be a helpful way for someone to contribute. I know I'm not the person for that, but it might help in these sorts of cases.

But all that said, I'm still interested in reducing pain for these changes as much as we can.

I liked when you said that this PR was a good first step, but it's clear from your explanation that this wouldn't have been enough, and we need to do more. Off the top of my head, I can think of two main approaches we could try.

  1. The first would be to add symlinks / shortcuts to redirect from old binary names to new binary names, so that the name updates are all 100% transparent.

This is not an ideal situation, because it would discourage migration.

  1. The second option is to add a minimalistic program that builds to the old binary names, but simply outputs a friendly error message that says that the binary names have changed, and to please update scripts and routines to use the new filenames.

The second option is only slightly more work, but I think it might be the best everyone's purposes. If there are any suggestions for a third option, I'd love to consider them.

Just to be clear, you're talking about something like the install script that oobabooga has?

That's the initial install script and launcher. An install script might be useful, but I'm referring to the update scripts, like update_wizard_linux.sh, which, presumably, could be designed to avoid a situation like this. There are many possible ways to handle these changes better that wouldn't screw people over. The devs just need to start taking these things more seriously (which it seems like you are and I appreciate)

I guess something like this might be that third option, though I haven't quite wrapped my head about how this would have helped in a binary name change sort of situation. If we add an install script, then we've changed the paradigm for what people should run to update to the newest version, and they could miss this new script as easily as they missed the binary name changes.

Thanks again for your report and for thinking through this with me! I especially appreciate what you're doing with releasing quants of bleeding edge models -- that's a really valuable service!

@oldgithubman
Copy link

Wonderful, thank you!

You're welcome. Thank you!

This is INCREDIBLY helpful to have included, thank you!

You're welcome and thank you again!

You could potentially add a step to make clean right before you call git stash, but that's going to cut two ways -- it will help ensure that you always have a clean build, but it will also reduce any benefit you might gain from caching, so your binaries will take a bit longer to generate. Not sure which way is best for you.

This is incredibly helpful, thank you! I looked up "make clean" and for the benefit of anyone reading this, this is what I found:
"'make clean' at the command line gets rid of your object and executable files. Sometimes the compiler will link or compile files incorrectly and the only way to get a fresh start is to remove all the object and executable files." (quote slightly edited for readability)
Suggestion implemented. I'd much rather have a clean build that possibly saves me days than saving a few minutes to lose days.

Thanks for explaining this.

You're welcome and thank you.

I used to subscribe to all PRs

I subscribe to a bunch of them, but far from all. I seriously doubt the majority of the userbase is as involved as I am. That's why when something hurts me, I understand there's a very good chance the average user got hurt too. This is actually what frustrates me more than anything. It's like I can feel the negative impact on the community as a whole, if that makes sense.

I don't even try to keep up with everything anymore. It's too much.

It truly is more than any single person can reasonably be expected to keep up with. This makes some of the devs' attitudes even more frustrating. If I can't keep up, most people won't be able to either, for many reasons.

I don't know how Georgi does it as well as he does.

Me neither. Some people are just super geniuses with everything else it takes.

That said, the speed and willingness to break things is a philosophical choice that cuts both ways.

This is not lost on me. I think people understand when breaking things is called for. I also know there are many ways to lessen, or even eliminate, the impact these breaks have on the users (and other devs, to be fair). In my opinion (and the opinion of many others, judging by the general sentiment I pick up on reddit), llama.cpp takes it too far too often. Ooba had the same reputation for a while and the end result is the userbase going elsewhere. Users matter.

But it's a barrier to adoption in many circles (as you've noted), and I don't know where the best balance to strike is.

Barriers to adoption are different. If you're upfront about what to expect from a project, no one should get hurt. The problem is when you make it seem like something is easy to use, establish a large userbase (in this case, llama.cpp is basically the de facto standard), and then rug pull. If you want my opinion, I think the main thing that needs to change is attitude. The elitism. Aside from that, I think breaking changes should be handled in a way that, ideally, doesn't actually break things. Usually, that's possible. But the devs too often just don't care. We have decades of software development examples to learn from and I'd argue most of these problems have been long solved. Most things don't need to break these days. And I do mean that while keeping the current pace of progress. Perhaps things might go 5% slower as devs need time to come up with ways to mitigate impact. Fortunately, I think there are solutions, that once implemented, can be effectively permanent and not require further effort.

I think that there would be space for a volunteer to write a blog / newsletter cataloging the latest updates every X weeks or so - that could be a helpful way for someone to contribute. I know I'm not the person for that, but it might help in these sorts of cases.

Respectfully, I don't think that's the right approach. I mean, the readme didn't work.

But all that said, I'm still interested in reducing pain for these changes as much as we can.

And that makes you better than most of the devs I've interacted with.

I liked when you said that this PR was a good first step, but it's clear from your explanation that this wouldn't have been enough, and we need to do more.

I guess what I was really saying is that your attitude is the necessary first step, and it's a huge one. You can't fix a problem if you can't even acknowledge you have one. I don't think you need to do anything, but it would be appreciated.

  1. The first would be to add symlinks / shortcuts to redirect from old binary names to new binary names, so that the name updates are all 100% transparent.

This is not an ideal situation, because it would discourage migration.

Good idea. There are ways to encourage migration that are compatible with this.

  1. The second option is to add a minimalistic program that builds to the old binary names, but simply outputs a friendly error message that says that the binary names have changed, and to please update scripts and routines to use the new filenames.

Another good idea.

I guess something like this might be that third option, though I haven't quite wrapped my head about how this would have helped in a binary name change sort of situation.

I don't know the specifics, but I'm sure it can be done. I think looking to other projects that have solved this problem is a good approach.

If we add an install script, then we've changed the paradigm for what people should run to update to the newest version, and they could miss this new script as easily as they missed the binary name changes.

A (hopefully) one-time change that solves most future changes. You could maybe put notices in the current scripts. I think that's where you'll reach the most people. In general, the program itself, I believe, is the best place for notifications.

Thanks again for your report and for thinking through this with me! I especially appreciate what you're doing with releasing quants of bleeding edge models -- that's a really valuable service!

I haven't released any quants, although I've considered it. I haven't been confident enough to release anything publicly yet. Considering this past month of quants were all untrustworthy due to the currently discussed issue, I think my instincts are correct. Once I'm confident, my upload bandwidth might be a barrier (20 Mbps).
You're welcome and thanks again for everything! You're one of the good ones

@mofosyne mofosyne added the Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix label Jul 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Review Complexity : Low Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants