
Models caching does not work (sd_checkpoints_limit) #2176

Open
psydok opened this issue Oct 24, 2024 · 12 comments

Comments

@psydok (Contributor) commented Oct 24, 2024

Tested setting sd_checkpoints_keep_in_cpu: false, sd_checkpoints_limit: 3, sd_checkpoint_cache: 3. Nothing worked. Every request for a new model is long.

@psydok (Contributor, Author) commented Oct 24, 2024

In automatic1111 it worked. Why was it deleted here, but the fields were left?

@Hugs288 commented Oct 24, 2024

I think one of the updates broke model caching. It used to work perfectly, but now, somewhat randomly, if I don't generate for a couple of minutes, or run generate after a hires fix pass, it loads the whole model from disk again.

@s4130 commented Oct 25, 2024

I suspect switching models causes RAM usage to keep increasing, probably because these settings aren't taking effect.

@psydok (Contributor, Author) commented Oct 25, 2024

I also noticed that if you send {"override_settings":{"sd_model_checkpoint": "flux1-dev-bnb-nf4-v2.safetensors", "forge_preset": "flux", "forge_additional_modules": []}} when this model is already the default, Forge still reloads the checkpoint, so inference takes longer than expected.

@altoiddealer (Contributor)

> I also noticed that if you send {"override_settings":{"sd_model_checkpoint": "flux1-dev-bnb-nf4-v2.safetensors", "forge_preset": "flux", "forge_additional_modules": []}} when this model is already the default, Forge still reloads the checkpoint, so inference takes longer than expected.

The way override_settings works is that if a provided value is identical to the currently stored value, it is ignored.

With sd_model_checkpoint, you can "set" the value using a wide variety of accepted "checkpoint aliases". I'm not quite sure at what point this happens, but the value is subsequently changed to the "title" returned by the sd-models API endpoint.

So what is happening is that you are passing the model_name value, which is valid but not equal to the stored title; it is therefore not ignored but applied, and the model params refresh, etc.

I found a way to resolve this... will be pushing a PR soon.
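The normalization described above also suggests a client-side workaround while waiting for a fix: resolve whatever alias you have to the canonical "title" from the sd-models endpoint, and only include sd_model_checkpoint in override_settings when it actually differs from the current value. A minimal sketch, assuming the title/model_name/hash/filename fields of the /sdapi/v1/sd-models response shape; the helper names here are hypothetical:

```python
def resolve_title(requested, sd_models):
    """Map an accepted checkpoint alias (model_name, filename, hash, ...)
    to the canonical 'title' that gets stored in sd_model_checkpoint."""
    for m in sd_models:
        aliases = {m.get("title"), m.get("model_name"),
                   m.get("filename"), m.get("hash")}
        if requested in aliases:
            return m.get("title")
    return requested  # unknown alias: pass through unchanged


def build_override(requested, current_title, sd_models):
    """Include sd_model_checkpoint only when it would actually change,
    so an identical value cannot trigger a needless checkpoint reload."""
    title = resolve_title(requested, sd_models)
    if title == current_title:
        return {}
    return {"sd_model_checkpoint": title}
```

In practice the sd_models list would come from /sdapi/v1/sd-models and current_title from the stored sd_model_checkpoint option before each request.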

@altoiddealer (Contributor) commented Oct 25, 2024

@psydok please check out this PR, which resolves the issue you mentioned in your comment above (not your "main" issue). It works for me; if you get a chance to try it out, please leave a comment there. Thank you.

#2181

@psydok (Contributor, Author) commented Oct 25, 2024

@altoiddealer Okay, I'll look at the PR and test it tomorrow. Thank you for the fix!

UPD: It works! Thanks! But this issue should stay open: I would still like these parameters restored in Forge: sd_checkpoints_keep_in_cpu: false, sd_checkpoints_limit: 3, sd_checkpoint_cache: 3.

@psydok psydok changed the title Model caching does not work Model caching does not work (sd_checkpoints_limit) Oct 27, 2024
@psydok psydok changed the title Model caching does not work (sd_checkpoints_limit) Models caching does not work (sd_checkpoints_limit) Oct 27, 2024
@psydok (Contributor, Author) commented Oct 28, 2024

I found the commit where the breaking changes were made, but the commit message gives no information about why it was done.
@lllyasviel @DenOfEquity Does anyone know whether this was a leftover from debugging, or a mistake that broke something?

@DenOfEquity (Collaborator)

That's a very old commit, from before I was using Forge, possibly even before Forge was public. It probably caused (or had high potential to cause) issues after the backend reworks by complicating memory management, but that's just speculation. Since then the backend has been reworked again, with the Flux update.
There are quite a few relics in the code. A good way to check whether settings are used is to search the repo: sd_checkpoints_keep_in_cpu, sd_checkpoints_limit, and sd_checkpoint_cache are not referenced anywhere, not even in commented-out code.
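The repo search described above can be scripted. This is an illustrative sketch (the function name is mine), equivalent to grepping the repository's Python sources for each dead setting name:

```python
from pathlib import Path

# Settings that appear in the UI but may no longer be read by the backend.
DEAD_SETTINGS = ["sd_checkpoints_keep_in_cpu", "sd_checkpoints_limit",
                 "sd_checkpoint_cache"]


def find_references(repo_root, names):
    """Return {setting_name: [files mentioning it]} for all .py files."""
    hits = {name: [] for name in names}
    for path in Path(repo_root).rglob("*.py"):
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue  # unreadable file: skip it
        for name in names:
            if name in text:
                hits[name].append(str(path))
    return hits
```

An empty list for a setting (outside the file that defines the options UI) is a strong hint that it is a relic.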

@psydok (Contributor, Author) commented Nov 13, 2024

@DenOfEquity Thanks for the explanation!

Another question has formed in my mind. I'm trying to reconstruct the logic, but things have changed a lot in Forge and there are a lot of wrapper classes.
Could you tell me which class should be kept in memory to hold both Flux (~12 GB) and some version of SDXL (~8 GB), for example? The goal is to switch between models quickly. I thought I needed to add the --sd-checkpoint-limit logic to memory_management.py, but I got confused by the number of class reinitializations. They seem to be reinitialized all the time, even when model_data.forge_hash matches (setting it to False doesn't affect anything).
Or maybe the problem is that I'm debugging on a very weak GPU (2 GB).

Which class should be kept, and can it be moved to CPU and back gracefully somehow?
I don't think it will work without global changes...
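For reference, the behaviour being asked about is essentially an LRU cache keyed by checkpoint name. This is a hypothetical sketch of the idea, not Forge code: the to_gpu/to_cpu callbacks stand in for whatever device moves Forge's memory_management.py would actually have to perform:

```python
from collections import OrderedDict


class CheckpointCache:
    """LRU cache in the spirit of the old sd_checkpoints_limit behaviour:
    keep up to `limit` loaded models resident, moving the least recently
    used one to CPU on eviction instead of reloading it from disk later."""

    def __init__(self, limit=3, to_gpu=None, to_cpu=None):
        self.limit = limit
        self.to_gpu = to_gpu or (lambda m: m)  # e.g. model.to("cuda")
        self.to_cpu = to_cpu or (lambda m: m)  # e.g. model.to("cpu")
        self._cache = OrderedDict()  # name -> model, most recent last

    def get(self, name, loader):
        if name in self._cache:
            self._cache.move_to_end(name)      # cache hit: no disk I/O
            return self.to_gpu(self._cache[name])
        model = loader(name)                   # cache miss: load from disk
        self._cache[name] = model
        if len(self._cache) > self.limit:
            _, evicted = self._cache.popitem(last=False)  # drop LRU entry
            self.to_cpu(evicted)
        return self.to_gpu(model)
```

The hard part in Forge would not be the cache itself but identifying which objects (per the comment below, JointTextEncoder, KModel, IntegratedAutoencoderKL) can survive a round trip to CPU without being reinitialized.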

@psydok (Contributor, Author) commented Nov 13, 2024

I noticed that if you add --always-gpu when starting Forge, checkpoint changes don't seem to take as long. I don't understand why, though; the memory manager seems to clear everything anyway.

@DenOfEquity (Collaborator)

I only know what I know from poking around, so my understanding could be completely wrong.
Models are stored in three classes: JointTextEncoder, KModel, and IntegratedAutoencoderKL. The latter two seem to be reused/reinitialised when a new model is loaded. The first doesn't get reused, potentially leading to the memory leak / excess committed-memory problem some users have reported.
I'd say Forge is fundamentally no longer designed to keep multiple models loaded. (With modern models barely fitting into typical consumer hardware anyway, it's likely too much extra complexity for too little value.)
