Unload and reload models on request #471
Conversation
I like the reloading idea, as I've been switching to another model and then returning to the updated model. I'm not sure what the app state would be in an 'unloaded' condition; perhaps we just need the reload implementation?
Well, the state after unloading the checkpoint would be undetermined. One won't be able to generate a response, but the resulting error is not fatal, and one can resume chat text generation once the model is loaded back in; that much I tested. The core idea and the use case are simple: when
It now shows a message in the console when unloading weights. Also, reload_model() calls unload_model() first to free the memory, so that multiple reloads won't overfill it.
In the latest gradio version, there is now a circle icon in dropdown menus that unselects the currently selected option. I have modified the PR to use this button to unload the model from memory. Your buttons were more functional because they allowed the very same model to be reloaded without having to locate it in the dropdown list, but I found that they occupied a lot of space for a very niche feature. It should still be possible to create unload/reload buttons inside an extension.
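Wiring the dropdown's clear button to unloading could look roughly like this. The handler and the loader stub below are hypothetical; the one assumption taken from the comment above is that gradio passes `None` as the dropdown value when the circle icon unselects the current option:

```python
class DummyModel:
    # Stand-in for a loaded checkpoint; the real object is a transformers model.
    def __init__(self, name):
        self.name = name


def load_model(name):
    # Hypothetical loader stub.
    return DummyModel(name)


def handle_model_change(selected_name, current_model):
    """Map the dropdown value to a model object.

    A value of None (the circle icon clearing the selection) is
    treated as an unload request.
    """
    if selected_name is None:
        # Returning None drops the reference to the weights, letting
        # them be garbage-collected so VRAM can be reclaimed.
        return None
    if current_model is not None and current_model.name == selected_name:
        return current_model  # same model already loaded, nothing to do
    return load_model(selected_name)
```

This also illustrates the trade-off mentioned above: since clearing the dropdown discards the selection, reloading the same model requires picking it again from the list, which dedicated unload/reload buttons would avoid.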
That's a nice way to save space!
An important step toward running different neural networks in parallel on the same GPU.
The core idea and the use case are simple: when oobabooga is used alongside other memory hogs like Stable Diffusion (the sd-api-pictures extension) or Tortoise-TTS (not yet implemented), this simple unload function leaves a lot more video memory for those other neural networks to work with. Once they finish their jobs, the LLM can be returned to VRAM. This is the first of the possible improvements to the memory handling discussed in #309.
Tested on my machine: unloading Pyg-2.7B-8bit is almost instant, and loading it back (from the RAM cache) takes ~7 seconds, which I consider an acceptable delay compared to the image generation itself.
Pyg-6B-8bit is a bit slower but still tolerable.