Prodigy is not working well with Stable Diffusion 3.5 Medium training #27

Bocchi-Chan2023 · 2024-11-04T17:46:18Z

I have been trying to train with the stock settings of this optimizer and have not been successful yet. Specifically, it seems that it is not learning nearly as well as it should.

By comparison, AdamW8bit seems to work with an lr of 4e-4.

LoganBooker · 2024-11-05T12:14:12Z

Hey Bocchi-Chan2023, are you able to take note of the value of d during training? If it rises very slowly or not at all, you might need to bump up d0 (to say, 1e-5 or 1e-4). I've found sometimes Prodigy just needs to get a larger read of the gradients to start with, otherwise it can take quite a few steps before it finds a good LR, by which point you're already a good portion through training.
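As a rough sketch of how d could be tracked: the helper below reads `d` from each entry of `optimizer.param_groups`, which is where the reference prodigyopt implementation appears to keep it (treat that key as an assumption for other forks). `read_d`, `d_is_stalled`, and the `min_growth` threshold are hypothetical names and values made up for illustration, and a stand-in object replaces a real optimizer so the snippet runs without any training setup.

```python
from types import SimpleNamespace


def read_d(optimizer):
    """Return the current d estimate for each parameter group.

    Assumes the optimizer stores `d` in every param_groups entry;
    groups without that key yield None.
    """
    return [group.get("d") for group in optimizer.param_groups]


def d_is_stalled(d_history, d0, min_growth=10.0):
    """Heuristic only: d counts as stalled if it never reached
    min_growth * d0. The factor of 10 is an arbitrary illustration.
    d_history must contain at least one value.
    """
    return max(d_history) < d0 * min_growth


# Stand-in for a real optimizer, so the sketch runs on its own.
fake_opt = SimpleNamespace(param_groups=[{"lr": 1.0, "d": 1e-6}])

history = []
for step in range(3):
    # In real training, optimizer.step() would run here and may update d.
    history.append(read_d(fake_opt)[0])

print(history, d_is_stalled(history, d0=1e-6))
```

In a real run, `read_d(optimizer)` would be called every N steps and sent to whatever logger is in use. If the values stay near d0 for a large share of training, that is the cue to try a larger d0.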

For example, here are the results of an SDXL LoRA training, batch 8, with a modified Prodigy that treats each parameter group independently (#20). From left to right, the graphs show the d value for TE1, TE2 and the Unet.

As you can see, both TEs hit a good LR quickly, but the Unet took until steps 200-300 to find a decent LR, and even then it continued to search. I've been experimenting with ways to combat this but haven't been successful so far.

Also double check you're setting the regular LRs to 1 (as the LR is multiplied by d), and I'd also suggest using betas of (0.9,0.99) if you're not already (as suggested here: #8 (comment)). If beta3 is not set explicitly, then beta2 ** 0.5 is used in its place, so beta2 affects more than just the second moment.
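To make the settings above concrete, here is a minimal sketch. The keyword names `lr`, `betas`, and `d0` match the prodigyopt constructor as far as I know, but check them against your installed version. `effective_beta3` and `effective_lr` are hypothetical helpers that only restate the two rules from this comment; they are not part of any library.

```python
import math


def effective_beta3(betas, beta3=None):
    """beta3 as described above: beta2 ** 0.5 unless set explicitly."""
    return math.sqrt(betas[1]) if beta3 is None else beta3


def effective_lr(lr, d):
    """The step size is lr multiplied by d, which is why lr stays at 1.0."""
    return lr * d


# Settings suggested in this thread (d0 raised above the value that stalled).
prodigy_kwargs = {"lr": 1.0, "betas": (0.9, 0.99), "d0": 1e-5}

# Assumed usage if prodigyopt is installed; `model` is a placeholder:
# from prodigyopt import Prodigy
# optimizer = Prodigy(model.parameters(), **prodigy_kwargs)

print(effective_beta3(prodigy_kwargs["betas"]))
print(effective_lr(prodigy_kwargs["lr"], prodigy_kwargs["d0"]))
```

Lowering beta2 from 0.999 to 0.99 therefore also lowers the implicit beta3, which is the "affects more than just the second moment" point.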

Not sure if any of this will help, but sharing my experiences while playing around with the internals.

Bocchi-Chan2023 · 2024-11-05T12:16:39Z

Okay, I will try to record the value of d while adjusting the value of d0

Bocchi-Chan2023 · 2024-11-06T00:21:51Z

I changed d0 from 1e-6 to 1e-5 and Prodigy seems to be working well!

konstmish · 2024-11-12T15:29:08Z

Thanks for sharing your experience, and special thanks to @LoganBooker for providing a solution to the problem. Since the problem seems to be solved, I'm closing the issue, but feel free to reopen it if you have more questions. I'll also add a comment on changing d0 in the readme with a link to this discussion.

konstmish closed this as completed Nov 12, 2024

LoganBooker mentioned this issue Nov 15, 2024

fused back pass LoganBooker/prodigy-plus-schedule-free#1

Closed
