v1.1.1 - bring on the potato models
*Trained with NF4 via PagedLion8Bit.*
- New custom timestep distribution for Flux via `--flux_use_beta_schedule`, `--flux_beta_schedule_alpha`, and `--flux_beta_schedule_beta` (#1023); see the sampling sketch after this list
- The trendy AdEMAMix and its 8-bit and paged counterparts are now available as `bnb-ademamix`, `bnb-ademamix-8bit`, and `bnb-ademamix8bit-paged`
- All low-bit optimisers from Bits n Bytes are now included for NVIDIA and ROCm systems
- NF4 training on NVIDIA systems now fits in as little as 9090M total VRAM using Lion8Bit, with 512px training running at 1.5 sec/iter on a 4090; a sketch of the underlying pieces follows this list
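
A minimal sketch of what a Beta-distributed timestep schedule looks like, assuming timesteps normalized to (0, 1). The helper name `sample_beta_timesteps` and the defaults are illustrative, not SimpleTuner's internal API; the `alpha`/`beta` parameters play the role of `--flux_beta_schedule_alpha` and `--flux_beta_schedule_beta`:

```python
import torch

def sample_beta_timesteps(batch_size: int, alpha: float = 2.0, beta: float = 2.0) -> torch.Tensor:
    """Draw normalized timesteps in (0, 1) from a Beta(alpha, beta) distribution.

    Skewing alpha/beta biases training toward earlier or later timesteps
    instead of sampling them uniformly.
    """
    dist = torch.distributions.Beta(torch.tensor(alpha), torch.tensor(beta))
    return dist.sample((batch_size,))

# e.g. alpha > beta shifts mass toward t=1; alpha = beta = 2.0 concentrates
# sampling around the middle of the schedule.
timesteps = sample_beta_timesteps(4, alpha=2.0, beta=2.0)
```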
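And a rough sketch of how the NF4 plus paged-optimiser combination fits together when using bitsandbytes directly: a frozen NF4-quantized base layer, a small trainable adapter, and `PagedLion8bit` holding the optimiser state. SimpleTuner wires all of this up for you, so treat the layer sizes and the adapter here as illustrative only:

```python
import torch
import bitsandbytes as bnb

# Frozen base layer whose weights are stored as 4-bit NormalFloat (NF4).
base = bnb.nn.Linear4bit(
    1024, 1024, bias=False,
    quant_type="nf4", compute_dtype=torch.bfloat16,
).cuda()
for p in base.parameters():
    p.requires_grad_(False)

# Small trainable adapter in bf16 -- only these weights receive gradients,
# which is what keeps the total memory footprint so low.
adapter = torch.nn.Linear(1024, 1024, bias=False, dtype=torch.bfloat16).cuda()

# Paged 8-bit Lion: optimiser state lives in paged memory and can spill to
# CPU under VRAM pressure. The AdEMAMix variants (bnb.optim.AdEMAMix,
# AdEMAMix8bit, PagedAdEMAMix8bit) slot in the same way.
optimizer = bnb.optim.PagedLion8bit(adapter.parameters(), lr=1e-4)

x = torch.randn(2, 1024, dtype=torch.bfloat16, device="cuda")
loss = (base(x) + adapter(x)).float().pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```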
What's Changed
- int8-quanto followup fixes (batch size > 1) by @bghira in #1016
- merge by @bghira in #1018
- update doc by @bghira in #1019
- update docs by @bghira in #1025
- Add the ability to use a Beta schedule to select Flux timesteps by @AmericanPresidentJimmyCarter in #1023
- AdEMAMix, 8bit Adam/AdamW/Lion/Adagrad, Paged optimisers by @bghira in #1026
- Bits n Bytes NF4 training by @bghira in #1028
- merge by @bghira in #1029
Full Changelog: v1.1...v1.1.1