Release EasyDeL Version 0.0.60 · erfanzar/EasyDeL

What's Changed

SFTTrainer is now available.
VideoCausalLanguageModelTrainer is now available.
New models such as Grok-1, Qwen2Moe, Mamba, Rwkv, and Whisper are available.
MoE models received speed improvements.
Training is now 18%~42% faster.
Normal attention is now 12%~30% faster (#131).
DPOTrainer bugs are fixed.
CausalLanguageModelTrainer is now more customizable.
WANDB logging has improved.
Performance Mode is added to Training Arguments.
Model configs pass attributes to PretrainedConfig to prevent override… by @yhavinga in #122
Ignore token label smooth z loss by @yhavinga in #123
Time the whole train loop instead of only call to train step function by @yhavinga in #124
Add save_total_limit argument to delete older checkpoints by @yhavinga in #127
Add gradient norm logging, fix metric collection on multi-worker setup by @yhavinga in #135
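
The save_total_limit entry above describes deleting older checkpoints once a limit is reached. A minimal sketch of that idea follows. It is not EasyDeL's implementation: the helper name `prune_checkpoints` and the `checkpoint-<step>` file naming are assumptions made only for illustration.

```python
import tempfile
from pathlib import Path


def prune_checkpoints(checkpoint_dir, save_total_limit):
    """Delete the oldest checkpoints so at most `save_total_limit` remain.

    Hypothetical helper, not EasyDeL's code. Assumes checkpoints are files
    named `checkpoint-<step>`, so the step number gives the age order.
    Returns the names of the deleted checkpoints, oldest first.
    """
    if save_total_limit is None or save_total_limit < 1:
        return []  # no limit configured: keep everything
    checkpoints = sorted(
        Path(checkpoint_dir).glob("checkpoint-*"),
        key=lambda p: int(p.name.split("-")[-1]),  # sort by step, not by string
    )
    stale = checkpoints[:-save_total_limit]  # everything except the newest N
    for path in stale:
        path.unlink()
    return [p.name for p in stale]


def demo():
    """Simulate four saves with a limit of two and list what survives."""
    with tempfile.TemporaryDirectory() as tmp:
        for step in (100, 200, 300, 400):
            (Path(tmp) / f"checkpoint-{step}").write_text("weights")
            prune_checkpoints(tmp, save_total_limit=2)
        return sorted(p.name for p in Path(tmp).iterdir())


if __name__ == "__main__":
    print(demo())
```

Pruning right after each save keeps disk usage bounded throughout a long run. Sorting by the parsed step number rather than by file name avoids treating `checkpoint-1000` as older than `checkpoint-200`.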

Full Changelog: 0.0.55...0.0.60