EasyDeL Version 0.0.60
What's Changed
- SFTTrainer is now available.
- VideoCausalLanguageModelTrainer is now available.
- New models such as Grok-1, Qwen2Moe, Mamba, Rwkv, and Whisper are available.
- MoE models received some speed improvements.
- Training speed is now 18%~42% faster.
- Normal Attention is now 12%~30% faster (#131).
- DPOTrainer bugs have been fixed.
- CausalLanguageModelTrainer is now more customizable.
- WANDB logging has improved.
- Performance Mode has been added to Training Arguments.
- Model configs pass attributes to PretrainedConfig to prevent override… by @yhavinga in #122
- Ignore token label smooth z loss by @yhavinga in #123
- Time the whole train loop instead of only call to train step function by @yhavinga in #124
- Add save_total_limit argument to delete older checkpoints by @yhavinga in #127
- Add gradient norm logging, fix metric collection on multi-worker setup by @yhavinga in #135
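
As an illustration of what a `save_total_limit` style option typically does, here is a minimal sketch of checkpoint rotation. It is not EasyDeL's implementation: the `prune_checkpoints` helper, the `checkpoint-<step>` file naming, and the choice to order by step number are all assumptions made for this example.

```python
import re
from pathlib import Path

# Hypothetical naming scheme for this sketch: "checkpoint-<step>".
_CHECKPOINT_PATTERN = re.compile(r"^checkpoint-(\d+)$")


def prune_checkpoints(checkpoint_dir, save_total_limit):
    """Delete the oldest checkpoint files, keeping at most `save_total_limit`.

    Returns the names of the deleted files, oldest first.
    """
    if save_total_limit is None or save_total_limit <= 0:
        return []  # No limit configured: keep everything.

    found = []
    for path in Path(checkpoint_dir).iterdir():
        match = _CHECKPOINT_PATTERN.match(path.name)
        if match and path.is_file():
            found.append((int(match.group(1)), path))

    # Sort numerically by step so that step 1000 counts as newer than step 200.
    found.sort(key=lambda item: item[0])
    excess = max(len(found) - save_total_limit, 0)
    stale = found[:excess]
    for _, path in stale:
        path.unlink()
    return [path.name for _, path in stale]
```

Calling `prune_checkpoints("checkpoints", 2)` after each save would keep only the two highest-step files in that directory. Ordering by parsed step number rather than by file name avoids the lexicographic trap where `checkpoint-1000` sorts before `checkpoint-200`.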
Full Changelog: 0.0.55...0.0.60