EasyDeL Version 0.0.60

@erfanzar erfanzar released this 06 Apr 15:50
· 649 commits to main since this release

What's Changed

  • SFTTrainer is now available.
  • VideoCausalLanguageModelTrainer is now available.
  • New models such as Grok-1, Qwen2Moe, Mamba, Rwkv, and Whisper are available.
  • MoE models are now faster.
  • Training is now 18% to 42% faster.
  • Normal attention is now 12% to 30% faster (#131).
  • DPOTrainer bugs have been fixed.
  • CausalLanguageModelTrainer is now more customizable.
  • WANDB logging has improved.
  • Performance Mode has been added to the training arguments.
  • Model configs pass attributes to PretrainedConfig to prevent override… by @yhavinga in #122
  • Ignore token label smooth z loss by @yhavinga in #123
  • Time the whole train loop instead of only call to train step function by @yhavinga in #124
  • Add save_total_limit argument to delete older checkpoints by @yhavinga in #127
  • Add gradient norm logging, fix metric collection on multi-worker setup by @yhavinga in #135
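
The save_total_limit item above describes deleting older checkpoints once a limit is reached. The sketch below illustrates that idea in plain Python. It is not EasyDeL's implementation: the function name prune_checkpoints, the checkpoint-<step>.bin file naming, and the choice to treat None or a non-positive limit as "keep everything" are all assumptions made for illustration.

```python
from pathlib import Path


def prune_checkpoints(checkpoint_dir, save_total_limit):
    """Delete the oldest checkpoints so at most `save_total_limit` remain.

    Hypothetical helper: assumes files are named checkpoint-<step>.bin.
    Returns the list of paths that were deleted.
    """
    # Assumption: None or a non-positive limit means "never delete".
    if save_total_limit is None or save_total_limit <= 0:
        return []

    # Sort by the numeric step so that step 1000 ranks after step 300.
    checkpoints = sorted(
        Path(checkpoint_dir).glob("checkpoint-*.bin"),
        key=lambda p: int(p.stem.split("-")[1]),
    )

    # Everything except the newest `save_total_limit` files is stale.
    stale = checkpoints[:-save_total_limit]
    for path in stale:
        path.unlink()
    return stale
```

A trainer would typically call such a helper right after writing a new checkpoint, so the directory never holds more than the configured number of files.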

Full Changelog: 0.0.55...0.0.60