
Releases: OpenRLHF/OpenRLHF

Release v0.4.4

04 Oct 05:35

Highlights

  • Support --packing_samples for PPO with Ray by @zhuzilin in #449 (roughly 1.5~2x training throughput; see the sketch below)
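
Sample packing concatenates several variable-length sequences into a single row instead of padding each one to the longest, so no compute is wasted on pad tokens; that is the effect behind the reported 1.5~2x speedup. Below is a minimal, illustrative sketch of the idea, assuming a segment-id mask convention; the function name and details are for illustration and are not OpenRLHF's exact API.

```python
import torch

def pack_samples(sequences):
    """Pack a list of 1-D token-id tensors into one row.

    Rather than padding every sequence to the longest one, the sequences
    are concatenated, and the mask stores a distinct integer per sample
    (1, 2, 3, ...) so downstream attention can stay within each sample.
    """
    packed_ids = torch.cat(sequences).unsqueeze(0)  # (1, total_len)
    segment_ids = torch.cat(
        [torch.full_like(seq, i + 1) for i, seq in enumerate(sequences)]
    ).unsqueeze(0)                                  # (1, total_len)
    return packed_ids, segment_ids

# Three samples of lengths 3, 2, and 1 share one row with zero padding:
seqs = [torch.tensor([5, 6, 7]), torch.tensor([8, 9]), torch.tensor([10])]
ids, seg = pack_samples(seqs)
# ids -> tensor([[ 5,  6,  7,  8,  9, 10]])
# seg -> tensor([[1, 1, 1, 2, 2, 3]])
```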

Full Changelog: v0.4.3...v0.4.4

Release v0.4.3

22 Sep 02:11

Full Changelog: v0.4.2...v0.4.3

Release v0.4.2

29 Aug 11:39

Full Changelog: v0.4.1...v0.4.2

Release v0.4.1

07 Aug 11:37

Full Changelog: v0.4.0...v0.4.1

Release v0.4.0

31 Jul 07:40

Release v0.3.8

24 Jul 10:38

Changes

  • Default tp_size in batch_inference to torch.cuda.device_count() (@tongyx361)
  • Improved the tqdm progress-bar descriptions (@tongyx361)
  • Fixed loading datasets from local text files (@tongyx361)
  • Added support for Llama 3.1 (@xiaoxigua999)
  • Added --packing_samples support for all HF models in SFT/DPO/RM training (@xiaoxigua999)
  • Added --nll_loss_coef support for DPO, which adds an NLL term on the chosen response (@xiaoxigua999); see the sketch below
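
The --nll_loss_coef option corresponds to adding a weighted SFT-style negative log-likelihood term on the chosen response to the standard DPO objective. Here is a minimal sketch of that combined loss, assuming summed per-response log-probs and a precomputed chosen-token NLL as inputs; the names and signature are illustrative, not OpenRLHF's exact code.

```python
import torch
import torch.nn.functional as F

def dpo_loss_with_nll(policy_chosen_logps, policy_rejected_logps,
                      ref_chosen_logps, ref_rejected_logps,
                      chosen_nll, beta=0.1, nll_loss_coef=0.0):
    """DPO preference loss plus an optional NLL term on the chosen response.

    Each *_logps argument is the summed log-probability of a response under
    the policy or the frozen reference model; chosen_nll is the mean
    token-level negative log-likelihood of the chosen response.
    """
    margin = ((policy_chosen_logps - ref_chosen_logps)
              - (policy_rejected_logps - ref_rejected_logps))
    preference_loss = -F.logsigmoid(beta * margin).mean()
    # nll_loss_coef weights the SFT-style term on the chosen response.
    return preference_loss + nll_loss_coef * chosen_nll

# Example with dummy values (a batch of two preference pairs):
loss = dpo_loss_with_nll(
    policy_chosen_logps=torch.tensor([-10.0, -12.0]),
    policy_rejected_logps=torch.tensor([-15.0, -13.0]),
    ref_chosen_logps=torch.tensor([-11.0, -12.5]),
    ref_rejected_logps=torch.tensor([-14.0, -13.5]),
    chosen_nll=torch.tensor(1.8),
    nll_loss_coef=0.2,
)
```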

Release v0.3.7

20 Jul 11:01

Changes

  • Added support for --packing_samples in DPO/RM training (@xiaoxigua999)
  • Updated reward_dataset to correctly handle prompt_key (@Nickydusk); see the sketch below
  • Updated versions of Transformers and DeepSpeed (@openllmai0)
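
The prompt_key fix concerns how a pairwise reward-modeling dataset assembles its (chosen, rejected) texts: when a prompt column is configured, the prompt must be prepended to both responses, and otherwise the pair columns are taken as full texts. A minimal illustrative sketch, where the function and field names are assumptions rather than the project's exact implementation:

```python
def build_pair(example, prompt_key=None,
               chosen_key="chosen", rejected_key="rejected"):
    """Assemble the (chosen, rejected) texts for one pairwise RM example.

    When prompt_key is set, the prompt column is prepended to both
    responses; otherwise the chosen/rejected columns are assumed to
    already contain the full conversation text.
    """
    prompt = example[prompt_key] if prompt_key else ""
    return prompt + example[chosen_key], prompt + example[rejected_key]

# e.g. {"question": "2+2?", "chosen": " 4", "rejected": " 5"} with
# prompt_key="question" yields ("2+2? 4", "2+2? 5")
```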

Release v0.3.6

16 Jul 01:46

Changes

  • Refactored parser.parse_args() and added --train_split / --test_split options (@openllmai0)
  • Added support for launching trainers by module name, e.g. openrlhf.cli.train_ppo (@openllmai0)
  • Fixed the PyPI workflows, so pip install openrlhf now works (@hijkzzz)
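
Taken together, these changes let a run be launched directly from the published package, for example `pip install openrlhf` followed by `python -m openrlhf.cli.train_ppo --train_split train --test_split test ...` (the split values here are placeholders for whichever dataset splits apply, and the remaining PPO flags are omitted).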

Release v0.3.5

13 Jul 23:26

Release v0.3.4

08 Jul 21:52
