update READMEs (#3487)
fixes #3215
CloseChoice authored Jul 10, 2023
1 parent 2ecd585 commit 6608dc8
Showing 2 changed files with 10 additions and 5 deletions.
6 changes: 3 additions & 3 deletions model/README.md
@@ -98,8 +98,8 @@ export SFT_MODEL=$MODEL_PATH/sft_model/$(ls -t $MODEL_PATH/sft_model/ | head -n
5. Train the reward model

```bash
-cd ../reward/instructor
-python trainer.py configs/deberta-v3-base.yml --output_dir $MODEL_PATH/reward_model
+cd model_training
+python trainer_rm.py --configs defaults_rm oasst-rm-1-pythia-1b
```

6. Get RM trained model
@@ -117,7 +117,7 @@ export REWARD_MODEL=$MODEL_PATH/reward_model/$(ls -t $MODEL_PATH/reward_model/ |
7. Train the RL agent

```bash
-cd ../../model_training
+cd model_training
python trainer_rl.py --configs defaults_rlhf --cache_dir $DATA_PATH --rank_model $REWARD_MODEL --sft_model $SFT_MODEL --output_dir $MODEL_PATH/rl_model
```

9 changes: 7 additions & 2 deletions model/model_training/README.md
@@ -57,11 +57,16 @@ Currently only these languages are supported via prompt translation:
ar,de,fr,en,it,nl,tr,ru,ms,ko,ja,zh
```

+We provide many more datasets for training; a list of these can be found
+[here](https://github.com/LAION-AI/Open-Assistant/blob/main/model/model_training/custom_datasets/__init__.py)

## Dataset sub-sampling

We can subsample the **training** data by passing either the `fraction` or
-`size` argument in the `configs/config.yml` file. Don't forget the additional
-colon ":" after the dataset name when doing this.
+`size` argument in the `configs/config.yml` (for RM training
+`configs/config_rm.yml` and for RL training `configs/config_rl.yml`
+respectively) file. Don't forget the additional colon ":" after the dataset name
+when doing this.

Example:

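Reviewer note: the hunk above describes sub-sampling the training data with either a `fraction` or a `size` argument. The following is a minimal Python sketch of what those two options could mean, not the repository's implementation. The function name `subsample`, the `seed` parameter, and the choice to reject both arguments at once are assumptions made for illustration only.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def subsample(
    data: Sequence[T],
    fraction: Optional[float] = None,
    size: Optional[int] = None,
    seed: int = 0,
) -> List[T]:
    """Hypothetical helper: return a random subset of `data`.

    `fraction` keeps that share of the rows, `size` keeps an absolute
    number of rows. With neither argument the data is returned unchanged.
    """
    # Assumption: the two options are mutually exclusive.
    if fraction is not None and size is not None:
        raise ValueError("pass either `fraction` or `size`, not both")
    if fraction is not None:
        if not 0.0 < fraction <= 1.0:
            raise ValueError("`fraction` must be in (0, 1]")
        size = int(len(data) * fraction)
    if size is None:
        return list(data)
    if size < 0:
        raise ValueError("`size` must be non-negative")
    # Never ask for more rows than the dataset holds.
    size = min(size, len(data))
    # A seeded generator keeps the subset reproducible across runs.
    return random.Random(seed).sample(list(data), size)


rows = list(range(1000))
print(len(subsample(rows, fraction=0.25)))  # 250
print(len(subsample(rows, size=500)))       # 500
```

In the README's setup these values are set per dataset in the YAML config rather than passed to a function; the sketch only shows the intended effect on the number of training rows.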
