Commit

readme.md dataset Table Formatting (#3219)
Fix markdown Table Formatting.
dongs0104 authored May 26, 2023
1 parent dcda41c commit a478cd1
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions model/model_training/README.md
@@ -212,10 +212,9 @@ deepspeed trainer_sft.py --configs defaults your-model-name --deepspeed
Here is an uncomplete overview of datasets for sft:

<!-- prettier-ignore -->
<!-- prettier-ignore-start -->
dataset_name | train_counts | eval_counts | total_counts
----------------------------------------------------------------

<!-- prettier-ignore -->
--|--|--|--
joke | 301 | 76 | 377
webgpt | 14251 | 3563 | 17814
gpt4all | 313552 | 78388 | 391940
@@ -233,6 +232,7 @@ prosocial_dialogue | 157160 | 26983 | 184143
explain_prosocial | 360708 | 61248 | 421956
soda | 924102 | 231026 | 1155128
oa_leet10k | 18728 | 4683 | 23411
<!-- prettier-ignore-end -->

This list can be generated with the following command, but beware that this
downloads all available datasets (>100GB):
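The generation command referred to above is not part of the visible hunk and is not reproduced here. As a separate illustration of the table layout this commit moves to, here is a minimal Python sketch that renders rows in the same pipe-delimited form, with a `--|--|--|--` delimiter row whose cell count matches the header. The helper name `render_table` and the hard-coded rows are assumptions for illustration and are not taken from the repository. The total column is computed as train plus eval, which agrees with the rows shown in the diff.

```python
def render_table(rows):
    """Render (name, train, eval) tuples as a pipe-delimited markdown table.

    Hypothetical helper, not part of the repository's tooling.
    """
    lines = [
        "dataset_name | train_counts | eval_counts | total_counts",
        # One delimiter cell per header cell, as in the updated README.
        "--|--|--|--",
    ]
    for name, train, eval_count in rows:
        # total_counts is the sum of the train and eval counts.
        lines.append(f"{name} | {train} | {eval_count} | {train + eval_count}")
    return "\n".join(lines)


# Sample rows copied from the diff above.
rows = [
    ("joke", 301, 76),
    ("webgpt", 14251, 3563),
    ("gpt4all", 313552, 78388),
]

print(render_table(rows))
```

Keeping one delimiter cell per header column matters because GitHub Flavored Markdown only recognizes a table when the header row and the delimiter row have the same number of cells.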