Fix bugs #1241

Merged · 2 commits · Jun 27, 2024
2 changes: 1 addition & 1 deletion docs/source/LLM/DPO训练文档.md
@@ -80,7 +80,7 @@ cd examples/pytorch/llm
 - If you train a base model with data that contains history, you need to specify a template that supports multi-turn dialogue (base models often do not support multi-turn dialogue). For this case we default to the `chatml` template; you can also pass `--model_type` to choose the template for the model being trained.
 - We set `--gradient_checkpointing true` by default during training to **save GPU memory**; this slightly reduces training speed.
 - If you are using older GPUs such as **V100**, you need to set `--dtype AUTO` or `--dtype fp16`, because they do not support bf16.
-- If your machine has high-performance GPUs such as the A100 and you are using a qwen-series model, we recommend installing [**flash-attn**](https://github.com/Dao-AILab/flash-attention), which will speed up training and inference and reduce memory usage (A10, 3090, V100, etc. do not support training with flash-attn). Models that support flash-attn are listed in [LLM Supported Models](支持的模型和数据集.md#模型)
+- If your machine has high-performance GPUs such as the A100 and you are using a qwen-series model, we recommend installing [**flash-attn**](https://github.com/Dao-AILab/flash-attention), which will speed up training and inference and reduce memory usage (3090, V100, etc. do not support training with flash-attn). Models that support flash-attn are listed in [LLM Supported Models](支持的模型和数据集.md#模型)
 - If you need to train offline, use `--model_id_or_path <model_dir>` and set `--check_model_is_latest false`. For the meaning of each parameter, see [Command Line Arguments](命令行参数.md).
 - If you want to push weights to the ModelScope Hub during training, set `--push_to_hub true`.

2 changes: 1 addition & 1 deletion docs/source_en/LLM/DPO.md
@@ -78,7 +78,7 @@ cd examples/pytorch/llm

 - We default to setting `--gradient_checkpointing true` during training to **save memory**, which will slightly reduce training speed.
 - If you are using older GPUs such as **V100**, you need to set `--dtype AUTO` or `--dtype fp16`, because they do not support bf16.
-- If your machine has high-performance graphics cards like A100 and you are using the qwen series models, we recommend installing [**flash-attn**](https://github.com/Dao-AILab/flash-attention), which will speed up training and inference as well as reduce memory usage (A10, 3090, V100, etc. graphics cards do not support training with flash-attn). Models that support flash-attn can be viewed in [LLM Supported Models](Supported-models-datasets.md#models)
+- If your machine has high-performance graphics cards like A100 and you are using the qwen series models, we recommend installing [**flash-attn**](https://github.com/Dao-AILab/flash-attention), which will speed up training and inference as well as reduce memory usage (3090, V100, etc. graphics cards do not support training with flash-attn). Models that support flash-attn can be viewed in [LLM Supported Models](Supported-models-datasets.md#models)
 - If you need to train offline, please use `--model_id_or_path <model_dir>` and set `--check_model_is_latest false`. For specific parameter meanings, please see [Command Line Arguments](Command-line-parameters.md).
 - If you want to push weights to the ModelScope Hub during training, you need to set `--push_to_hub true`.

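The dtype guidance in both docs can be checked at runtime. Below is a minimal sketch, not part of this PR and not swift's actual `--dtype AUTO` logic, that falls back to fp16 on GPUs such as V100 that lack bf16 support:

```python
import torch

def pick_dtype() -> torch.dtype:
    """Prefer bf16 on GPUs that support it (Ampere or newer); else fp16."""
    # V100 (Volta) has no bf16 units, which is why the docs tell V100
    # users to pass `--dtype fp16` (or `--dtype AUTO`).
    if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
        return torch.bfloat16
    return torch.float16
```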
5 changes: 4 additions & 1 deletion swift/llm/utils/utils.py
@@ -269,7 +269,10 @@ def _try_fetch(self, first_idx: int) -> Optional[Dict[str, Any]]:
         idx = np.random.permutation(len(self))[:self.try_fetch_time - 1]
         for i in [first_idx] + idx.tolist():
             data = self.dataset[i]
-            res = self.template.encode(data)
+            try:
+                res = self.template.encode(data)
+            except OSError:
+                continue
             if len(res[0]) > 0:
                 return res

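The utils.py change is the substantive fix: `_try_fetch` draws up to `try_fetch_time` candidate samples, and previously a single sample whose encoding raised `OSError` (for example, a dataset row pointing at an unreadable file) aborted the whole fetch. With the patch, such samples are skipped and the next candidate is tried. Here is a self-contained sketch of the patched logic; the surrounding class is a hypothetical stand-in for the one in utils.py, and the `try_fetch_time` default is assumed for illustration:

```python
from typing import Any, Dict, Optional

import numpy as np


class LazyEncodeDataset:  # hypothetical stand-in for the class patched in utils.py
    def __init__(self, dataset, template, try_fetch_time: int = 20):
        self.dataset = dataset      # indexable raw samples
        self.template = template    # exposes encode(data) -> tuple
        self.try_fetch_time = try_fetch_time  # assumed default, for illustration

    def __len__(self) -> int:
        return len(self.dataset)

    def _try_fetch(self, first_idx: int) -> Optional[Dict[str, Any]]:
        # Try the requested index first, then up to try_fetch_time - 1
        # randomly chosen fallback indices.
        idx = np.random.permutation(len(self))[:self.try_fetch_time - 1]
        for i in [first_idx] + idx.tolist():
            data = self.dataset[i]
            try:
                res = self.template.encode(data)
            except OSError:
                continue  # the fix: skip unreadable samples instead of raising
            if len(res[0]) > 0:
                return res
        return None  # every candidate failed to encode or was empty
```

Catching only `OSError` keeps the failure handling narrow: genuine template bugs still propagate, while I/O problems in individual samples no longer take down training.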