This is the most complete LLM training code I've seen so far #17

nieallen · 2023-06-07T06:39:40Z

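This codebase covers pre-training and the RLHF pipeline, plus LoRA and QLoRA. It's really comprehensive.
It would be even better if it could also build multi-turn dialogue samples. For example, given [q1, a1, q2, a2, q3, a3], construct a training sample as: prompt: q1*[IGNORE_INDEX] + a1 + q2*[IGNORE_INDEX] + a2 + q3*[IGNORE_INDEX], response: a3.
Haha.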

hiyouga · 2023-06-07T06:54:58Z

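Model training currently supports multi-turn dialogue; you need to specify the history column in dataset_info.json.
For multi-turn dialogue training, the approach most commonly used at the moment is: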

q1 + a1 + q2 + a2 + q3 + a3
[IGNORE] + [IGNORE] + [IGNORE] + [IGNORE] + [IGNORE] + a3

So the current implementation already accommodates multi-turn dialogue training.
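To make the layout above concrete, here is a minimal sketch of how labels could be built when only the final answer is supervised. It is not the repository's code: `toy_tokenize`, `build_last_turn_example`, and the record's field names are hypothetical, and the value -100 is assumed for `IGNORE_INDEX` because it is the default `ignore_index` of PyTorch's `CrossEntropyLoss`.

```python
IGNORE_INDEX = -100  # assumed value; default ignore_index of torch.nn.CrossEntropyLoss


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one integer id per character."""
    return [ord(ch) for ch in text]


def build_last_turn_example(history, query, response):
    """Concatenate all turns, but supervise only the final response."""
    input_ids, labels = [], []
    # Earlier turns are context only: both question and answer are masked.
    for old_query, old_response in history:
        ids = toy_tokenize(old_query) + toy_tokenize(old_response)
        input_ids += ids
        labels += [IGNORE_INDEX] * len(ids)
    # Final turn: mask the question, keep the answer tokens as labels.
    query_ids = toy_tokenize(query)
    response_ids = toy_tokenize(response)
    input_ids += query_ids + response_ids
    labels += [IGNORE_INDEX] * len(query_ids) + response_ids
    return input_ids, labels


# Hypothetical record with a history column holding earlier (question, answer) pairs.
record = {"instruction": "q3", "output": "a3", "history": [["q1", "a1"], ["q2", "a2"]]}
input_ids, labels = build_last_turn_example(
    record["history"], record["instruction"], record["output"]
)
```

With this scheme a three-turn conversation contributes loss only on the tokens of a3; a1 and a2 serve purely as context.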

nieallen · 2023-06-07T07:18:11Z

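Model training currently supports multi-turn dialogue; you need to specify the history column in dataset_info.json. For multi-turn dialogue training, the approach most commonly used at the moment is:
q1 + a1 + q2 + a2 + q3 + a3
[IGNORE] + [IGNORE] + [IGNORE] + [IGNORE] + [IGNORE] + a3
So the current implementation already accommodates multi-turn dialogue training.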

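For multi-turn data, would it be better to mask only the q in each turn and leave every a unmasked? That way the model learns the answer at every turn, which should help it handle dialogue better.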
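A minimal sketch of that per-turn masking, for illustration only: every question is masked and every answer is kept as a label. The names `toy_tokenize`, `build_all_turns_example`, and `EOS_ID` are hypothetical, the value -100 for `IGNORE_INDEX` is an assumption, and appending a supervised end-of-sequence token after each answer is one possible design choice rather than something stated in this thread.

```python
IGNORE_INDEX = -100  # assumed value; default ignore_index of torch.nn.CrossEntropyLoss
EOS_ID = 0           # hypothetical end-of-sequence token id


def toy_tokenize(text):
    """Hypothetical stand-in tokenizer: one integer id per character."""
    return [ord(ch) for ch in text]


def build_all_turns_example(turns):
    """Mask each question, supervise each answer (plus its EOS token)."""
    input_ids, labels = [], []
    for question, answer in turns:
        question_ids = toy_tokenize(question)
        answer_ids = toy_tokenize(answer) + [EOS_ID]
        input_ids += question_ids + answer_ids
        # Question tokens contribute no loss; answer tokens are their own labels.
        labels += [IGNORE_INDEX] * len(question_ids) + answer_ids
    return input_ids, labels


turns = [("q1", "a1"), ("q2", "a2"), ("q3", "a3")]
input_ids, labels = build_all_turns_example(turns)
supervised = sum(1 for label in labels if label != IGNORE_INDEX)
```

Compared with supervising only a3, one pass over the same conversation now yields a training signal from all three answers.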

hiyouga · 2023-06-07T07:23:48Z

This may destroy the semantic information carried by BOS and EOS, so we do not recommend it.

hiyouga · 2023-06-07T15:20:11Z

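Sorry, my earlier statement may have been wrong. I went back to Vicuna's training code, and this approach can indeed speed up training on multi-turn dialogue. We are considering implementing something similar soon. Thanks for the suggestion!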

nieallen · 2023-06-08T02:14:07Z

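Sorry, my earlier statement may have been wrong. I went back to Vicuna's training code, and this approach can indeed speed up training on multi-turn dialogue. We are considering implementing something similar soon. Thanks for the suggestion!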

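Looking forward to it! In my LoRA fine-tuning experiments, the Vicuna-style multi-turn sample construction performed better than masking the entire prompt. I don't know whether QLoRA would change this, but I'd guess it would also do somewhat better.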

flaviadeutsch · 2023-06-08T06:39:49Z

Looking forward to it too, +1

nieallen · 2023-06-08T08:52:09Z

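Also, could LoRA fine-tuning for RWKV be supported later? RWKV is really fast; its generation feels about twice as fast as GPT. But it is not a pure transformers architecture, so peft cannot be used for LoRA training, and no script implementing this exists right now.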

hiyouga · 2023-06-14T14:30:47Z

In the latest code, b6faf02, we have implemented training on multi-turn dialogue data.

As for RWKV, we are not considering adding fine-tuning support for it at this time.

hiyouga added the "pending" label (This problem is yet to be addressed) on Jun 7, 2023

hiyouga added the "enhancement" label (New feature or request) on Jun 7, 2023

hiyouga added the "solved" label (This problem has been already solved) and removed the "pending" label (This problem is yet to be addressed) on Jun 14, 2023

hiyouga closed this as completed Jun 16, 2023

hiyouga removed the "enhancement" label (New feature or request) on Feb 6, 2024
