Which codebase was used for training? #58
Comments
Thanks, TA; let me follow up. Conceptually, I understand that pre-training needs a corpus, while supervised fine-tuning needs a QA-pair dataset. Most of what I find online covers supervised fine-tuning with TRL's SFTTrainer plus a QA-pair dataset; pre-training is rarely discussed. https://huggingface.co/docs/trl/sft_trainer#quickstart My understanding is that training an LLM just means training it to predict the next token, so can I say that if I prepare my pre-training strings record by record and train with SFTTrainer, that effectively is pre-training? Is this understanding correct? Thank you!
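(For illustration only, not the project's actual training code: a minimal sketch of what continued pre-training on plain text with SFTTrainer could look like. The model id, file name, and hyperparameters are placeholders, and the exact argument names vary between TRL versions.)

```python
# Hypothetical sketch: continued pre-training (next-token prediction on raw text)
# using TRL's SFTTrainer. No QA pairs -- each record is just a "text" field.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# corpus.txt is a placeholder: one document (or long paragraph) of raw text per line.
dataset = load_dataset("text", data_files={"train": "corpus.txt"})["train"]

config = SFTConfig(
    output_dir="cpt-out",
    dataset_text_field="text",  # train directly on the raw text column
    packing=True,               # concatenate documents into full-length blocks
    per_device_train_batch_size=1,
    num_train_epochs=1,
)

trainer = SFTTrainer(
    model="gpt2",  # small placeholder model id; swap in the actual base model
    args=config,
    train_dataset=dataset,
)
trainer.train()
```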
CPT and SFT differ only in how the data is prepared (i.e., the format).
FYI, twllm v3 does CPT and SFT together, and the loss is learned only on the assistant side.
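(Again only an illustrative sketch, not the twllm v3 recipe: one common way to learn only the assistant side with TRL is DataCollatorForCompletionOnlyLM, which masks the loss on everything before a response marker. The prompt format, marker string, and sample data below are assumptions.)

```python
# Hypothetical sketch of assistant-only loss with TRL.
# Tokens before `response_template` get label -100, so gradients
# come only from the assistant-side tokens.
from datasets import Dataset
from transformers import AutoTokenizer
from trl import SFTConfig, SFTTrainer, DataCollatorForCompletionOnlyLM

model_id = "gpt2"  # placeholder; use the actual base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token

# Assumed prompt format; the marker must literally appear in every training string.
response_template = "### Assistant:"
train_dataset = Dataset.from_dict({
    "text": [
        "### Human: 台灣最高的山是哪一座?\n### Assistant: 玉山。",
    ]
})

collator = DataCollatorForCompletionOnlyLM(response_template, tokenizer=tokenizer)

config = SFTConfig(
    output_dir="sft-out",
    dataset_text_field="text",
    packing=False,  # completion-only masking is not compatible with packing
    per_device_train_batch_size=1,
)

trainer = SFTTrainer(
    model=model_id,
    args=config,
    train_dataset=train_dataset,
    data_collator=collator,
)
trainer.train()
```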
Thank you, TA!! Your replies and the tutorial videos you have published online have been very helpful for both my research and my work. Thank you!
Sorry TA, one more follow-up question about splitting. The options I can think of are:
As long as your text is continuous, you can put it together, and the longer the better. For your example it should be
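(A tiny illustrative sketch of the advice above, with made-up field names: if consecutive chunks come from the same continuous article, join them into one long training record instead of emitting them as separate records; packing or truncation to the context length then happens in the trainer.)

```python
# Hypothetical helper: keep text from one continuous article together
# as a single (long) pre-training record, rather than splitting it up.
def merge_continuous(passages):
    """passages: chunks that appear consecutively in the same document."""
    return "\n\n".join(p.strip() for p in passages)

article_chunks = [
    "第一節的內容 ...",  # section 1
    "第二節的內容 ...",  # section 2, directly continues section 1
]
record = {"text": merge_continuous(article_chunks)}  # one record, not two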
Which codebase do you use for training (pretraining & fine-tuning)?
Axolotl? Llama-factory? Or something else? (huggingface_trl doesn't seem to support pretraining.)
Could you share the training config files? Thanks!