Suggestion: run an SFT test on deepseek-v2-coder-lite #342
Comments
Hello, try adding --reset-attention-mask when fine-tuning.
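For readers unfamiliar with the idea behind such a flag, here is a minimal sketch. It assumes the option behaves like the Megatron-LM flag of the same name, which resets the causal attention mask after each end-of-document token so that packed samples cannot attend to one another. The function name `packed_causal_mask` and the `eod_id` value are hypothetical and are not taken from this repository.

```python
def packed_causal_mask(token_ids, eod_id):
    """Build a causal mask that is reset after every end-of-document token.

    mask[i][j] is True when position i is allowed to attend to position j.
    This is an illustrative sketch, not the repository's implementation.
    """
    n = len(token_ids)

    # doc_ids[i] is the index of the packed document that position i belongs to.
    doc_ids = []
    current = 0
    for tok in token_ids:
        doc_ids.append(current)
        if tok == eod_id:
            current += 1  # the next token starts a new document

    # Causal (j <= i) and restricted to the same document.
    return [
        [j <= i and doc_ids[i] == doc_ids[j] for j in range(n)]
        for i in range(n)
    ]


# Two samples packed into one sequence; 0 is the assumed end-of-document id.
mask = packed_causal_mask([5, 6, 0, 7, 8], eod_id=0)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Without the reset, the tokens of the second sample would also attend to the first sample, which can matter when short SFT examples are packed into one long sequence.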
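Hello, we have upgraded the entire SFT system. Please give it another try.
I tried again and the problem persists: the generated code still contains very obvious syntax errors.
With the same data, qwen2.5-coder-7b trains normally, and its results differ greatly from deepseek's.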
Hello, we re-checked all the tokenizers and found that only the deepseek one was missing padding_side='right'. We are very sorry about that. We have fixed it in a PR, please take a look: #370
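To illustrate why the padding side matters, here is a minimal, self-contained sketch of left versus right padding. It does not use the repository's tokenizer code; the helper `pad_batch` and the `pad_id` value are hypothetical. The assumption is that the SFT pipeline expects real tokens to start at position 0, so a tokenizer that pads on the left would shift them away from the positions the labels and loss mask are built for.

```python
def pad_batch(sequences, pad_id, padding_side="right"):
    """Pad variable-length token lists to a common length.

    Returns (input_ids, attention_mask). Illustrative only; real tokenizers
    perform this step internally according to their padding_side setting.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pads = [pad_id] * n_pad
        if padding_side == "right":
            # Real tokens keep positions 0..len(seq)-1.
            input_ids.append(list(seq) + pads)
            attention_mask.append([1] * len(seq) + [0] * n_pad)
        elif padding_side == "left":
            # Real tokens are shifted right by n_pad positions.
            input_ids.append(pads + list(seq))
            attention_mask.append([0] * n_pad + [1] * len(seq))
        else:
            raise ValueError(f"unknown padding_side: {padding_side!r}")
    return input_ids, attention_mask


batch = [[11, 12, 13], [21]]
print(pad_batch(batch, pad_id=0, padding_side="right"))
print(pad_batch(batch, pad_id=0, padding_side="left"))
```

If the training code assumes one layout while the tokenizer produces the other, the labels and the tokens can end up misaligned, which would be consistent with the degraded generations reported here.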
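After SFT training, the generated code tends to contain obvious syntax errors and behaves erratically. The cause has not yet been identified.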