Test set format #8
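In my own usage, every label has a reasonable number of samples. The labels are not required to have the same number of samples.
Because fastText works by vector computation, training it on too few samples is meaningless.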
On Tue, Dec 18, 2018 at 14:13, guanleiming <[email protected]> wrote:
… [image: default]
<https://user-images.githubusercontent.com/45931460/50135027-a9b0d380-02cd-11e9-9b49-c8653c3a3017.png>
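There are roughly a dozen labels. I had been appending __label__xxxx to the end of each line; switching to the format with __label__xxxx at the beginning of the line works. However, when the training set has too few samples, the predicted probabilities come out almost evenly spread across the labels. Even when I test with a passage copied verbatim from one label's training text, the probabilities are still nearly uniform. Once I added more samples, the probability for the correct label went up and the distribution was no longer so flat. Did you run into this when you were working with it? Is this overfitting caused by having too few samples?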
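For anyone with data in the trailing-label layout described above, here is a minimal sketch of rewriting each line so the label comes first. It assumes whitespace-separated tokens and the `__label__` prefix mentioned in the comment; the function name `move_labels_to_front` is hypothetical and not part of this project or of fastText.

```python
LABEL_PREFIX = "__label__"  # prefix used in the comment above


def move_labels_to_front(line: str) -> str:
    """Return the line with every __label__ token moved to the beginning.

    Hypothetical helper: tokens are split on whitespace, label tokens keep
    their relative order, and the remaining words keep theirs.
    """
    tokens = line.split()
    labels = [t for t in tokens if t.startswith(LABEL_PREFIX)]
    words = [t for t in tokens if not t.startswith(LABEL_PREFIX)]
    return " ".join(labels + words)


if __name__ == "__main__":
    print(move_labels_to_front("the match ended in a draw __label__sports"))
```

Applying this to every line of the training and test files before training would produce the leading-label layout that was reported to work.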
https://upload-images.jianshu.io/upload_images/12081581-70d412eebb570280?imageMogr2/auto-orient/strip%7CimageView2/2/w/323
Could you check whether this format is acceptable? If not, does the test set have to be in the "label,txt" format?
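In case it helps, below is a rough sketch of converting between the two layouts. It assumes that "label,txt" means one sample per line, with the label, a comma, and then the text, and that the target is the `__label__` prefix form discussed in this thread. The function name `csv_line_to_fasttext` is made up for illustration; whether this project's test loader expects one form or the other is not confirmed here.

```python
LABEL_PREFIX = "__label__"


def csv_line_to_fasttext(line: str) -> str:
    """Convert 'label,text' into '__label__label text'.

    Hypothetical helper: only the first comma is treated as the separator,
    so commas inside the text itself are preserved.
    """
    label, _, text = line.strip().partition(",")
    return f"{LABEL_PREFIX}{label.strip()} {text.strip()}"


if __name__ == "__main__":
    print(csv_line_to_fasttext("sports,the match ended in a draw"))
```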