On expected performance #5

Open
jingyaogong opened this issue Oct 7, 2024 · 2 comments
Comments

@jingyaogong
Owner

jingyaogong commented Oct 7, 2024

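minimind-v is a nice "toy" for learning VLM architecture and the training pipeline.

In terms of results, minimind-v is limited by the capability of the underlying LLM and is still far from being a usable productivity tool. It can recognize scenes and objects in an image at a coarse level, but its understanding still shows obvious errors and hallucinations.

On that basis, I hope minimind-v can serve as a starting point that invites better work: a useful "brick" that paves the way for everyone's deeper research.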

@cqcracked

cqcracked commented Oct 7, 2024

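If the number of layers were increased to 80 and the dimension to 1024 or 2048, and the training data were scaled up as well (say, a few T tokens), with nothing else changed, the results should improve noticeably, right? Of course, minimind would need to be changed accordingly.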

@jingyaogong
Owner Author

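> If the number of layers were increased to 80 and the dimension to 1024 or 2048, and the training data were scaled up as well (say, a few T tokens), with nothing else changed, the results should improve noticeably, right? Of course, minimind would need to be changed accordingly.

That would be a language model of roughly 5B parameters, and at that scale the quality would be assured.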
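
For readers who want to sanity-check the size estimate, here is a rough back-of-the-envelope sketch. It is not minimind's actual configuration code. The function name `estimate_transformer_params`, the 4x feed-forward width, the tied embeddings, and the vocabulary size of 6400 are all assumptions made for illustration. It also ignores norm weights, biases, and the vision encoder.

```python
def estimate_transformer_params(n_layers, d_model, vocab_size=6400,
                                ffn_mult=4, tie_embeddings=True):
    """Rough parameter count for a plain decoder-only transformer.

    Assumptions (not taken from the thread): full multi-head attention,
    a two-matrix feed-forward block of width ffn_mult * d_model, and a
    hypothetical vocabulary size. Norms and biases are ignored.
    """
    attn = 4 * d_model * d_model              # Q, K, V and output projections
    ffn = 2 * d_model * (ffn_mult * d_model)  # up and down projections
    embed = vocab_size * d_model              # token embedding table
    if not tie_embeddings:
        embed *= 2                            # separate output head
    return n_layers * (attn + ffn) + embed


for dim in (1024, 2048):
    n = estimate_transformer_params(n_layers=80, d_model=dim)
    print(f"80 layers, dim {dim}: {n / 1e9:.2f}B")
# 80 layers, dim 1024: 1.01B
# 80 layers, dim 2048: 4.04B
```

Under these assumptions, 80 layers at dimension 2048 comes out to a few billion parameters, the same ballpark as the 5B figure above, while dimension 1024 gives about 1B. A wider feed-forward block or a larger vocabulary would push the count higher.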
