
Question 18: In deep neural networks, hidden layers are introduced and the convexity of the training problem is given up. What is the significance of this, and how should the question be answered? #1

Closed
imhuay opened this issue Jun 7, 2018 · 2 comments



imhuay commented Jun 7, 2018

The original book (pp. 119-122) mainly discusses activation functions. How should this question be answered?


elviswf (Owner) commented Jun 10, 2018

The original book (pp. 119-122) covers several hidden-layer designs. Combining it with other sources, here is my personal answer to this question:

  1. Once hidden layers are added, the training problem is no longer a convex optimization problem. Giving up convexity means it becomes very hard to train a neural network to the global optimum. "In 1993, Blum and Rivest found something even worse: even the training optimization problem of a simple neural network with only two layers and three nodes is NP-hard." (http://baijiahao.baidu.com/s?id=1561255903377484&wfr=spider&for=pc)
  2. "Fortunately, in practice we can approach these optima very efficiently: simply running a classic gradient-descent optimization method already yields a good-enough local minimum." (http://baijiahao.baidu.com/s?id=1561255903377484&wfr=spider&for=pc) See the first sketch after this list.
  3. The significance is a stronger learning (or fitting) capacity for the model; as the original book says, "a maxout unit can approximate any convex function with arbitrary accuracy" (see the second sketch below). As for the optimization problem that remains after giving up convexity, it can be improved step by step through engineering practice: "The classical results of optimization theory may look grim, but we can largely work around these problems with engineering methods and mathematical tricks, such as heuristics, adding more machines, and using new hardware (such as GPUs)."
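
Below is a minimal sketch of point 2 in plain NumPy (my own toy example, not code from the book or the linked article; the layer size, learning rate, and step count are hypothetical choices). Even this tiny one-hidden-layer network makes the mean-squared-error loss non-convex in the weights, yet plain gradient descent steadily drives it to a good local minimum:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data: y = sin(x), which no purely linear (convex-loss) model fits well.
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(X)

# One hidden layer of tanh units.
n_hidden = 16
W1 = rng.normal(0.0, 0.5, (1, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, (n_hidden, 1)); b2 = np.zeros(1)

lr = 0.05
for step in range(5001):
    # Forward pass.
    h = np.tanh(X @ W1 + b1)      # hidden activations
    pred = h @ W2 + b2            # network output
    err = pred - y
    loss = np.mean(err ** 2)

    # Backward pass: manual gradients of the mean-squared error.
    g_pred = 2 * err / len(X)
    g_W2 = h.T @ g_pred
    g_b2 = g_pred.sum(axis=0)
    g_h = g_pred @ W2.T
    g_pre = g_h * (1 - h ** 2)    # d tanh(z)/dz = 1 - tanh(z)^2
    g_W1 = X.T @ g_pre
    g_b1 = g_pre.sum(axis=0)

    # Plain gradient-descent update.
    W1 -= lr * g_W1; b1 -= lr * g_b1
    W2 -= lr * g_W2; b2 -= lr * g_b2

    if step % 1000 == 0:
        print(f"step {step:4d}  loss {loss:.4f}")
```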

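For the maxout claim in point 3, here is a second sketch (again my own construction, not from the book): a maxout unit computes max_i(w_i * x + b_i), the maximum of k affine functions, so its output is always a piecewise-linear convex function, and with enough pieces it can track any convex target closely. Below, the pieces are built by hand as tangent lines of f(x) = x^2 rather than learned, and the grid of tangent points is an arbitrary choice:

```python
import numpy as np

# k tangent points of f(x) = x^2; the tangent at a is the affine map t -> 2a*t - a^2.
a = np.linspace(-2, 2, 9)
W = 2 * a          # slopes of the k affine pieces
b = -a ** 2        # intercepts of the k affine pieces

# Maxout output: the pointwise maximum over the k affine pieces.
t = np.linspace(-2, 2, 401)
maxout = np.max(np.outer(t, W) + b, axis=1)

# With tangent points 0.5 apart, the worst-case gap to x^2 is (0.5/2)^2 = 0.0625.
print("max |t^2 - maxout(t)| =", np.abs(t ** 2 - maxout).max())
```

Doubling the number of pieces quarters the worst-case gap, which is the sense in which the approximation can be made arbitrarily accurate.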

imhuay commented Jun 11, 2018

Thank you very much!

imhuay closed this as completed Jun 11, 2018