
three-head neural network #1

Open
alreadydone opened this issue Jun 5, 2018 · 4 comments
alreadydone commented Jun 5, 2018

Greetings, Chao,

Your recent work on using CNNs to enhance MoHex 2.0 and to guide proof search is really interesting! There has been no progress on solving Go for nearly 10 years, and I wonder whether we can make some with the help of neural nets. Good luck to MoHex-CNN at the upcoming ICGA Olympiad!
And thank you for sharing the trained model and self-play games! (In contrast to Facebook's ELF OpenGo, which released a model but none of its self-play games.)
I am writing this post because I found a new paper on your homepage titled "Three-Head Neural Network Architecture for Monte Carlo Tree Search". Is the idea similar to leela-zero/leela-zero#1109, or unrelated? Would you mind sending the full text to my email address?
For your information, the author of https://github.com/ggplib/ggp-zero/, @richemslie, is also training models for 11x11 and 13x13 Hex and has put his bot on Little Golem, playing as gzero_bot.

Sincerely,
Junyan


cgao3 commented Jun 5, 2018

Hi Junyan,

Thanks.
I just took a look at the page you shared. I assume you were asking whether I use MCTS move-value estimates to train the action-value head. The answer is no; that's not what I proposed in the paper.

Thank you for the information; I wasn't aware of gzero.

@alreadydone
Author

There were many ideas in that thread, so I asked whether there was something similar to your work. When @jillybob posted the abstract, I saw that the third head is an action-value head, so it is similar to what @gcp suggested in the original post:
value head producing move evaluations for every move
(I don't know whether this is standard terminology, but FPU in the thread refers to first-play urgency, i.e., the value one should assign to an unexpanded child node.)
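To make the FPU remark concrete, here is a minimal sketch (my own illustration, not code from either project) of PUCT-style child selection where unvisited children are scored with a first-play-urgency value instead of an empirical mean they don't yet have; the dict layout and the `fpu` default are assumptions for the example.

```python
import math

def puct_select(children, parent_visits, c_puct=1.5, fpu=0.5):
    """Pick a child by PUCT score; `children` is a list of dicts
    with 'prior', 'visits', and 'value_sum' keys (illustrative layout)."""
    best, best_score = None, -math.inf
    for child in children:
        if child["visits"] > 0:
            q = child["value_sum"] / child["visits"]  # empirical mean value
        else:
            q = fpu  # first-play urgency: assumed value for an unexpanded child
        # exploration bonus, proportional to the policy prior
        u = c_puct * child["prior"] * math.sqrt(parent_visits) / (1 + child["visits"])
        score = q + u
        if score > best_score:
            best, best_score = child, score
    return best
```

With an action-value head, the `fpu` constant could instead be replaced by the network's predicted q for that move, which is one of the motivations discussed in the leela-zero thread.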

From your abstract:

To effectively train the newly introduced action-value head on the same game dataset as for two-head nets, we exploit the optimal relations between parent and children nodes for data augmentation and regularization.

So the difference from the proposals in that thread is that you don't use any additional data to train the action-value head. People there worried that a new training-data format would not be backward compatible, so your technique seems very attractive. Could you briefly explain the idea, provide the full text by email, and/or allow people to discuss it openly? If you would rather not do so before the ICGA Olympiad, I totally understand.
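The abstract does not spell out which "optimal relations between parent and children nodes" are used. One standard candidate (my assumption for illustration, not a quote from the paper) is the minimax identity v(s) = max_a q(s, a): under optimal play, a position's value equals its best action value, which can be turned into a regularization term without any new training data.

```python
def consistency_loss(parent_value, child_action_values):
    """Squared deviation from the minimax identity v(s) = max_a q(s, a).
    Illustrative regularizer only; not the paper's actual loss."""
    best_q = max(child_action_values)
    return (parent_value - best_q) ** 2
```

Added to the usual policy and value losses, a term like this would tie the new action-value head to the existing value targets, which is consistent with training on the same game dataset as a two-head net.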


cgao3 commented Jun 6, 2018

Yes, you are right. With my technique there is no concern about data compatibility. I've sent you an email.

@cgao3 cgao3 closed this as completed Jun 7, 2018
@alreadydone
Author

Thank you!

@cgao3 cgao3 reopened this Jun 9, 2018