Update on work in progress #72
Comments
32 policy and value head filters look fine. Now trying 2 epochs per batch.
Larger 256x20 nets take considerably longer, so still crunching on training.
Wrong, wrong, wrong. After the better part of 3 weeks of training plus a week of testing, the new 256x20 net is clearly at least 100 Elo worse than the prior best net (at fixed playouts, never mind equal time). Moreover, I'm not even sure which net is the prior best (256x7, or 128x10, ???). Maybe the 32 policy and value head filters are not a good thing. Maybe testeval should be left on. Still think I like LR finder and 1 cycle. Policy and value weights--who knows. Time to stop "ready, fire, aim". So, taking a deep breath and a big step back to start again. This is what can happen when watching the Lc0 project.
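For a sense of scale, the standard logistic Elo model converts between a rating gap and an expected match score. The sketch below is only that textbook formula; the function names are made up for illustration and are not part of this project's test tooling.

```python
import math


def expected_score(elo_diff):
    """Expected score (0..1) for a player rated elo_diff points above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_diff_from_score(score):
    """Inverse of expected_score: rating gap implied by an average match score."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# A net that is 100 Elo weaker is expected to score about 36% in a match.
print(round(expected_score(-100), 3))
```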
I am currently working on larger NNs (256x20).
Still using supervised PGN file input.
Have disabled "testeval".
Trying 32 policy and value head filters, as Leela does and as described here:
https://medium.com/oracledevs/lessons-from-alpha-zero-part-6-hyperparameter-tuning-b1cfcbe4ca9a
Last "best NN" had 128x10 with 8 policy and 4 value head filters.
Also trying a learning rate finder:
https://github.com/surmenok/keras_lr_finder
LR = 0.015 looks good.
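A rough sketch of the idea behind a learning rate finder, for readers who have not used one: sweep the LR geometrically over a short run, record the loss at each step, and look for the region where the loss drops fastest. This is plain Python with hypothetical helper names, not the `keras_lr_finder` API, and the loss values in the example are invented purely to exercise the code.

```python
import math


def lr_sweep(start_lr, end_lr, num_steps):
    """Geometrically spaced learning rates from start_lr to end_lr inclusive."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    ratio = (end_lr / start_lr) ** (1.0 / (num_steps - 1))
    return [start_lr * ratio ** i for i in range(num_steps)]


def steepest_descent_lr(lrs, losses):
    """Return the LR at the end of the interval where loss falls fastest
    per unit of log10(LR), or None if the loss never decreases."""
    best_lr, best_slope = None, 0.0
    for i in range(1, len(lrs)):
        step = math.log10(lrs[i]) - math.log10(lrs[i - 1])
        slope = (losses[i] - losses[i - 1]) / step
        if slope < best_slope:
            best_lr, best_slope = lrs[i], slope
    return best_lr


# Invented losses: flat, then a sharp drop, then divergence at high LR.
lrs = lr_sweep(1e-4, 1.0, 5)
losses = [2.0, 1.9, 1.2, 1.1, 3.0]
print(steepest_descent_lr(lrs, losses))
```

In practice the curve is usually inspected by eye and a value somewhat below the point of divergence is chosen, so the automatic pick here is only a starting point.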
Also trying a 1cycle LR schedule:
https://medium.com/@nachiket.tanksale/finding-good-learning-rate-and-the-one-cycle-policy-7159fe1db5d6
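A minimal sketch of a 1cycle schedule, assuming a linear ramp up to the peak LR, a linear ramp back down, and a short final decay. The peak of 0.015 is the LR finder value mentioned above; the divisor and phase-length parameters (`div_factor`, `final_div`, `pct_up`) are illustrative assumptions, not settings used in this project.

```python
def one_cycle_lr(step, total_steps, max_lr=0.015,
                 div_factor=10.0, final_div=100.0, pct_up=0.45):
    """Learning rate at `step` (0-based) under a simple linear 1cycle schedule.

    Phase 1: base_lr -> max_lr over pct_up of the run.
    Phase 2: max_lr -> base_lr over the same number of steps.
    Phase 3: base_lr -> base_lr / final_div over whatever steps remain.
    """
    if total_steps <= 0 or not 0.0 < pct_up <= 0.5:
        raise ValueError("need total_steps > 0 and 0 < pct_up <= 0.5")
    base_lr = max_lr / div_factor
    min_lr = base_lr / final_div
    up_steps = max(1, round(total_steps * pct_up))
    if step < up_steps:
        frac = step / up_steps
        return base_lr + frac * (max_lr - base_lr)
    if step < 2 * up_steps:
        frac = (step - up_steps) / up_steps
        return max_lr - frac * (max_lr - base_lr)
    tail_steps = max(1, total_steps - 2 * up_steps)
    frac = min(1.0, (step - 2 * up_steps) / tail_steps)
    return base_lr - frac * (base_lr - min_lr)


# Print the schedule at a few points of a 1000-step run.
for s in (0, 225, 450, 900, 999):
    print(s, one_cycle_lr(s, 1000))
```

A function like this can be called once per batch and its result assigned to the optimizer's learning rate; how that is wired up depends on the training framework.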
Will also try 2 epochs per batch.
Things take time to run.
In the future I will try skipping the first n moves of each game, since I would run it with an opening book.
Likewise, I would like to try it with tablebases.
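Skipping the opening could be as simple as dropping the first 2n plies of each game before generating training positions. The sketch below assumes each game is already available as a list of plies (half-moves) starting from the standard position with White to move; the function name and the list-of-SAN-strings representation are hypothetical.

```python
def plies_after_opening(plies, skip_moves):
    """Drop the first `skip_moves` full moves (2 plies each) from a game."""
    if skip_moves < 0:
        raise ValueError("skip_moves must be non-negative")
    return plies[2 * skip_moves:]


game = ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"]
print(plies_after_opening(game, 2))
```

Games shorter than the skipped prefix simply contribute no training positions.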