Backendbench assistant with TC-dependent output. #1546

Merged: 5 commits, Dec 4, 2022

@zz4032 zz4032 commented Mar 15, 2021

My batch-size test matches show that the optimal Clippy threshold depends strongly on the average nodes/move (results on Discord: https://discord.com/channels/425419482568196106/530486338236055583/821040507077394483). The assistant now shows three recommended batch sizes, one for each of the most common use cases.
Example:

 /  \
 |  |    __________________________________________
 +  +   |                                          |
(@)(@) _| Recommended minibatch-size for this net: |
 |  |  \        1s/move   (Bullet):       68       |
 || |/  |       15s/move  (Rapid):       133       |
 || ||  |       3min/move (Tournament):  272       |
 |\_/|  |__________________________________________|
 \___/

The NPS reported by backendbench is used to calculate the expected nodes/move in games, and a separate threshold is calculated for each of the three use cases to determine its batch size. The threshold formula is also adjusted to account for real-game NPS being higher than backendbench NPS.
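As a rough illustration of the first step, here is a minimal Python sketch of turning a benchmark NPS figure into expected nodes/move for the three use cases. The names `TIME_CONTROLS` and `expected_nodes_per_move` are hypothetical, and the plain `nps * seconds` estimate is an assumption; it does not include the adjustment for higher in-game NPS mentioned above.

```python
# Seconds per move for the three use cases shown in the assistant's output.
# The dictionary name and layout are hypothetical, chosen for illustration.
TIME_CONTROLS = {
    "Bullet": 1,        # 1s/move
    "Rapid": 15,        # 15s/move
    "Tournament": 180,  # 3min/move
}


def expected_nodes_per_move(nps: float, seconds_per_move: float) -> float:
    """Assumed estimate: nodes searched per move = NPS * seconds per move."""
    return nps * seconds_per_move


def estimate_all(nps: float) -> dict:
    """Expected nodes/move for every use case at a given benchmark NPS."""
    return {
        name: expected_nodes_per_move(nps, seconds)
        for name, seconds in TIME_CONTROLS.items()
    }


if __name__ == "__main__":
    # 20000 NPS is an arbitrary example value, not a measured result.
    for name, nodes in estimate_all(20000).items():
        print(f"{name}: {nodes:.0f} nodes/move")
```

Each of the three estimates would then feed its own threshold calculation.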

The multi-GPU scenario is not considered here (game NPS is even higher relative to backendbench NPS measured on 1 GPU); I'm not sure whether this is necessary.

zz4032 commented Mar 20, 2021

[Image: backendbench3]
The function lacks data at higher nodes/move. I analyzed a few positions and defined one more data point at 200K nodes/move, which I think leads to better batch-size selection. The PR status can be changed to "Draft"; I'd like to run more position tests.

zz4032 commented Apr 11, 2021

[Image: Screenshot from 2021-04-11 21-43-42]
Position tests were inconclusive, so I added another match result instead.
The trendline is now exponential, which looks like the better choice. I added +0.02 so that it converges to 0.02 instead of 0.0: threshold = 0.16947 * exp(-4.1695e-6 * nodes/move) + 0.02. I think the batch-size selection looks good now.
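For reference, the trendline above can be written as a small Python function. This is only a sketch of the formula as quoted in this comment; the function name `clippy_threshold` is hypothetical, and how the threshold is then applied to pick a batch size is not shown here.

```python
import math


def clippy_threshold(nodes_per_move: float) -> float:
    """Exponential trendline from the comment above.

    threshold = 0.16947 * exp(-4.1695e-6 * nodes/move) + 0.02
    The +0.02 offset makes the curve converge to 0.02 rather than 0.0.
    """
    return 0.16947 * math.exp(-4.1695e-6 * nodes_per_move) + 0.02


if __name__ == "__main__":
    # Example node counts are arbitrary, picked to show the decay.
    for nodes in (10_000, 200_000, 2_000_000):
        print(f"{nodes:>9} nodes/move -> threshold {clippy_threshold(nodes):.4f}")
```

The threshold starts at 0.18947 for zero nodes/move and decreases monotonically toward 0.02 as nodes/move grows.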

Naphthalin added the labels "rfc" (Request for comments) and "testing required" (Feature/bug fix needs more testing. Implies not for merge.) on Nov 2, 2022
borg323 self-assigned this on Nov 17, 2022
borg323 merged commit 98af235 into LeelaChessZero:master on Dec 4, 2022