Implementation of a new Q function #1802

Closed
wants to merge 7 commits into from

@Rinuell Rinuell commented Dec 3, 2022

Following Crem’s idea from early August to change the Q function after WCCC, I used his code to test several candidate functions.
The current Q function is Q = W - L, which does not take D into account. Between two lines with the same W - L, search therefore has no way to favor the line with the lower D. That might steer Leela into more drawish lines and result in “wasted” opportunities. The goals of the new function are to:

  • Include D in some way in the calculation of Q,
  • Reduce to the current Q function for a specific set of parameters,
  • At minimum, lose no Elo against stronger opponents while gaining Elo against weaker ones.
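
As a small illustration of the motivation above, the following sketch evaluates the current Q on two made-up WDL triples that share the same W - L but have very different D. The triples and the helper name `current_q` are assumptions for illustration only; they are not values or code taken from lc0.

```python
import math

def current_q(w: float, d: float, l: float) -> float:
    """Current value as described in this PR: Q = W - L (D is ignored)."""
    return w - l

# Hypothetical (W, D, L) triples; each sums to 1 and has W - L = 0.2.
sharp_line = (0.50, 0.20, 0.30)
drawish_line = (0.25, 0.70, 0.05)

q_sharp = current_q(*sharp_line)
q_drawish = current_q(*drawish_line)

# Both lines get the same Q even though their draw probabilities differ.
print(q_sharp, q_drawish, math.isclose(q_sharp, q_drawish))
```

With Q = W - L, nothing distinguishes the two lines, which is the gap the candidate functions try to close.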

Functions tested

Qtest3: Q = (W - L) × (1 - D)^DrawFactor
Qtest3bis: Q = (W - L) × (1 - D^DrawFactor)
Qtest4: Q = W^WinFactor - L^LoseFactor
Qtest5: Q = (W^WinFactor - L^LoseFactor) × (1 - D)^DrawFactor
Qtest6: Q = W^(WinFactor + DrawFactorW × D) - L^(LoseFactor + DrawFactorL × D)
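
The candidate functions above can be sketched in Python as follows. This is one reading of the formulas as written in this PR, not the actual C++ code from the branch; the function and argument names are assumptions. Where a finite parameter choice reduces the function to W - L, it is used as the default. Qtest3bis has no such finite choice under this reading: it only approaches W - L as DrawFactor grows (for D < 1). Exponents are assumed to be non-negative.

```python
import math

def qtest3(w, d, l, draw_factor=0.0):
    # (W - L) * (1 - D)^DrawFactor; DrawFactor = 0 gives W - L.
    return (w - l) * (1.0 - d) ** draw_factor

def qtest3bis(w, d, l, draw_factor):
    # (W - L) * (1 - D^DrawFactor); tends to W - L for large DrawFactor when D < 1.
    return (w - l) * (1.0 - d ** draw_factor)

def qtest4(w, d, l, win_factor=1.0, lose_factor=1.0):
    # W^WinFactor - L^LoseFactor; D is unused, both factors = 1 gives W - L.
    return w ** win_factor - l ** lose_factor

def qtest5(w, d, l, win_factor=1.0, lose_factor=1.0, draw_factor=0.0):
    # Qtest4 scaled by (1 - D)^DrawFactor.
    return (w ** win_factor - l ** lose_factor) * (1.0 - d) ** draw_factor

def qtest6(w, d, l, win_factor=1.0, draw_factor_w=0.0,
           lose_factor=1.0, draw_factor_l=0.0):
    # Exponents of W and L each shift linearly with D.
    return (w ** (win_factor + draw_factor_w * d)
            - l ** (lose_factor + draw_factor_l * d))

# Hypothetical WDL triple used only to check the reduction to W - L.
w, d, l = 0.50, 0.20, 0.30
for f in (qtest3, qtest4, qtest5, qtest6):
    print(f.__name__, math.isclose(f(w, d, l), w - l))
```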

To test a function, its parameters are first tuned against lc0 master to find out whether any parameter set increases Elo. If the tuned parameters show an Elo gain against lc0 master, a second test (and/or tune) is performed against Stockfish.

Results

Qtest3, Qtest3bis and Qtest5 were discarded: the results showed a higher draw rate without conclusive Elo gains.
Qtest4 showed barely any gain (+5.4 Elo against lc0 master and approximately zero (≈ -0) against SF).
Qtest6 showed good results against lc0 master (from +19.1 Elo with T79 to +24.1 Elo with T80). A test against SF with the same parameters was not conclusive. A second tune was then performed against SF; the result showed no Elo loss against SF15 and +16.7 Elo against SF14.
The new Q function was then implemented in dag-master, where the previous parameters showed no gains. A new tune is currently ongoing, with some results showing a +12.9 Elo gain against SF.
The detailed results are available in #test-results on Discord.
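
For readers unfamiliar with how a match result maps to an Elo figure, here is a minimal sketch using the standard logistic Elo model. The PR does not state how its Elo numbers were computed, so this is an assumption, and the game counts in the example are made up.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by an average score under the logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_from_results(wins: int, draws: int, losses: int) -> float:
    """Elo difference from game counts; a draw counts as half a point."""
    games = wins + draws + losses
    return elo_from_score((wins + 0.5 * draws) / games)

# Hypothetical match: 30 wins, 60 draws, 10 losses for the tested build.
print(round(elo_from_results(30, 60, 10), 1))
```

Small gains such as the ones reported here correspond to scores only slightly above 50%, so they need many games before they are distinguishable from noise.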

Qtest6 parameters:

Based on lc0 master:
Vs lc0:
WinFactor = 0.66302,
DrawFactorW = 0.68633,
LoseFactor = 0.85416,
DrawFactorL = 1.0

Vs SF:
WinFactor = 0.584075451002713,
DrawFactorW = 1.08833880420444,
LoseFactor = 1.27491899534857,
DrawFactorL = 0.0

Based on dag-master:
WinFactor = 0.46736995442197293,
DrawFactorW = 1.3447065755753542,
LoseFactor = 0.8100584463933771,
DrawFactorL = 1.0148543577433364
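
To get a feel for what these parameter sets do, the sketch below evaluates Qtest6 for each of them on a single made-up WDL triple and prints the result next to W - L. The parameter values are copied from the lists above; the `qtest6` function is one reading of the formula in this PR, and the sample triple and dictionary labels are assumptions for illustration.

```python
def qtest6(w, d, l, win_factor, draw_factor_w, lose_factor, draw_factor_l):
    # Q = W^(WinFactor + DrawFactorW * D) - L^(LoseFactor + DrawFactorL * D)
    return (w ** (win_factor + draw_factor_w * d)
            - l ** (lose_factor + draw_factor_l * d))

PARAMETER_SETS = {
    "lc0 master, vs lc0": dict(
        win_factor=0.66302, draw_factor_w=0.68633,
        lose_factor=0.85416, draw_factor_l=1.0),
    "lc0 master, vs SF": dict(
        win_factor=0.584075451002713, draw_factor_w=1.08833880420444,
        lose_factor=1.27491899534857, draw_factor_l=0.0),
    "dag-master": dict(
        win_factor=0.46736995442197293, draw_factor_w=1.3447065755753542,
        lose_factor=0.8100584463933771, draw_factor_l=1.0148543577433364),
}

# Hypothetical (W, D, L) triple, not taken from any real position.
sample = (0.50, 0.20, 0.30)

results = {name: qtest6(*sample, **params)
           for name, params in PARAMETER_SETS.items()}

print("W - L:", sample[0] - sample[2])
for name, q in results.items():
    print(f"{name}: Q = {q:.4f}")
```

Since W and L lie in [0, 1] and all exponents here are non-negative, each term stays in [0, 1] and Q stays in [-1, 1].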

Issues:

  • Drawscore is not implemented,
  • No tests at long TC or high node counts have been performed,
  • The tests showed that a set of parameters that works against lc0 master does not work against SF. Does a set tuned against SF work against any engine?
  • A set of parameters that worked for lc0 master did not work for dag-master, so a major change in the engine might require new parameters.
  • A test with T74 showed poor results, while good results were obtained with T79 and T80. How do other nets behave?

@Naphthalin
Contributor

This PR tests various WDL transformations with the goal of having a better Q target for search to optimize. Since this was successfully implemented in #1791 in a more principled way, I'm closing this PR.

If you want to further test this idea, it would need to be done against a new baseline.

@Naphthalin Naphthalin closed this Jun 6, 2023