Implementation of a new Q function #1802

Rinuell · 2022-12-03T15:19:34Z

Following Crem's idea in early August to change the Q function after WCCC, I used his code to test different functions.
The current Q function is Q = W - L. It does not take D into account, so between two lines with the same W - L there is no way to favor the line with the lower D. That might lead Leela into more drawish lines and result in "wasted" opportunities. The goal of the new function is to:

Include D in some way in the calculation of Q,
Return to the current Q function with a specific set of parameters,
At the very least, not lose Elo against superior opponents and gain Elo against weaker ones.
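
To make the limitation of Q = W - L concrete, here is a minimal Python sketch. The two WDL triples are made up for illustration and are not taken from any test in this PR; the function name is hypothetical as well.

```python
import math


def q_current(w: float, d: float, l: float) -> float:
    """Current Q: W - L. The draw probability d is accepted but unused."""
    return w - l


# Hypothetical evaluations (W, D, L); each triple sums to 1.
sharp_line = (0.40, 0.30, 0.30)
drawish_line = (0.15, 0.80, 0.05)

q_sharp = q_current(*sharp_line)
q_drawish = q_current(*drawish_line)

# Both lines get (numerically) the same Q, so nothing in Q itself
# lets search prefer the line with the lower D.
same_q = math.isclose(q_sharp, q_drawish)
print(q_sharp, q_drawish, same_q)
```

Any function that meets the first goal above has to break this tie in some way.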

Functions tested

Qtest3 : Q = (W-L)(1-D)^DrawFactor
Qtest3bis : Q = (W-L)(1-D^DrawFactor)
Qtest4 : Q = W^WinFactor - L^LoseFactor
Qtest5 : Q = (W^WinFactor - L^LoseFactor)(1-D)^DrawFactor
Qtest6 : Q = W^(WinFactor + DrawFactorW×D) - L^(LoseFactor + DrawFactorL×D)
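
For readers who prefer code to formulas, the functions above can be written out as a short Python sketch. This is not the C++ code from the PR: the function and argument names are illustrative, and the sketch assumes W, D and L are probabilities in [0, 1] that sum to 1 and that every exponent stays positive. It also checks which parameter choices give back the current Q = W - L.

```python
import math


def qtest3(w, d, l, draw_factor):
    return (w - l) * (1.0 - d) ** draw_factor


def qtest3bis(w, d, l, draw_factor):
    return (w - l) * (1.0 - d ** draw_factor)


def qtest4(w, d, l, win_factor, lose_factor):
    return w ** win_factor - l ** lose_factor


def qtest5(w, d, l, win_factor, lose_factor, draw_factor):
    return (w ** win_factor - l ** lose_factor) * (1.0 - d) ** draw_factor


def qtest6(w, d, l, win_factor, draw_factor_w, lose_factor, draw_factor_l):
    # Assumes both exponents are positive; 0.0 raised to a negative
    # exponent would raise ZeroDivisionError in Python.
    return (w ** (win_factor + draw_factor_w * d)
            - l ** (lose_factor + draw_factor_l * d))


# Hypothetical WDL triple, used only to check the reduction to W - L.
w, d, l = 0.40, 0.35, 0.25
baseline = w - l

reductions = {
    "qtest3": qtest3(w, d, l, draw_factor=0.0),
    "qtest4": qtest4(w, d, l, win_factor=1.0, lose_factor=1.0),
    "qtest5": qtest5(w, d, l, win_factor=1.0, lose_factor=1.0, draw_factor=0.0),
    "qtest6": qtest6(w, d, l, win_factor=1.0, draw_factor_w=0.0,
                     lose_factor=1.0, draw_factor_l=0.0),
}
all_reduce = all(math.isclose(q, baseline) for q in reductions.values())
print(reductions, all_reduce)
```

Qtest3bis is left out of the check because it only approaches W - L as DrawFactor grows large (for D < 1), rather than matching it at a finite value.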

To test a function, a tune is first run against lc0 master to find out whether a parameter set exists that increases Elo. If the tuned parameters show an Elo gain against lc0 master, a second test (and/or tune) is run against Stockfish.

Results

Qtest3, Qtest3bis and Qtest5 were discarded: the results showed a higher draw rate without conclusive Elo gains.
Qtest4 barely showed any gains (+5.4 Elo against lc0 master and ≈ -0 against SF).
Qtest6 showed good results against lc0 master (from +19.1 Elo with T79 to +24.1 with T80). A test against SF with the same parameters was not conclusive. A second tune was then run against SF, and the result showed no Elo loss against SF15 and +16.7 against SF14.
That new Q function was then implemented in dag-master. The previous parameters showed no gains there. A new tune is currently ongoing, with some results showing a +12.9 Elo gain against SF.
The detailed results are in #test-results on Discord.

Qtest6 parameters :

Based on lc0 master :
Vs lc0 :
WinFactor = 0.66302,
DrawFactorW = 0.68633,
LoseFactor = 0.85416,
DrawFactorL = 1.0

Vs SF :
WinFactor = 0.584075451002713,
DrawFactorW = 1.08833880420444,
LoseFactor = 1.27491899534857,
DrawFactorL = 0.0

Based on dag-master :
WinFactor = 0.46736995442197293,
DrawFactorW = 1.3447065755753542,
LoseFactor = 0.8100584463933771,
DrawFactorL = 1.0148543577433364
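
As a rough illustration of what these sets do, the sketch below plugs them into the Qtest6 formula for two made-up WDL triples that share the same W - L. The dictionary keys and the triples are hypothetical; only the parameter values come from the lists above.

```python
def qtest6(w, d, l, win_factor, draw_factor_w, lose_factor, draw_factor_l):
    """Qtest6: Q = W^(WinFactor + DrawFactorW*D) - L^(LoseFactor + DrawFactorL*D)."""
    return (w ** (win_factor + draw_factor_w * d)
            - l ** (lose_factor + draw_factor_l * d))


# Parameter values copied from the lists above; the keys are illustrative.
PARAMS = {
    "master_vs_lc0": dict(win_factor=0.66302, draw_factor_w=0.68633,
                          lose_factor=0.85416, draw_factor_l=1.0),
    "master_vs_sf": dict(win_factor=0.584075451002713,
                         draw_factor_w=1.08833880420444,
                         lose_factor=1.27491899534857, draw_factor_l=0.0),
    "dag_master": dict(win_factor=0.46736995442197293,
                       draw_factor_w=1.3447065755753542,
                       lose_factor=0.8100584463933771,
                       draw_factor_l=1.0148543577433364),
}

# Hypothetical (W, D, L) triples with the same W - L of about 0.10.
sharp_line = (0.40, 0.30, 0.30)
drawish_line = (0.15, 0.80, 0.05)

results = {}
for name, params in PARAMS.items():
    results[name] = (qtest6(*sharp_line, **params),
                     qtest6(*drawish_line, **params))
    print(name, results[name])
```

For these two triples, all three sets score the lower-D line higher, whereas W - L ties them. That only shows the direction of the bias on one example and says nothing about playing strength.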

Issues :

There is no implementation of drawscore,
No tests with long TC / node counts have been performed,
The tests showed that a set of parameters that works against lc0 master does not work against SF. Does a set tuned against SF work against any engine?
A set of parameters that worked for lc0 master did not work for dag-master. A major change in the engine might require new parameters.
A test with T74 showed poor results, while good results were obtained with T79 and T80. How do other nets behave?

Naphthalin · 2023-06-06T11:54:55Z

This PR tests various WDL transformations with the goal of having a better Q target for search to optimize. Since this was successfully implemented in #1791 in a more principled way, I'm closing this PR.

If you want to further test this idea, it would need to be done against a new baseline.

Rinuell added 7 commits September 3, 2022 15:06

Q=W^WinFactor - L^LoseFactor

ec54b14

Q=W^(WinFactor + DrawFactor * d) - L^LoseFactor

068ff3e

Update for negative DrawFactorL

b7d271e

Update for negative DrawFactorL

2181a43

Update

69d22bb

Update

0d1ae0c

Update

fa129fc

Naphthalin closed this Jun 6, 2023
