Prove mated-in scores. #4976
This fixes the issue that Stockfish can output non-proven mated-in scores if the search is stopped prematurely, by time control or by a node limit, before it has explored other lines in which the mate could have been delayed or refuted.
The fix also replaces the staving-off from proven mated scores in a multithreaded environment.
Issue reported on the mate tracker repo by @robertnurnberg, who also co-authored this PR.
Special thanks to @AndyGrant for outlining that a fix is eventually possible.
Passed Adj off SMP STC:
https://tests.stockfishchess.org/tests/view/659d8c1b79aa8af82b967bc9
LLR: 2.95 (-2.94,2.94) <-1.75,0.25>
Total: 143064 W: 35747 L: 35647 D: 71670
Ptnml(0-2): 182, 16500, 38091, 16554, 205
Passed Adj off SMP LTC:
https://tests.stockfishchess.org/tests/view/659e3b6179aa8af82b968c9e
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 106678 W: 26449 L: 26318 D: 53911
Ptnml(0-2): 24, 11410, 30344, 11533, 28
Passed all tests in mate tracker without any better mate for the opponent found, with 1 thread and with multiple threads.
bench: 1219824
Co-Authored-By: Robert Nürnberg <[email protected]>
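The idea described above can be pictured with a minimal sketch. This is not the code in this PR: it only assumes that a mated-in score coming from an iteration that was aborted by a time or node limit is unproven, and that the result of the last completed iteration can be reported instead. The names `IterationResult` and `pick_reported_result`, and the numeric values of `VALUE_MATE` and `MAX_PLY`, are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical score scale, for illustration only (not taken from the PR).
VALUE_MATE = 32000
MAX_PLY = 246
VALUE_MATED_IN_MAX_PLY = -VALUE_MATE + MAX_PLY


@dataclass
class IterationResult:
    score: int        # internal score from the side to move's point of view
    pv: List[str]     # principal variation as UCI move strings
    completed: bool   # False if the iteration was cut short by time or nodes


def is_mated_in_score(score: int) -> bool:
    """True if the score says the side to move is getting mated."""
    return score <= VALUE_MATED_IN_MAX_PLY


def pick_reported_result(current: IterationResult,
                         last_completed: Optional[IterationResult]) -> IterationResult:
    """Report the last completed iteration instead of an unproven mated-in score."""
    unproven = not current.completed and is_mated_in_score(current.score)
    if unproven and last_completed is not None:
        return last_completed
    return current
```

With this sketch, an aborted iteration that claims "mated in 2" is replaced by the previous completed result, while a completed iteration with the same score is reported unchanged.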
For completeness, here are some results from the matetrack repo. First, a check of whether incorrect mated-in scores are reported, with the help of this branch.
> ./test_engine.sh
Running ./master on matedtrack.epd with --threads 1
--nodes 10 done
--nodes 100 done
--nodes 1000 done
Found mate #-2 (better) for FEN "k7/n1RN4/B7/8/K7/4n3/8/8 b - -".
PV: a7c6 d7b6 a8b8 c7b7
Found mate #-2 (better) for FEN "2b5/2N2p2/6p1/4N3/1B1k1p1r/Kp6/n7/3Q4 b - -".
PV: d4e5 d1d5 e5f6 c7e8
--nodes 10000 done
Found mate #-6 (better) for FEN "8/8/8/4p3/5B2/6p1/7k/4KBN1 b - -".
PV: h2h1 f4g3 e5e4 g3f4 h1g1 f4e3 g1h2 e1f2 h2h1 f1g2 h1h2 e3f4
--nodes 100000 done
Found mate #-4 (better) for FEN "5Kn1/5pQ1/7p/8/4NB1p/p2B3p/qpNp2p1/2rrb1kb b - -".
PV: a2d5 g7d4 d5d4 c2d4 c1c8 f8g7 c8f8 d4f3
Found mate #-4 (better) for FEN "q4rbr/2b3N1/ppR2p1p/Nn2kp1B/4p2n/PpPQR2P/1P4P1/2B3K1 b - -".
PV: a8e8 e3e4 f5e4 d3g3 e5d5 c3c4 d5d4 a5b3
Found mate #-5 (better) for FEN "8/3p4/3Pp3/N7/1p2P3/3p1pB1/1ppP1K2/brrk4 b - -".
PV: e6e5 g3e5 b4b3 e5g7 d1d2 a5c4 d2d1 g7e5 d3d2 c4e3
Found mate #-4 (better) for FEN "3r2k1/5pp1/pN2pb2/Q7/1R6/P2q4/5PPP/2r2RK1 w - -".
PV: h2h3 d3f1 g1h2 d8d3 f2f3 d3f3 g2f3 f1f2
Found mate #-5 (better) for FEN "5rk1/pQp3pp/4b3/4p1N1/3P4/6P1/PP2q2P/3R2KR w - -".
PV: g5f3 e6h3 b7b3 g8h8 b3c2 e2f3 d4e5 f3e3 c2f2 e3f2
Found mate #-6 (better) for FEN "r6k/1p2RQ1p/3p4/p1q3r1/8/1P4N1/P1b3PP/4R2K b - -".
PV: c5b5 g3h5 a8g8 f7f6 g5g7 e7g7 b5f1 e1f1 c2b3 g7g8 h8g8 f6g7
Found mate #-6 (better) for FEN "7r/1q3pRp/1n2b3/r2p4/1ppK4/2p5/1bR3B1/7k b - -".
PV: h1g1 g2f3
--nodes 1000000 done
Found mate #-5 (better) for FEN "6R1/2p4K/8/8/7k/3r4/8/6R1 b - -".
PV: d3e3 g8g6 e3e7 h7h6 e7e3 g1g5 e3e5 g5e5 h4h3 e5h5
> ./test_engine.sh
Running ./patch on matedtrack.epd with --threads 1
--nodes 10 done
--nodes 100 done
--nodes 1000 done
--nodes 10000 done
--nodes 100000 done
--nodes 1000000 done
Now the same for multiple threads:
> ./test_engine.sh
Running ./master on matedtrack.epd with --threads 2
--nodes 10 done
--nodes 100 done
--nodes 1000 done
--nodes 10000 done
--nodes 100000 done
Found mate #-4 (better) for FEN "2b1n3/rn4B1/pp6/1r1N1k1p/4p2K/4p3/8/Q4R2 b - -".
PV: f5g6 a1e5 b5d5 e5e8 g6g7 e8e7 g7g8 f1f8
Found mate #-8 (better) for FEN "1K6/P4P1P/Pp3p2/1P2PP2/5b1N/r6N/7p/7k b - -".
PV: f6e5 a7a8q e5e4 h3f4 a3a4 f7f8q a4c4 h7h8q h1g1 f8g8 g1f2 h8b2 c4c2 b2c2 f2f1 c2e2
--nodes 1000000 done
> ./test_engine.sh
Running ./patch on matedtrack.epd with --threads 2
--nodes 10 done
--nodes 100 done
--nodes 1000 done
--nodes 10000 done
--nodes 100000 done
--nodes 1000000 done
> ./test_engine.sh
Running ./master on matedtrack.epd with --threads 8
--nodes 10 done
Found mate #-1 (better) for FEN "8/K1p1R3/2B5/1p6/1P6/N1Pk4/7r/1N6 b - -".
PV: h2h1 c6b5
Found mate #-1 (better) for FEN "5Bn1/2p1p3/NpP1k3/4Pp1B/3P4/2K1pP2/6P1/n5b1 b - -".
PV: e3e2 a6c7
Found mate #-1 (better) for FEN "n7/4K3/1p1N1Nb1/B3k3/1pppppPR/7p/2PPPP2/2n3bq b - g3".
PV: h3h2 d6c4
Found mate #-1 (better) for FEN "8/4p2Q/3pP3/3P4/3P4/2K5/pppp4/rrkbb3 b - -".
PV: d1e2 h7c2
Found mate #-1 (better) for FEN "8/3B4/8/2p5/2p3p1/3pK3/2pB2P1/N2k4 b - -".
PV: c4c3 d7g4
Found mate #-2 (better) for FEN "k1b2bR1/Pp6/1P4K1/8/8/8/8/8 b - -".
PV: c8h3 g8f8 h3c8 f8c8
Found mate #-1 (better) for FEN "n2n4/1bp5/q2p4/1r1B4/pr1P4/kQ1K4/P2P4/RR6 b - -".
PV: a4b3 a2b3
Found mate #-2 (better) for FEN "rr5k/RR6/7K/8/8/8/8/8 b - -".
PV: b8b7 a7a8 b7b8 a8b8
Found mate #-1 (better) for FEN "6br/3K1Pr1/5p1q/R1pk3B/RPppppp1/8/2PPPPN1/8 b - b3".
PV: c4c3 a5c5
Found mate #-1 (better) for FEN "6R1/2RP3k/3p4/6Kp/4b3/5p2/4r3/7r b - -".
PV: f3f2 d7d8q
Found mate #-1 (better) for FEN "8/8/8/7k/8/8/2p1p1Rp/2KbR3 b - -".
PV: h2h1q e1h1
Found mate #-1 (better) for FEN "7k/7p/P1P1PNpR/3p1pP1/Kp3pbP/n2p1r2/6r1/6bq b - -".
PV: d3d2 h6h7
Found mate #-1 (better) for FEN "7k/P1pp1P2/4p1pK/q5pp/2b5/8/8/3R3n b - -".
PV: g5g4 f7f8q
Found mate #-1 (better) for FEN "N5R1/rp5q/kp6/1p1p4/1p1p4/1N1P4/1KPP3P/8 b - -".
PV: a7a8 g8a8
Found mate #-1 (better) for FEN "3R4/8/8/1P6/p1R5/B6B/1nPkPK2/br6 b - -".
PV: b2d3 d8d3
Found mate #-2 (better) for FEN "5b2/2n2K1P/6N1/2P2pP1/1B1k1N2/2pP1p2/pP3Pp1/1n3R2 b - -".
PV: c3c2 h7h8q f8g7 h8g7
Found mate #-2 (better) for FEN "8/8/R1Rn4/2p3br/1pp3N1/2p1pQpr/4K1p1/n5kb b - -".
PV: c3c2 a6a1 c2c1q a1c1
Found mate #-1 (better) for FEN "k5nK/1R3p1p/8/p7/3p1B2/2p3pr/n3pbB1/8 b - -".
PV: c3c2 b7f7
Found mate #-1 (better) for FEN "1BB5/1p3K2/6pb/1pR5/3pk3/3ppp2/r2r4/3N1n2 b - -".
PV: e3e2 c8b7
Found mate #-1 (better) for FEN "q1nnQK1k/rp3p2/p3p3/6N1/5rPb/p6R/RPPPPP2/8 b - -".
PV: a6a5 h3h4
Found mate #-1 (better) for FEN "NNkn2RK/1p2b1p1/1p5p/3B2P1/1p3B2/1r1P2p1/8/1b2r3 b - -".
PV: g3g2 a8b6
Found mate #-1 (better) for FEN "6K1/p3p3/5kN1/1b2pP2/pBp3BP/1P6/8/rrb5 b - -".
PV: a4a3 b4e7
Found mate #-1 (better) for FEN "1n1rbN1b/2RpN3/P4r2/p2p4/kp3p2/1p1B3p/1P1P4/n3RK2 b - -".
PV: h3h2 e1a1
Found mate #-2 (better) for FEN "4k3/PR6/5p2/8/8/1K4p1/1pp1p1pq/1b6 b - -".
PV: c2c1q a7a8q c1c8 a8c8
Found mate #-1 (better) for FEN "bbnK1Q2/8/1p1k4/r2P3q/P3P3/Pnp1p1p1/4PN2/8 b - -".
PV: c8e7 f8e7
Found mate #-1 (better) for FEN "7k/4pPpp/8/K2p1PP1/4P1P1/8/Rp2ppp1/7q b - -".
PV: d5d4 f7f8q
Found mate #-1 (better) for FEN "7k/p5pb/3p2B1/1n5Q/1qp5/2p5/4p1K1/4b2n b - -".
PV: c3c2 h5h7
Found mate #-1 (better) for FEN "8/7P/1P5B/2B1Q1n1/3nn2P/1PRn1knR/3nnn1K/2B1nQBn w - -".
PV: b3b4 d2f1
Found mate #-1 (better) for FEN "8/5B2/1K4p1/1P1p4/1Nk1b1p1/PnN1P1P1/1P6/8 b - -".
PV: e4f5 f7d5
--nodes 100 done
--nodes 1000 done
--nodes 10000 done
--nodes 100000 done
Found mate #-5 (better) for FEN "8/3p4/3Pp3/N7/1p2P3/3p1pB1/1ppP1K2/brrk4 b - -".
PV: e6e5 g3e5 b4b3 e5d4 d1d2 a5c4 d2d1 d4e5 d3d2 c4e3
--nodes 1000000 done
Found mate #-4 (better) for FEN "k7/3K4/1Q1p1p2/2pPPPp1/2pPPPp1/1pnrqrnp/4b3/4b3 b - -".
PV: e3e4 d7c8 e4f5 e5e6 f5e6 d5e6 h3h2 b6a6
Found mate #-5 (better) for FEN "8/2q2p2/1r1b3Q/4n3/ppnpkN2/1p1N2pP/6P1/1B2K3 b - -".
PV: e5f3 g2f3 e4f5 f4e6 f7e6 d3e5 d4d3 e5g4 c7e7 b1d3
> ./test_engine.sh
Running ./patch on matedtrack.epd with --threads 8
--nodes 10 done
--nodes 100 done
--nodes 1000 done
--nodes 10000 done
--nodes 100000 done
--nodes 1000000 done
Found mate #-4 (better) for FEN "3N1N1k/1p5P/1Pp2Ppn/4K3/p5p1/4p1Pp/2P1P2P/8 b - -".
PV: c6c5
The last output is a bit unexpected. The position is supposed to be a Edit: So
Finally, the output of the standard runs for the matetrack repo:
> python matecheck.py --engine ./master
Using ./master with --nodes 1000000
Total fens: 6560
Found mates: 3573
Best mates: 2417
Complete PVs: 3411/3573 (95.5%)
> python matecheck.py --engine ./patch
Using ./patch with --nodes 1000000
Total fens: 6560
Found mates: 3573
Best mates: 2416
Complete PVs: 3411/3573 (95.5%)
> python matecheck.py --epdFile matedtrack.epd --engine ./master
Using ./master with --nodes 1000000
Total fens: 6560
Found mates: 3901
Best mates: 3155
Complete PVs: 3795/3901 (97.3%)
Better mates: 1
> python matecheck.py --epdFile matedtrack.epd --engine ./patch
Using ./patch with --nodes 1000000
Total fens: 6560
Found mates: 3895
Best mates: 3144
Complete PVs: 3790/3895 (97.3%)
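To make the "Found mate #-N (better)" lines and the "Better mates" counter concrete, here is a minimal sketch of the comparison such a check could perform. It is not the actual matetrack code: the function names and the label strings are hypothetical. It relies only on the UCI convention that `score mate N` counts moves (not plies) and is negative when the side to move is being mated, and it assumes the known best mate for each FEN is stored as a signed move count in the same convention.

```python
import re
from typing import Optional

# Matches the mate score in a UCI "info" line, e.g. "score mate -2".
MATE_RE = re.compile(r"\bscore mate (-?\d+)\b")


def reported_mate(info_line: str) -> Optional[int]:
    """Return the signed mate distance from a UCI info line, or None."""
    match = MATE_RE.search(info_line)
    return int(match.group(1)) if match else None


def classify(reported: int, best_known: int) -> str:
    """Compare a reported mate with the best known mate for the position.

    Both values are signed move counts in the UCI convention.
    The returned labels are hypothetical names for illustration.
    """
    if (reported > 0) != (best_known > 0):
        return "wrong side"
    if abs(reported) < abs(best_known):
        return "better"   # shorter than the known mate: unproven or a new find
    if abs(reported) == abs(best_known):
        return "best"
    return "worse"
```

For a mated-in position with a known mate of #-3, a reported `score mate -2` would be classified as "better", which is the kind of output the patch is meant to eliminate.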
I have now run the same script 25 times with 4 threads and couldn't replicate it. Could you please give me an idea of how often this happens with the same thread count (say 8)? For example, could you run the command 100 times to see how many times it happens?
The PR still has two problems
Was there ever a demonstration of a mated score in a non-mated position? In all of our conversations on Discord, we've concerned ourselves only with shorter-than-possible mated scores. Presumably it is exceptionally rare, since it would have to occur at a low enough depth that the aspiration windows still allow mated-in-x scores to beat alpha. And once you get to some depth, you base the window around the previous scores...?
Yes, we have FENs where we can report a mated-in score in positions where we ourselves can mate. At any rate, the fix addresses both cases. It does happen, even if it is exceptionally rare.
Just an FYI: you are right about how it could happen, by beating a too-low alpha before the aspiration window is properly adjusted.
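The mechanism discussed here can be illustrated with a small sketch. It is not Stockfish's search code: the constants, the `min_depth` threshold and the `delta` width are hypothetical values. It only shows why a mated-in score can beat alpha while the window is still fully open, but not once the window is centered on an ordinary previous score.

```python
from typing import Tuple

# Hypothetical score scale, for illustration only.
VALUE_INFINITE = 32001
VALUE_MATE = 32000


def aspiration_window(depth: int, prev_score: int,
                      delta: int = 10, min_depth: int = 4) -> Tuple[int, int]:
    """Return (alpha, beta): a full window at low depth, a narrow one later."""
    if depth < min_depth:
        return -VALUE_INFINITE, VALUE_INFINITE
    alpha = max(prev_score - delta, -VALUE_INFINITE)
    beta = min(prev_score + delta, VALUE_INFINITE)
    return alpha, beta


def beats_alpha(score: int, alpha: int) -> bool:
    """A score is only returned as an exact value if it is above alpha."""
    return score > alpha


mated_in_2 = -VALUE_MATE + 4  # mated in 2 moves, i.e. 4 plies

low_alpha, _ = aspiration_window(depth=2, prev_score=0)
high_alpha, _ = aspiration_window(depth=10, prev_score=-50)
```

In this sketch, `beats_alpha(mated_in_2, low_alpha)` holds at depth 2, while `beats_alpha(mated_in_2, high_alpha)` does not at depth 10, so the mated score would show up only as a fail-low there until the window is widened.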
See joergoster/Stockfish-old#13 for an example where it happened for Huntsman1.
This addresses the issue where Stockfish may output non-proven checkmate scores if the search is halted prematurely, by either a time control or a node limit, before it has explored other lines in which the checkmate could have been delayed or refuted.
The fix also replaces the staving-off from proven mated scores in a multithreaded environment, so that additional threads now help rather than hurt (previously, 1 thread was better at proving mated-in scores than multiple threads).
Issue reported on the mate tracker repo by @robertnurnberg, who also co-authored this PR.
Special thanks to @AndyGrant for outlining that a fix is eventually possible.
Passed Adj off SMP STC:
https://tests.stockfishchess.org/tests/view/65a125d779aa8af82b96c3eb
LLR: 2.96 (-2.94,2.94) <-1.75,0.25>
Total: 303256 W: 75823 L: 75892 D: 151541
Ptnml(0-2): 406, 35269, 80395, 35104, 454
Passed Adj off SMP LTC:
https://tests.stockfishchess.org/tests/view/65a37add79aa8af82b96f0f7
LLR: 2.94 (-2.94,2.94) <-1.75,0.25>
Total: 56056 W: 13951 L: 13770 D: 28335
Ptnml(0-2): 11, 5910, 16002, 6097, 8
Passed all tests in matetrack without any better mate for the opponent found, with 1 thread and with multiple threads.
Fixed bugs in #4976
closes #4990
Bench: 1308279
Co-Authored-By: Robert Nürnberg <[email protected]>
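As a quick sanity check on the test statistics quoted above, the following sketch verifies that the win/loss/draw counts add up to the total, and that the pentanomial (`Ptnml`) counts add up to half the total, on the assumption that each pentanomial entry counts one game pair. The helper name `counts_consistent` is hypothetical.

```python
from typing import Sequence


def counts_consistent(total: int, wins: int, losses: int, draws: int,
                      ptnml: Sequence[int]) -> bool:
    """Check W + L + D == Total and that the game pairs cover all games."""
    return wins + losses + draws == total and 2 * sum(ptnml) == total


# Figures copied from the STC and LTC results above.
stc_ok = counts_consistent(303256, 75823, 75892, 151541,
                           [406, 35269, 80395, 35104, 454])
ltc_ok = counts_consistent(56056, 13951, 13770, 28335,
                           [11, 5910, 16002, 6097, 8])
```

Both checks pass for the figures quoted in this commit message.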