Remove classical evaluation #4674

Closed
wants to merge 2 commits into from

vondele
Member

@vondele vondele commented Jul 11, 2023

Since the introduction of NNUE (first released with Stockfish 12), we have maintained the classical evaluation as part of SF in frozen form. The idea that this code could lead to further inputs to the NN or search did not materialize. Now, after five releases, this PR removes the classical evaluation from SF. Even though this evaluation is probably the best of its class, it has become unimportant for the engine's strength, and there is little need to maintain this code (roughly 25% of SF) going forward, or to expend resources on trying to improve its integration in the NNUE eval.

Indeed, it still had a very limited use in the current SF, namely for the evaluation of positions that are nearly decided based on material difference, where the speed of the classical evaluation outweighs its inaccuracies. The impact on strength is small, roughly 2 Elo, and probably decreasing in importance as the TC grows.

Potentially, removal of this code could lead to the development of techniques for faster but less accurate NN evaluation of certain positions.
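
The limited use described above amounts to a per-position choice between two evaluators. A minimal sketch of that idea follows; `Position`, `materialBalance`, `choose_evaluator`, and the `LopsidedThreshold` value are hypothetical names and numbers chosen for illustration, not Stockfish's actual interfaces or cut-off.

```cpp
#include <cstdlib>

// Hypothetical stand-in for an engine position; not Stockfish's real class.
struct Position {
    int materialBalance;  // material difference from the side to move's view
};

// Assumed cut-off for "nearly decided on material"; the real value is not
// stated in this PR.
constexpr int LopsidedThreshold = 2000;

enum class Evaluator { Classical, NNUE };

// Use the fast hand-crafted evaluation only when material already decides
// the game; otherwise fall back to the slower, more accurate network.
inline Evaluator choose_evaluator(const Position& pos) {
    return std::abs(pos.materialBalance) > LopsidedThreshold ? Evaluator::Classical
                                                             : Evaluator::NNUE;
}
```

After this PR, the equivalent of `choose_evaluator` would always pick the network, which is where the measured loss of roughly 2 Elo comes from.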

STC
https://tests.stockfishchess.org/tests/view/64a320173ee09aa549c52157
Elo: -2.35 ± 1.1 (95%) LOS: 0.0%
Total: 100000 W: 24916 L: 25592 D: 49492
Ptnml(0-2): 287, 12123, 25841, 11477, 272
nElo: -4.62 ± 2.2 (95%) PairsRatio: 0.95

LTC
https://tests.stockfishchess.org/tests/view/64a320293ee09aa549c5215b
Elo: -1.74 ± 1.0 (95%) LOS: 0.0%
Total: 100000 W: 25010 L: 25512 D: 49478
Ptnml(0-2): 44, 11069, 28270, 10579, 38
nElo: -3.72 ± 2.2 (95%) PairsRatio: 0.96

VLTC SMP
https://tests.stockfishchess.org/tests/view/64a3207c3ee09aa549c52168
Elo: -1.70 ± 0.9 (95%) LOS: 0.0%
Total: 100000 W: 25673 L: 26162 D: 48165
Ptnml(0-2): 8, 9455, 31569, 8954, 14
nElo: -3.95 ± 2.2 (95%) PairsRatio: 0.95

Bench: 1590774
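
For readers who want to check the figures, the Elo and PairsRatio values above can be reproduced from the raw counts. The sketch below assumes the standard logistic Elo formula applied to the score fraction, and takes PairsRatio to be the number of game pairs scoring more than one point divided by the number scoring less than one point. Both assumptions match the reported numbers, but the function names are illustrative and this is not the test framework's own code.

```cpp
#include <array>
#include <cmath>

// Win/loss/draw totals as listed on the "Total:" lines.
struct MatchResult {
    long wins;
    long losses;
    long draws;
};

// Fraction of available points scored (a draw counts as half a point).
inline double score_fraction(const MatchResult& r) {
    const double games = static_cast<double>(r.wins + r.losses + r.draws);
    return (static_cast<double>(r.wins) + 0.5 * static_cast<double>(r.draws)) / games;
}

// Logistic Elo difference implied by the score fraction.
inline double elo_difference(const MatchResult& r) {
    const double s = score_fraction(r);
    return -400.0 * std::log10(1.0 / s - 1.0);
}

// ptnml holds counts of game pairs scoring 0, 0.5, 1, 1.5 and 2 points.
// Assumed definition: pairs won (1.5 or 2) over pairs lost (0 or 0.5).
inline double pairs_ratio(const std::array<long, 5>& ptnml) {
    return static_cast<double>(ptnml[3] + ptnml[4]) /
           static_cast<double>(ptnml[0] + ptnml[1]);
}
```

With the STC counts, `elo_difference({24916, 25592, 49492})` gives about -2.35 and `pairs_ratio({287, 12123, 25841, 11477, 272})` gives about 0.95, in line with the reported values.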

vondele added 2 commits July 11, 2023 17:04
@jhellis3
Contributor

Obviously I support this, but can we create a classical branch for posterity or even academic purposes, should people feel inclined? One of the original purposes of Stockfish was to serve as an example. There is a great deal of knowledge in those files, which should be preserved in an easily accessible way IMHO.

@Disservin
Member

> Obviously I support this, but can we create a classical branch for posterity or even academic purposes, should people feel inclined? One of the original purposes of Stockfish was to serve as an example. There is a great deal of knowledge in those files, which should be preserved in an easily accessible way IMHO.

I think a dedicated tag would fit in this case.

@jhellis3
Contributor

A branch would be nicer.

@vondele
Member Author

vondele commented Jul 11, 2023

There is a tag SF_classical that can be used to get the last classical version of SF.
git switch --detach SF_classical

@jhellis3
Contributor

And if someone wants to make a pull request to classical?

@vondele
Member Author

vondele commented Jul 11, 2023

It was already frozen; it wouldn't be merged.

@jhellis3
Contributor

How can a branch that doesn't exist yet be frozen? And who voted on that?

@Disservin
Member

The classical evaluation was frozen, not the non-existent branch. From now on, classical won't be maintained anymore.

@vondele
Member Author

vondele commented Jul 11, 2023

Patches to the classical code have been rejected for the last few years, if that's clearer.

@jhellis3
Contributor

No, that did not answer my questions.

@mstembera
Contributor

We never officially released the strongest classical version. Since it's about to be completely removed, IMO it would be nice if we did so now. If we can agree on this, I volunteer to run a small number of tests to figure out exactly what the strongest classical version was, along the lines of #3986 (comment). @vondele would you be ok w/ this?

@vondele
Member Author

vondele commented Jul 11, 2023

I think there is no point in doing so. The strongest released classical engine is tagged as sf_11; the strongest developed version is tagged as SF_classical. If it was stronger at a later point, that's a property of search, not of the evaluation.

@mstembera
Contributor

Right. The strongest classical version is tagged but was never released, and releasing it is what I'm asking for. W/o released binaries it only exists to us devs but not the chess community at large.

@vondele
Member Author

vondele commented Jul 11, 2023

I agree SF_classical has never been released. But there is no point in releasing it now; for the typical user this is just a weak engine. For a developer who might want to have a look at the code, it is easily available.

@vondele vondele added the "to be merged" label Jul 11, 2023
@vondele vondele closed this in af110e0 Jul 11, 2023
Joachim26 pushed a commit to Joachim26/StockfishNPS that referenced this pull request Jul 12, 2023

closes official-stockfish#4674

Bench: 1444646
@ChessOverflow

ChessOverflow commented Jul 17, 2023

Related to discussion #4678 and at @cj5716's request, I'm posting here a position that NNUE misevaluates.

  • Based on the NNUE evaluation, White is winning.
  • Based on the classical evaluation, Black is winning.

Current FEN is 8/p4p1p/5kp1/1p6/8/3b2P1/PPp2P1P/2R3K1 w - - 0 34
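
As a quick sanity check on the position, the piece-placement field of the FEN can be tallied with conventional textbook piece values. This is only a sketch: `piece_value` and `material_balance` are hypothetical helpers, and the values (pawn 1, knight 3, bishop 3, rook 5, queen 9) are an assumption for illustration, unrelated to what either Stockfish evaluation uses.

```cpp
#include <cctype>
#include <string>

// Conventional textbook values with pawn = 1 (an illustrative assumption).
inline int piece_value(char piece) {
    switch (std::tolower(static_cast<unsigned char>(piece))) {
        case 'p': return 1;
        case 'n': return 3;
        case 'b': return 3;
        case 'r': return 5;
        case 'q': return 9;
        default:  return 0;  // kings, digits and '/' carry no material
    }
}

// White's material minus Black's, read from the first FEN field only.
inline int material_balance(const std::string& fen) {
    int balance = 0;
    for (char c : fen) {
        if (c == ' ') break;  // piece placement ends at the first space
        const int v = piece_value(c);
        // Uppercase letters are White pieces, lowercase are Black.
        balance += std::isupper(static_cast<unsigned char>(c)) ? v : -v;
    }
    return balance;
}
```

For the FEN above this gives +1 for White (rook and five pawns against bishop and six pawns), so a bare material count says nothing about the black pawn on c2 and cannot settle which evaluation is right.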

@cj5716
Contributor

cj5716 commented Jul 17, 2023

> Related to discussion #4678 and at @cj5716's request, I'm putting here the positions that the NNUE misevaluates.
>
> • Base of NNUE evaluation, White is victorious.
> • Base of Classic evaluation, Black is victorious.
>
> Current FEN is 8/p4p1p/5kp1/1p6/8/3b2P1/PPp2P1P/2R3K1 w - - 0 34

Do you have a decisive answer as to which side is winning? How do you know that it is not HCE that misevaluates it?
A depth-47 search with homefish reports a score of -439 cp.

@ChessOverflow

ChessOverflow commented Jul 17, 2023

> do you have a decisive answer as to which side is winning?

Yes, Black is clearly winning. See my last comment in #4678.
