Stockfish removes classical evaluation

Ras · Post by **Ras** » Sat Jul 15, 2023 2:45 pm

amchess wrote: ↑Sat Jul 15, 2023 9:33 am
Instead, a real handicap mode should simulate the thinking system of a certain elo range, which is impossible now for Stockfish.

Isn't it possible by training a network on games of the desired strength range?

amchess · Post by **amchess** » Sat Jul 15, 2023 4:44 pm

Yes. If I'm not mistaken, the Maia project does exactly that. However, since Stockfish is superior to any human being even with the classical evaluation function, you just need to turn its positional factors on or off in the linear combination, depending on the level of play. These factors can be mapped to, for example, the well-known Steinitz elements, which are constantly being refined.
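To make the idea of switching terms in the linear combination concrete, here is a minimal sketch. It is not Stockfish code: the `Term` list only mirrors some labels from the classical eval trace, and `mask_for_elo`, its Elo thresholds, and the grouping of terms per level are all assumptions chosen for illustration. It also treats the evaluation as a plain sum, which is a simplification.

```cpp
#include <array>
#include <cstddef>

// Hypothetical term list; the names mirror labels from the classical eval trace.
enum Term : std::size_t {
    MATERIAL, IMBALANCE, MOBILITY, KING, THREAT, PASSED, SPACE, WINNABLE, TERM_NB
};

using TermValues = std::array<int, TERM_NB>;   // per-term scores in centipawns
using TermMask   = std::array<bool, TERM_NB>;  // true = term takes part in the sum

// Assumed mapping from playing level to the terms a player "sees".
// The thresholds are illustrative only, not calibrated against real games.
TermMask mask_for_elo(int elo) {
    TermMask m{};                 // every term starts switched off
    m[MATERIAL] = true;           // material is always counted
    if (elo >= 1200) { m[MOBILITY] = true; m[THREAT] = true; }
    if (elo >= 1600) { m[KING] = true; m[PASSED] = true; }
    if (elo >= 2000) { m[IMBALANCE] = true; m[SPACE] = true; m[WINNABLE] = true; }
    return m;
}

// Sum only the enabled terms.
int evaluate(const TermValues& values, const TermMask& mask) {
    int sum = 0;
    for (std::size_t i = 0; i < TERM_NB; ++i)
        if (mask[i])
            sum += values[i];
    return sum;
}
```

With made-up term values `{100, 5, 20, -30, 15, 40, 10, -8}`, calling `evaluate(values, mask_for_elo(1000))` counts material only, while `mask_for_elo(2000)` includes every term. Whether such a mask yields play that resembles a given rating band is exactly the open question in this thread.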

carldaman · Post by **carldaman** » Sun Jul 16, 2023 3:03 am

Long live HCE (especially tunable hand crafted eval)! NNUE is a blackbox, and that's not necessarily a good and useful thing. All it gives is pure strength, but sucks out style.

schahmatist · Post by **schahmatist** » Tue Jul 18, 2023 4:06 am

I think the last Stockfish 16 with classical evaluation (NNUE turned off) could be "tracked" as the strongest engine of its kind:

I used E. Nemeth's compile of Stockfish 16 without the embedded NNUE (the nnue file was removed, so it's purely HCE), and it looks like in blitz it is about 30-40 Elo stronger than the last NNUE-less Stockfish 200731. It is still among the top 14-15 engines.

Eduard · Post by **Eduard** » Tue Jul 18, 2023 1:15 pm

There are big changes at Stockfish. Since SF 16, not only has the classic evaluation been removed, new networks have also been trained. The SF 16 network had a dimension of 1536. Today's network is twice that size (3072)! It's only a few hours old!

I compiled two versions with an external network (AVX2 and BMI2). For anyone who wants to test them, here is a download:

https://pixeldrain.com/u/SM6WfNkX

It should be remembered that this network is now twice as large as that of SF 16. It probably makes no sense for very old computers.

Yesterday I tested the previous network (dimension 2560) against Stockfish dev 160723 (dimension 2048). I implemented dimension 2560 in my current private engine "Smile".

A quick test at level 10 minutes + 0.1s and Ponder ON (important for me) gave the following result:

The new network was better. I haven't tested the very latest size yet (but soon), but it seems that the development of nets increases the playing strength at higher thinking times!

Test games in PGN:
https://pixeldrain.com/u/u1mFDg4u

Eduard · Post by **Eduard** » Tue Jul 18, 2023 4:25 pm

My private testing has shown that the Dimension 2560 network is stronger than the current SF Dimension 2048 network (no bullet) and also stronger than the latest (test-only) Dimension 3072 network.

The latest official SF dev version uses Dimension 2048.

Here is the latest SF 180723 dev with dimension 2560, in my opinion currently the best stockfish:

Download BMI2 and avx2:
https://pixeldrain.com/u/K8R6jnYE

Have fun testing.

syzygy · Post by **syzygy** » Thu Jul 20, 2023 9:27 pm

amchess wrote: ↑Sat Jul 15, 2023 9:33 am
The current Stockfish handicap mode is ridiculous: random errors, more frequent the lower the Elo. Instead, a real handicap mode should simulate the thinking system of a certain Elo range, which is impossible now for Stockfish.

SF's handicap mode never worked by disabling evaluation features.

I didn't know that anyone ever considered the handcrafted evaluation functions to be "human-like" or to lead to "human-like" play. It was always typical computer play, just getting better and better every year.

If the NNUE evaluation function leads to improved play that violates old strategic principles, then that is great for students of the game, who can start rewriting the text books.

amchess · Post by **amchess** » Fri Jul 21, 2023 8:46 pm

syzygy wrote: ↑Thu Jul 20, 2023 9:27 pm
amchess wrote: ↑Sat Jul 15, 2023 9:33 am
The current Stockfish handicap mode is ridiculous: random errors, more frequent the lower the Elo. Instead, a real handicap mode should simulate the thinking system of a certain Elo range, which is impossible now for Stockfish.

SF's handicap mode never worked by disabling evaluation features.

I didn't know that anyone ever considered the handcrafted evaluation functions to be "human-like" or to lead to "human-like" play. It was always typical computer play, just getting better and better every year.

If the NNUE evaluation function leads to improved play that violates old strategic principles, then that is great for students of the game, who can start rewriting the text books.

syzygy · Post by **syzygy** » Sat Jul 22, 2023 1:02 am

amchess wrote: ↑Fri Jul 21, 2023 8:46 pm
syzygy wrote: ↑Thu Jul 20, 2023 9:27 pm
amchess wrote: ↑Sat Jul 15, 2023 9:33 am
The current Stockfish handicap mode is ridiculous: random errors, more frequent the lower the Elo. Instead, a real handicap mode should simulate the thinking system of a certain Elo range, which is impossible now for Stockfish.

SF's handicap mode never worked by disabling evaluation features.

I didn't know that anyone ever considered the handcrafted evaluation functions to be "human-like" or to lead to "human-like" play. It was always typical computer play, just getting better and better every year.

If the NNUE evaluation function leads to improved play that violates old strategic principles, then that is great for students of the game, who can start rewriting the text books.
The classical eval has positional factors that correspond to, for example, the Steinitz elements:
<< "| Material | " << Term(MATERIAL)
<< "| Imbalance | " << Term(IMBALANCE)
<< "| Pawns | " << Term(PAWN)
<< "| Knights | " << Term(KNIGHT)
<< "| Bishops | " << Term(BISHOP)
<< "| Rooks | " << Term(ROOK)
<< "| Queens | " << Term(QUEEN)
<< "| Mobility | " << Term(MOBILITY)
<< "|King safety | " << Term(KING)
<< "| Threats | " << Term(THREAT)
<< "| Passed | " << Term(PASSED)
<< "| Space | " << Term(SPACE)
<< "| Winnable | " << Term(WINNABLE)
(Winnable is the initiative in human terms.)
If you turn those elements on or off based on the playing level, you can simulate a handicapped thinking system.

Ok, I understand what you mean. But this is not how SF's handicap system ever worked.

Moreover, a human player can't make use of the NNUE net simply because it is too strong for him.
Conversely, if he compares it with the classical eval (roughly 200 Elo weaker), the differences let him discover interesting patterns.
I know a lot of GMs, and they agree this method is very useful for them.

But they can still use their old engines. If you don't care about SF's strength (or even dislike it), why upgrade?
And the source code of the old versions remains available, so it can always be recompiled for new CPUs if that is ever needed.

Anyway, if SF's handicap system is indeed broken, then it would be good if someone could figure out a better way. I don't think NNUE prevents that.

amchess · Post by **amchess** » Sat Jul 22, 2023 1:37 am

The problem is just that!
I don't dispute the use of a pure NNUE approach if the goal is just to increase Elo. But why eliminate the classical evaluation function, which was simply an extra option, when Stockfish's handicap mode was kept even though it proved totally inadequate?
In terms of playing strength, it is a simple "if", so basically no functional change...
Why do I have to go back to an old engine to use the classical evaluation function? I could use the same tool...
