Stockfish and Tactics

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

Werewolf
Posts: 2039
Joined: Thu Sep 18, 2008 10:24 pm

Re: Stockfish and Tactics

Post by Werewolf »

smatovic wrote: Thu Nov 30, 2023 1:12 pm
Werewolf wrote: Thu Nov 30, 2023 12:40 pm ...
Isn't that what I explained in the first post?
....
Ahh, then you already have your answer.

I did not look into the SF Cluster code regarding LazyMPI; according to Peter Österlund's post on "Lazy Cluster", two different implementations are possible.

--
Srdja

Sadly I have no way of verifying Elo except by renting the cluster and playing 1000 games. My guess is that if the search is wider, as suspected, the Elo gain from the additional cores will be pretty low.
Maybe by the time you get to a crazy high core count it might reach +50 Elo, but I'm guessing.
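
Just to put numbers on how coarse a 1000-game sample is, here is a minimal back-of-the-envelope sketch (the win/draw/loss counts are invented for illustration, and it uses the plain logistic Elo model, nothing cluster-specific):

[code]
import math

def to_elo(score):
    """Convert an average game score into an Elo difference (logistic model)."""
    score = min(max(score, 1e-6), 1.0 - 1e-6)        # guard against 0% / 100%
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_interval(wins, draws, losses, z=1.96):
    """Point estimate plus a rough ~95% interval from one match sample."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score, then standard error of the mean.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)

# Invented 1000-game sample with a Stockfish-like draw rate:
est, low, high = elo_with_interval(wins=180, draws=700, losses=120)
print(f"Elo {est:+.1f}, ~95% interval {low:+.1f} .. {high:+.1f}")
[/code]

With those invented counts the interval is roughly ±12 Elo around the estimate, which is why a modest gain from extra cluster nodes is hard to pin down in 1000 games.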
smatovic
Posts: 3331
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Stockfish and Tactics

Post by smatovic »

Werewolf wrote: Fri Dec 01, 2023 10:24 am ...
Sadly I have no way of verifying Elo except by renting the cluster and playing 1000 games. My guess is that if the search is wider, as suspected, the Elo gain from the additional cores will be pretty low.
I think I agree, because team SF does Elo-optimized development via their testing framework.
Werewolf wrote: Fri Dec 01, 2023 10:24 am Maybe by the time you get to a crazy high core count it might reach +50 Elo, but I'm guessing.
That is the question: how much can SF still gain from the standard opening position? I don't know.

Here is a test run of the cluster version against master, but that was pre-NNUE, in 2019:
https://github.com/official-stockfish/S ... /pull/1931

--
Srdja
Uri Blass
Posts: 10895
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish and Tactics

Post by Uri Blass »

Ciekce wrote: Tue Nov 28, 2023 10:29 pm sf's development is focused on elo gain, testsuites are completely irrelevant for modern engine dev

if a change makes stockfish """tactically weaker""" but better at actually winning games, that change still gets merged

also depth is a meaningless metric that cannot be compared between engines
They are focused on Elo with a biased book of positions that you never get in normal games.

It is not the same as Elo, and I can imagine that some change can help with a biased book but be counterproductive in normal chess, where you never get the positions that you get with a biased book.
smatovic
Posts: 3331
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Stockfish and Tactics

Post by smatovic »

Uri Blass wrote: Fri Dec 01, 2023 1:40 pm
Ciekce wrote: Tue Nov 28, 2023 10:29 pm sf's development is focused on elo gain, testsuites are completely irrelevant for modern engine dev

if a change makes stockfish """tactically weaker""" but better at actually winning games, that change still gets merged

also depth is a meaningless metric that cannot be compared between engines
They are focused on Elo with a biased book of positions that you never get in normal games.

It is not the same as Elo, and I can imagine that some change can help with a biased book but be counterproductive in normal chess, where you never get the positions that you get with a biased book.
+1

"We" already started another level of computer chess by using unbalanced opening books in tournaments and testing.

--
Srdja
Ciekce
Posts: 197
Joined: Sun Oct 30, 2022 5:26 pm
Full name: Conor Anstey

Re: Stockfish and Tactics

Post by Ciekce »

Uri Blass wrote: Fri Dec 01, 2023 1:40 pm They are focused on Elo with a biased book of positions that you never get in normal games.

It is not the same as Elo, and I can imagine that some change can help with a biased book but be counterproductive in normal chess, where you never get the positions that you get with a biased book.
you can't test changes to strong engines with balanced books, and I have yet to see any evidence that unbalanced books harm performance in balanced openings

the book that sf tests with is a collection of *human* openings from real games, so you can of course get them in real games

and as the post above me mentions, most actual testing involves unbalanced books because otherwise you just get a billion draws - modern top engines are too strong to lose with any regularity from drawn positions
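
To put the "billion draws" point in numbers, a deliberately crude sketch (my own toy model, not how fishtest actually measures anything): if some fraction of games from balanced openings is drawn no matter who is stronger, the games needed to resolve a fixed Elo edge blow up as the draw rate rises.

[code]
import math

def games_to_resolve(elo_diff, dead_draw_rate, z=1.96):
    """Toy model only (not fishtest's SPRT / pentanomial statistics): assume a
    fraction `dead_draw_rate` of games is drawn regardless of strength, so only
    the remaining games carry the Elo signal.  Returns a rough game count for
    that signal to clear a ~95% noise band."""
    signal = (1.0 - dead_draw_rate) * elo_diff * math.log(10) / 1600  # score offset
    sigma = 0.5 * math.sqrt(1.0 - dead_draw_rate)                     # per-game std. dev.
    return math.ceil((z * sigma / signal) ** 2)

# Required games grow roughly like 1 / (1 - draw rate), so pushing the draw
# rate down with unbalanced openings buys measurement resolution directly.
for d in (0.70, 0.90, 0.97):
    print(f"dead-draw rate {d:.0%}: ~{games_to_resolve(5, d):,} games for a 5 Elo edge")
[/code]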
Uri Blass
Posts: 10895
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish and Tactics

Post by Uri Blass »

Ciekce wrote: Fri Dec 01, 2023 9:08 pm
Uri Blass wrote: Fri Dec 01, 2023 1:40 pm They are focused on Elo with a biased book of positions that you never get in normal games.

It is not the same as Elo, and I can imagine that some change can help with a biased book but be counterproductive in normal chess, where you never get the positions that you get with a biased book.
you can't test changes to strong engines with balanced books, and I have yet to see any evidence that unbalanced books harm performance in balanced openings

the book that sf tests with is a collection of *human* openings from real games, so you can of course get them in real games

and as the post above me mentions, most actual testing involves unbalanced books because otherwise you just get a billion draws - modern top engines are too strong to lose with any regularity from drawn positions
1) If modern engines are too strong to lose from drawn positions, then it means that chess engines have practically solved chess, but people disagree about it and claim they did not solve chess.

2) Even if we assume that Stockfish weakly solved chess, you can still test Stockfish against weaker engines with normal books, to see if some patch helps to get better results against weaker engines like RubiChess.
syzygy
Posts: 5730
Joined: Tue Feb 28, 2012 11:56 pm

Re: Stockfish and Tactics

Post by syzygy »

Uri Blass wrote: Fri Dec 01, 2023 10:01 pm 1) If modern engines are too strong to lose from drawn positions, then it means that chess engines have practically solved chess, but people disagree about it and claim they did not solve chess.
Because "too strong to lose from drawn positions" is 100% unrelated to whether chess has been solved or not. The same holds even if you meant to say "too strong to lose from the initial position".

It is not necessary to wait the trillion years it might take for someone to solve chess before a developer can decide to test an engine on unbalanced openings.
Uri Blass
Posts: 10895
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish and Tactics

Post by Uri Blass »

syzygy wrote: Sat Dec 02, 2023 12:07 am
Uri Blass wrote: Fri Dec 01, 2023 10:01 pm 1) If modern engines are too strong to lose from drawn positions, then it means that chess engines have practically solved chess, but people disagree about it and claim they did not solve chess.
Because "too strong to lose from drawn positions" is 100% unrelated to whether chess has been solved or not. The same holds even if you meant to say "too strong to lose from the initial position".

It is not necessary to wait the trillion years it might take for someone to solve chess before a developer can decide to test an engine on unbalanced openings.
If the target is to improve strength in chess, then I think you need to prove that you achieve the target by testing the engine in chess.

In the past, before 29.6.2023, the Stockfish team tested changes with an unbalanced book but at least also did some regression tests with a balanced book, to show a smaller improvement with a balanced book (that is the relevant test to support improvement in games from the opening position).

Today I do not even see that, and all the tests in the following link are with an unbalanced book.

https://github.com/official-stockfish/S ... sion-Tests

See also the Previous Testing Criteria, where I see that they used the
8moves_v3.pgn opening book (2013-11-09 - 2023-06-29).

I am not even sure whether 8moves_v3.pgn includes only positions that can happen today in games of top engines with no book.
Earlier they used the 8moves_GM.pgn opening book (2013-04-10 - 2013-11-01), whose name suggests it contains openings that can happen practically in GM games.
Uri Blass
Posts: 10895
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Stockfish and Tactics

Post by Uri Blass »

I decided to download the positions that Stockfish is using for tests and looked at the first 2 positions in the EPD file.

I find that the first 2 positions happened in human-human games on Lichess, but I do not see that they happened at the highest level (they are not in the Lichess masters database, and also not in the Lichess database when I look at average rating 2500, even though there are more than 8 million games with an average rating of 2500 on Lichess, even if I ignore ultrabullet and bullet).
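
For anyone who wants to repeat that check, a small sketch along these lines should work against the public Lichess opening explorer API (endpoints and parameters as documented by Lichess, but treat the details here as assumptions; "book.epd" is only a placeholder for the downloaded fishtest book):

[code]
import json
import urllib.parse
import urllib.request

# Public Lichess opening explorer: "masters" is the OTB master database,
# "lichess" the online games database.
EXPLORER = "https://explorer.lichess.ovh"

def epd_to_fen(epd_line):
    """EPD keeps only the first four FEN fields, so pad the move counters."""
    return " ".join(epd_line.split()[:4]) + " 0 1"

def explorer_counts(database, fen, **params):
    """Return (white wins, draws, black wins) game counts for a position."""
    query = urllib.parse.urlencode({"fen": fen, **params})
    with urllib.request.urlopen(f"{EXPLORER}/{database}?{query}") as resp:
        data = json.load(resp)
    return data["white"], data["draws"], data["black"]

with open("book.epd") as handle:                     # placeholder file name
    for line in list(handle)[:2]:                    # first two book positions
        fen = epd_to_fen(line)
        print(fen)
        print("  masters      :", explorer_counts("masters", fen))
        print("  lichess 2500+:", explorer_counts("lichess", fen,
                                                  ratings="2500",
                                                  speeds="blitz,rapid,classical"))
[/code]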
Frank Quisinsky
Posts: 7053
Joined: Wed Nov 18, 2009 7:16 pm
Location: Gutweiler, Germany
Full name: Frank Quisinsky

Re: Stockfish and Tactics

Post by Frank Quisinsky »

Hi Uri,

I think the solution is not to test all the "super strong engines" with unbalanced opening books. The idea was good, but it has nothing to do with real chess.

In my humble opinion there are other things to do:

- Create a test with weaker engines (more of the strong HCE engines). The effect will be better than unbalanced opening books. And the advantage: enough engines with completely different playing styles are available.

- Create a test with much smaller balanced opening books. For "FCP-Tourney-2024-MA" I created a new book with balanced FEOBOS positions at a depth of 6 moves ... but the paths to all 500 ECO codes remain possible / open ... and maybe the engines can find great new variations in well-known opening systems. I did the same for the FCP-Rating-List (which ended in the year 2016).

---

Readme from the opening book:

No magic!
feobos-6m-v1.0.bkt (opening book for Shredder GUI by Stefan Meyer-Kahlen)

- collected 40,372 games: 1-0 games (below move number 75 to mate) and 0-1 games (below move number 90 to mate)
games played with feobos-20.1-contempt_3-5_tuned-v2.bkt
games are from FCP-Tourney-2020, FCP-Tourney-2021, FCP-Tourney-2022, KI-Ratings, FCP-Tourney-2024 (round 01-10)
- games with an eval higher than +0.70 (White) / lower than -0.90 (Black) deleted.
Tool: universal-pgn-epd-tool v2 by Ferdinand Mosca
- truncated to 6 moves
Tool: truncate by Norm Pollock
- game results set to 1/2-1/2, various other optimizations (have a look at the *.pgn file)

final result = 21,264 lines (without duplicates)
I hope the opening book reduces the draws by 2-3%!

---
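
As a rough illustration of the pipeline described in the readme above (my own sketch with python-chess, not Frank's actual tools; the file name is a placeholder and the eval-based filter is left out), something like this goes from a PGN of decisive games to a deduplicated 6-move book:

[code]
import chess
import chess.pgn

MAX_PLIES_WHITE_WIN = 2 * 75     # "1-0 games (below move number 75 to mate)"
MAX_PLIES_BLACK_WIN = 2 * 90     # "0-1 games (below move number 90 to mate)"
BOOK_PLIES = 2 * 6               # truncate to 6 full moves

def book_positions(pgn_path):
    """Yield unique 6-move opening positions (EPD) from decisive, short-enough
    games; the eval filter would need evaluations stored in the PGN comments."""
    seen = set()
    with open(pgn_path, encoding="utf-8", errors="ignore") as handle:
        while (game := chess.pgn.read_game(handle)) is not None:
            result = game.headers.get("Result", "*")
            moves = list(game.mainline_moves())
            if result not in ("1-0", "0-1") or len(moves) < BOOK_PLIES:
                continue
            if result == "1-0" and len(moves) > MAX_PLIES_WHITE_WIN:
                continue
            if result == "0-1" and len(moves) > MAX_PLIES_BLACK_WIN:
                continue
            board = chess.Board()
            for move in moves[:BOOK_PLIES]:
                board.push(move)
            epd = board.epd()
            if epd not in seen:                      # "without doubles"
                seen.add(epd)
                yield epd

for epd in book_positions("games.pgn"):              # placeholder input file
    print(epd)
[/code]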

But the biggest problem is not all the draws!

The biggest problem is the very high average game length for about 25% of the "super-strong engines", for various reasons ... no Syzygy support (and the bad-bishop endgames never end with hidden contempt settings). This is not very nice when you watch the games ... and various other reasons!

Besides, I am very sure that with longer time controls the differences are clearer and the draw rate is lower.

Best
Frank

PS: The book I will use in the future for testing engines can be found in my FCP-Tourney-2024 download file (with *.pgn):

https://www.amateurschach.de/download/_ ... y-2024.zip
I have been working on the new conditions for two weeks!