I'm disappointed with Stockfish dev.

Uri Blass
Posts: 10905
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: I'm disappointed with Stockfish dev.

Post by Uri Blass »

Chessqueen wrote: Sat Feb 11, 2023 5:27 am
[pgn][Event "Computer chess game"]
[Site "DESKTOP-OFQ3C0P"]
[Date "2023.02.10"]
[Round "?"]
[White "Stockfish_23020319_x64_avx2 "]
[Black "Stockfish_23020319_x64_avx2 "]
[Result "*"]
[BlackElo "3535"]
[Time "22:26:00"]
[WhiteElo "3535"]
[TimeControl "0+7"]
[SetUp "1"]
[FEN "1r3rk1/1bqnbpp1/p2ppn1B/1p6/4PP1Q/PNNB4/1PP3PP/1K1R3R b - - 0 16"]
[Termination "unterminated"]
[PlyCount "0"]
[WhiteType "program"]
[BlackType "human"]



30...Qb5 31.Qf3 Kg7 32.Rxe6 Rxe6 33.Nf5+ Kf7 34.Qh3 Kg8 35.Nxd4 Qe8 36.Nxe6 Qxe6 37.f4 a5 38.f5 Qe8 39.Qh4 Qf7 40.Qf2 Rd8 41.Ra1 b3 42.Rb1 a4 43.cxb3 axb3 44.Qe3 gxf5 45.gxf5 Kh8 46.d4 Qc4 47.Rxb3 Qf1+ 48.Qg1 Qf4 49.Qe1 Rxd4 50.Rb8+ Kg7 51.Qg3+ Qxg3 52.hxg3 Rxe4 53.Rc8 Rc4 54.Rxc7+ Kh6 55.Rc8 Kg5 56.Kg2 Kxf5 57.Kh3 Ke5 58.c6 Kd6 59.c7 Rxc7 60.Ra8 Ke6 61.Kg2 Kf7 62.Kf3 Rc3+ 63.Kf2
= (0.07 --) Depth: 46/66 00:00:35 698MN, tb=47011
The position is equal[/pgn]
Your line is wrong.

In 30...Qb5 31.Qf3 Kg7, 32.Rxe6?? is a mistake.

Correct is 32.e5 Qxa6 33.Qxf6+ Kh6 34.g5+ Kh5 35.Qf3+ Kh4,
and White can choose between 36.Qe4+ Kh5 37.h3 and 36.h3.
Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: I'm disappointed with Stockfish dev.

Post by Chessqueen »

Uri Blass wrote: Sat Feb 11, 2023 6:21 am
Chessqueen wrote: Sat Feb 11, 2023 5:27 am
[pgn][Event "Computer chess game"]
[Site "DESKTOP-OFQ3C0P"]
[Date "2023.02.10"]
[Round "?"]
[White "Stockfish_23020319_x64_avx2 "]
[Black "Stockfish_23020319_x64_avx2 "]
[Result "*"]
[BlackElo "3535"]
[Time "22:26:00"]
[WhiteElo "3535"]
[TimeControl "0+7"]
[SetUp "1"]
[FEN "1r3rk1/1bqnbpp1/p2ppn1B/1p6/4PP1Q/PNNB4/1PP3PP/1K1R3R b - - 0 16"]
[Termination "unterminated"]
[PlyCount "0"]
[WhiteType "program"]
[BlackType "human"][/pgn]

Your line is wrong.

In 30...Qb5 31.Qf3 Kg7, 32.Rxe6?? is a mistake.

Correct is 32.e5 Qxa6 33.Qxf6+ Kh6 34.g5+ Kh5 35.Qf3+ Kh4,
and White can choose between 36.Qe4+ Kh5 37.h3 and 36.h3.
I only placed [pgn] in front of the first line enclosed in brackets so that the diagram can be shown, as requested by Eduard :roll:
Eelco de Groot
Posts: 4676
Joined: Sun Mar 12, 2006 2:40 am
Full name: Eelco de Groot

Re: I'm disappointed with Stockfish dev.

Post by Eelco de Groot »

I read here https://github.com/official-stockfish/S ... 3f680e0677
Created by retraining the master net on a dataset composed of:
* Most of the previous best dataset filtered to remove positions likely having only one good move
* Adding training data from Leela T77 dec2021 rescored with 16tb of 7-piece tablebases
So I think it is no wonder that Eduard is disappointed: with this filter, the builders of this net throw out all the 'good test positions', the singular or critical positions where there is only one good move. You then have to rely entirely on your search for those, but apparently they are so atypical that they just confuse the net training? I have no real explanation. It is much like throwing out all tactical positions, and apparently that works too.
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: I'm disappointed with Stockfish dev.

Post by dkappe »

Eelco de Groot wrote: Mon Feb 13, 2023 10:51 pm I read here https://github.com/official-stockfish/S ... 3f680e0677
Created by retraining the master net on a dataset composed of:
* Most of the previous best dataset filtered to remove positions likely having only one good move
* Adding training data from Leela T77 dec2021 rescored with 16tb of 7-piece tablebases
So I think it is no wonder that Eduard is disappointed: with this filter, the builders of this net throw out all the 'good test positions', the singular or critical positions where there is only one good move. You then have to rely entirely on your search for those, but apparently they are so atypical that they just confuse the net training? I have no real explanation. It is much like throwing out all tactical positions, and apparently that works too.
But the search means that positions many plies deeper are evaluated. Generally, you don’t want to train on positions that poorly predict the game outcome. So, for example, anything that would get swept up in a qsearch (quiescence search) makes for poor training data. But qsearch is expensive, so lots of heuristics have to be tried. The Stockfish team has been experimenting with all manner of filtering. At least for some time, this will be the way forward.
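
A minimal sketch of the kind of filtering described above, assuming each candidate position has already been annotated with a few flags. The `Position` fields, `is_quiet`, `filter_training_data`, and the 60-centipawn threshold are hypothetical names and values chosen for illustration; this is not the filter the Stockfish team actually uses.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Position:
    """Hypothetical record for one candidate training position."""
    fen: str
    in_check: bool              # side to move is in check
    best_move_is_capture: bool  # the searched best move is a capture
    static_eval_cp: int         # evaluation without search, in centipawns
    search_eval_cp: int         # evaluation after a shallow search, in centipawns


def is_quiet(pos: Position, max_eval_gap_cp: int = 60) -> bool:
    """Cheap stand-in for 'would not be swept up in a qsearch'.

    The threshold is an assumption, not a tuned value.
    """
    if pos.in_check or pos.best_move_is_capture:
        return False
    # A large gap between static and searched eval suggests pending tactics.
    return abs(pos.static_eval_cp - pos.search_eval_cp) <= max_eval_gap_cp


def filter_training_data(positions: Iterable[Position]) -> List[Position]:
    """Keep only positions that pass the quietness heuristic."""
    return [p for p in positions if is_quiet(p)]
```

The point of the heuristic is cost: the three checks are far cheaper than running a real quiescence search on every candidate position.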
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: I'm disappointed with Stockfish dev.

Post by Eduard »

There are now 6 new nets from Linrock, and each network is said to be better than the previous one. That's a lot of Elo. But can someone show me where the progress is in practical positions? What can the new network do better than the old one (before Linrock)? I cannot find it.

If you play Stockfish on the server without a book (I'll do a new experiment), Stockfish dev will play most of the moves that are in books. I have seen that.

Example: if an opponent plays 1.e4, Stockfish will respond with e6, e5 or c5. After 1...e5, the Open Defense with Nxe4, a Marshall Attack or a Berlin Defense is very likely. After 1...c5, the Najdorf system is very likely to be played.

Aside from consuming a lot of time without a book, Stockfish dev just follows well-known theory paths, playing far into the middlegame!

The old network can do that just as well as the new one, with the difference that it is tactically stronger. So the new net doesn't bring me any advantage if I play normal chess. Sure, at bullet (10s+0.1s) everything is a different game, especially when strange, exotic short variations are to be played.

Just who needs that? In practice nobody!

So I am happy about the progress of some other engines. Komodo Dragon plays so well with my tournament book on PlayChess that it is almost impossible to beat (I don't have Dragon myself, but a friend does). Lc0 plays so strongly with my Lc0 tournament book that you can reach well over 3000 server Elo, and first place, with an RTX 2080. Even Ethereal 14 is difficult to defeat. I would not have thought that.

Nice to see this progress in other engines.
carldaman
Posts: 2287
Joined: Sat Jun 02, 2012 2:13 am

Re: I'm disappointed with Stockfish dev.

Post by carldaman »

Hi Eduard,

Which SF NNUE network do you recommend (instead of the linrock ones)?
What network is Charisma using?

Thanks! :D
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: I'm disappointed with Stockfish dev.

Post by Sopel »

Eduard wrote: Tue Feb 14, 2023 8:12 am There are now 6 new nets from Linrock, and each network is said to be better than the previous one. That's a lot of Elo. But can someone show me where the progress is in practical positions? What can the new network do better than the old one (before Linrock)? I cannot find it.
You willingly deny facts and then ask us to provide more facts. How do we know the new facts will not be denied, too?

Edit: just in case it's unclear, what you're trying to do is make us succumb to selection bias. This forum could use a few lessons about it, actually.
syzygy
Posts: 5780
Joined: Tue Feb 28, 2012 11:56 pm

Re: I'm disappointed with Stockfish dev.

Post by syzygy »

Eduard wrote: Fri Feb 10, 2023 4:15 pm I started cloning because I wasn't satisfied with Stockfish in my analysis.
Then start developing and contributing, not renaming the engine and editing your name into it.
Last edited by syzygy on Sat Feb 18, 2023 6:51 pm, edited 1 time in total.
syzygy
Posts: 5780
Joined: Tue Feb 28, 2012 11:56 pm

Re: I'm disappointed with Stockfish dev.

Post by syzygy »

Eduard wrote: Tue Feb 14, 2023 8:12 am But: Can someone show me where the progress is based on practical positions? What can the new network do better than the old one (before Linrock)?
It has been known since forever that engine progress cannot be measured on single positions. You need to play games, MANY MANY games.
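
To illustrate why so many games are needed, here is a rough sketch that turns a win/draw/loss count into an Elo difference with an approximate 95% margin, using the standard logistic Elo formula and a normal approximation. The function names are made up for this example, and this is not how any particular testing framework computes its bounds.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert a score fraction to an Elo difference (logistic model)."""
    eps = 1e-9
    s = min(max(score, eps), 1.0 - eps)  # clamp to avoid log of 0 or division by 0
    return -400.0 * math.log10(1.0 / s - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, lower, upper) for a match; assumes at least one game.

    z = 1.96 gives roughly a 95% interval under a normal approximation.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    margin = z * math.sqrt(var / n)  # z times the standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - margin),
            elo_from_score(score + margin))
```

With the same proportions of results, `elo_with_margin(30, 45, 25)` gives an interval that still straddles zero, while `elo_with_margin(3000, 4500, 2500)` gives one that lies entirely above zero. The margin shrinks only with the square root of the number of games, which is why small Elo gains need many thousands of games to confirm.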
Uri Blass
Posts: 10905
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: I'm disappointed with Stockfish dev.

Post by Uri Blass »

syzygy wrote: Sat Feb 18, 2023 6:50 pm
Eduard wrote: Tue Feb 14, 2023 8:12 am But: Can someone show me where the progress is based on practical positions? What can the new network do better than the old one (before Linrock)?
It has been known since forever that engine progress cannot be measured on single positions. You need to play games, MANY MANY games.
I agree that you need to play many games to measure progress, but if the engine plays better, then there are positions in which it plays better moves, and Eduard asked for these positions.

Tests are run only at bullet time control and usually with the biased book UHO_XXL_+0.90_+1.19.epd.
Testing patches only with a biased book means that some change may be a regression in positions that are evaluated as close to 0.00, without us even knowing it.

There is an improvement even with a normal book at bullet time control, based on regression tests, but Eduard claims that there is no improvement at time controls longer than bullet.

I do not know if he is right, and the only way to test it is with games at a time control longer than bullet and with a normal book.

Even at bullet, a patch that is no regression with a biased book may be a regression with a normal book, and I do not understand why the Stockfish developers do not also test every passing patch for no regression with a normal book (I can understand not testing for progress with a normal book, because it is logical to think that the rating difference with a normal book is smaller).

I think that if they tested the patches that pass, and only those, for no regression with a normal book at 60+0.6 time control with 8 cores per engine, it might be better for Stockfish's development.