Stockfish 070324 a Disaster

Guenther · Post by **Guenther** » Mon Mar 11, 2024 12:03 am

Eduard wrote: ↑Sun Mar 10, 2024 8:52 pm Then they should do this, and create their own club of SF programmers
...

Not SF programmers, just programmers ;)

chessica · Post by **chessica** » Mon Mar 11, 2024 1:38 pm

Eduard wrote: ↑Mon Mar 11, 2024 12:00 am
gordonr wrote: ↑Sun Mar 10, 2024 11:39 pm
Eduard wrote: ↑Sun Mar 10, 2024 11:03 pm I'm not ungrateful. But if I have legitimate criticism, I should still be allowed to say so. Who are the SF developers developing the Stockfish engine for? Actually open source for the whole world, right? Because the project is open source. Who will thank me if I point out weaknesses in the engine so that it can improve? By the way, without users everything is useless. If there are no SF users, the project could be closed. It's always a give and take. Or has that changed now and the developers are now the gods to whom you can only say thank you, always thank you?
Legitimate criticism? You talk about wanting an engine for correspondence analysis but yet post an initial test position and complain that it required 48s. Is that really too long for correspondence? Not only that, you fail to highlight that Stockfish will immediately show a winning move for that position (e.g. Bxa7) but you're not happy because it's not the winning move that you want to see. Again, for the purposes of correspondence play, Stockfish is displaying a winning PV from the start. It is not losing a won position.

These are just two examples from my test set; they are well-known positions that some users will recognize. There are more positions, and I'm not talking only about correspondence chess but also about human tournament chess. Both.

I'm telling you seriously: I don't want to use this engine to analyze my own tournament games! Never, be sure. If this were the only Stockfish in the world, I would definitely change provider!

Are you really sure you need to post your banter here? Nobody here really cares. Complain where you were kicked out. You have been like that since I met you, since cs. 40 years like that. This is getting on my nerves!

Eduard · Post by **Eduard** » Mon Mar 11, 2024 4:32 pm

I don't want to be seen as just a chatterbox. I've done a lot of testing in the last few months, not just with analyses. I play on PlayChess and Lichess with my own engines. Some of my friends play with my compiles and even use them for correspondence chess at the highest level! However, I don't enjoy it anymore. Every time I see a new network from the developers, I am happy. Unfortunately, in the last two months I almost ALWAYS had to change something so that I could use the new SF engine for myself. But I don't feel like it anymore! I am frustrated!

I did it like this once today so that people wouldn't say I just talk and don't do anything.

I just made a simple attempt to change some parameters so that some positions can be solved better, such as the following. I ONLY changed the following parameters in the code of SF 070324. Because I use the larger network, I implemented the network with dimension 3072.

Network Dimension 3072 "nn-3182bacdfd54.nnue"

evaluate.cpp:
bool smallNet = std::abs(simpleEval) > 1250;
bool psqtOnly = std::abs(simpleEval) > 4600;
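
A minimal sketch of what these two thresholds decide, as I read the snippet: the cheap material-based estimate `simpleEval` picks which evaluation path is used. Only the values 1250 and 4600 and the names `simpleEval`, `smallNet` and `psqtOnly` come from the post; `EvalPath` and `choose_eval_path` are hypothetical names for illustration, not Stockfish's actual API.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical illustration type: which evaluation path gets used.
enum class EvalPath { BigNet, SmallNet, SmallNetPsqtOnly };

// Threshold defaults are the values quoted in the post above.
// The function itself is an assumption about how the two flags combine.
inline EvalPath choose_eval_path(int simpleEval,
                                 int smallNetThreshold = 1250,
                                 int psqtOnlyThreshold = 4600) {
    // The PSQT-only shortcut is a special case of the small-net path,
    // so its threshold should not be below the small-net threshold.
    assert(smallNetThreshold <= psqtOnlyThreshold);

    const int  imbalance = std::abs(simpleEval);
    const bool smallNet  = imbalance > smallNetThreshold;
    const bool psqtOnly  = imbalance > psqtOnlyThreshold;

    if (psqtOnly)
        return EvalPath::SmallNetPsqtOnly;  // very lopsided position
    if (smallNet)
        return EvalPath::SmallNet;          // clearly unbalanced position
    return EvalPath::BigNet;                // roughly balanced: full network
}
```

Raising both thresholds, as in the patch, means more positions stay on the big network, which would be expected to cost speed in exchange for evaluation quality in unbalanced positions.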

In the test of the two positions that I criticized in Stockfish 070324, the new engine Big-SF 110324 analyzed them like this:

[d]1B1r4/rp2npkp/2b1pbp1/1qp5/nPN1R3/1P1P1QP1/2P2PBP/5R1K w - - 0 1

Analysis by Big-SF 110324-avx2:

1.Qxf6+ Kxf6 2.Be5+ Kg5 3.Bg7 Bxe4 4.f4+ Kh5 5.Bxe4 g5 6.Ne5 Qc6 7.g4+ Kh4 8.Bf6 h6 9.fxg5 Nd5 10.Nxc6 Nxf6 11.Nxd8 Nxe4 12.dxe4 hxg5 13.bxa4 Rxa4 14.bxc5 Rxe4 15.Rxf7 Kxg4 16.Rf2
Depth: 23/41 00:00:02 17809kN, tb=35

[d]1r3rk1/3bbppp/1qn2P2/p2pP1P1/3P4/2PB1N2/6K1/qNBQ1R2 w - - 0 1

Analysis by Big-SF 110324-avx2:

1.Bxh7+ Kxh7 2.Rh1+ Kg8 3.Rh8+ Kxh8 4.Qh1+ Kg8 5.g6 fxg6 6.Ng5 Qbb2+ 7.Bxb2 Qxb2+ 8.Kg3 Qf2+ 9.Kxf2 Rxf6+ 10.exf6 Bxf6 11.Qxd5+ Kh8 12.Nf7+ Kh7 13.Qxd7 Rb2+ 14.Ke3 Rxb1 15.Qxc6
Depth: 22/38 00:00:03 40903kN, tb=409

And what does it look like in bullet? How much stronger is Stockfish 070324? It's not stronger, it's even weaker! Not at bullet 60s, however, but at 120s + 1s, and not on one core but on 4 cores with Ponder ON (that's how we play in online tournaments); the developers test with Ponder OFF.

Ryzen 3900X with 24 Threads
GUI Fritz 19
8 threads/Engine
all 3456men Syzygy
Hash 256 MB/Engine
Ponder ON
Book EN-Select with color swap

Standings after 94 games:

Ryzen 3900X, Blitz 2.0min+1.0sec

1   Big-SF 110324-avx2      +15  +9/=80/-5 52.13%   49.0/94
2   Stockfish 070324-avx2   -15  +5/=80/-9 47.87%   45.0/94
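
To put a rough error margin on a result like this, the Elo difference and a 95% interval can be estimated from the win/draw/loss counts. The sketch below uses the usual logistic Elo formula and a normal approximation, and treats games as independent (which a color-swapped book only approximately satisfies). `MatchStats`, `elo_from_score` and `match_stats` are hypothetical helper names, not part of any GUI or engine.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical result container for this illustration.
struct MatchStats {
    double score;    // fraction of points scored, 0..1
    double elo;      // Elo difference implied by the score
    double eloLow;   // lower end of the ~95% interval
    double eloHigh;  // upper end of the ~95% interval
};

// Logistic Elo model: score = 1 / (1 + 10^(-elo/400)).
inline double elo_from_score(double score) {
    const double eps = 1e-9;  // avoid log of 0 or infinity at 0% / 100%
    const double s = std::min(std::max(score, eps), 1.0 - eps);
    return -400.0 * std::log10(1.0 / s - 1.0);
}

inline MatchStats match_stats(int wins, int draws, int losses) {
    const int games = wins + draws + losses;
    assert(games > 0);
    const double n = static_cast<double>(games);
    const double score = (wins + 0.5 * draws) / n;

    // Per-game variance of the result (1, 0.5 or 0) around the mean score.
    const double var = (wins   * std::pow(1.0 - score, 2) +
                        draws  * std::pow(0.5 - score, 2) +
                        losses * std::pow(0.0 - score, 2)) / n;
    const double se = std::sqrt(var / n);  // standard error of the mean score
    const double z  = 1.96;                // ~95% under the normal approximation

    return MatchStats{score,
                      elo_from_score(score),
                      elo_from_score(score - z * se),
                      elo_from_score(score + z * se)};
}
```

Called as `match_stats(9, 80, 5)`, this gives a score of about 52.1% and roughly +15 Elo, with a 95% interval of roughly -12 to +42 Elo. Under these assumptions, 94 games alone do not separate the two engines.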

All games:
https://pixeldrain.com/u/iyMHRtv8

Download BIG-SF 110324 with dimension 3072 (avx2, avx512, bmi2, sse41, source code):
https://pixeldrain.com/u/Zf4gX2wa

Test it yourself! Compiles are with external networks.

Eduard · Post by **Eduard** » Mon Mar 11, 2024 5:04 pm

By the way: I don't want to say that the changed parameters are optimal. But they are a first step in the right direction. They can be improved even more.

Peter Berger · Post by **Peter Berger** » Mon Mar 11, 2024 5:10 pm

I don't think you realize how strange all your messages look.

I understand that you are a Stockfish cloner who is serious about his job. To summarize: you claim that you make a few changes to Stockfish, and then it is better at solving test positions without losing playing strength. Also, you excel at making improvements that are especially good at longer time controls.

How about some evidence that you are actually capable of doing that? You don't have to provide statistical significance (although you seem to have quite a lot of CPU power available to you), just enough data to get serious people interested and involved who'll do that for you.

Instead all we get is useless forum fights and engines whose names I can never even remember, as they change faster than others change their underwear. So I get frustrated reading your input, although it might be interesting in principle, who knows ...

Eduard · Post by **Eduard** » Mon Mar 11, 2024 5:25 pm

Peter Berger wrote: ↑Mon Mar 11, 2024 5:10 pm I don't think you realize how strange all your messages look.

I understand that you are a Stockfish cloner who is serious about his job. To summarize: you claim that you make a few changes to Stockfish, and then it is better at solving test positions without losing playing strength. Also, you excel at making improvements that are especially good at longer time controls.

How about some evidence that you are actually capable of doing that? You don't have to provide statistical significance (although you seem to have quite a lot of CPU power available to you), just enough data to get serious people interested and involved who'll do that for you.

Instead all we get is useless forum fights and engines whose names I can never even remember, as they change faster than others change their underwear. So I get frustrated reading your input, although it might be interesting in principle, who knows ...

What proof should I provide, and of what? Is this about me or about engines? I have just presented an engine here that I consider to be better overall than the developers' engine from 070324. Prove to me that this engine is worse. Show me things where Stockfish 070324 is better. Then we compare better + worse together and summarize the entire result. But you obviously don't care about objective things (otherwise you would test first before you write). Don't you understand? It is not about me. Test everything, then compare it, and then write here that I'm showing crap. Please not before, thank you!

Peter Berger · Post by **Peter Berger** » Mon Mar 11, 2024 5:34 pm

Eduard wrote: ↑Mon Mar 11, 2024 5:25 pm Prove to me that this engine is worse. Show me things where Stockfish 070324 is better.

Nah, that's not how things are supposed to work. Even I can do a Stockfish clone (although I have no interesting idea available right now and I am not even a programmer) that could behave somewhat differently, and where it would be very hard work to prove that it is really clearly WORSE.

The burden of proof certainly is on you here, Eduard. Please don't get me wrong. I am on your side, I'd love to see you integrated in some great development scheme that may lead to even more interesting and stronger chess engines.

Modern Times · Post by **Modern Times** » Mon Mar 11, 2024 6:40 pm

Eduard wrote: ↑Mon Mar 11, 2024 5:25 pm I have just presented an engine here that I consider to be better overall than the engine developed by the developers from 070324. Prove to me that this engine is worse.

It is up to you to prove that it is better, if that is what you are claiming.

Dann Corbit · Post by **Dann Corbit** » Mon Mar 11, 2024 7:49 pm

I do not know if it is better at game play.
I do know that Leptir is better at problem solving.

chessica · Post by **chessica** » Mon Mar 11, 2024 10:00 pm

In my MEA test it came in third place ahead of Stockfish...
