I'm disappointed with Stockfish dev.

Eduard · Post by **Eduard** » Wed Mar 08, 2023 6:51 am

How should I explain this?

For analysis I use several machines. In Freestyle, for example, I used a friend's engine to analyze with 64 cores. The engine was connected to my PC via the ChessBase Cloud. One or more engines were running in parallel on my PC. Which engine I use where depends on the type of position that has arisen (tactical/positional). For endgames with access to databases, I use a fast-searching engine, because such an engine is better in endgames. It's good to have some favorite engines. Stockfish dev is not one of them for me.

Eduard · Post by **Eduard** » Wed Mar 08, 2023 8:09 am

On my homepage are now 4 of my favorite engines that I would use in tournaments, of course with engine updates (e.g. Freestyle, I hope there will be another tournament). My main engine will be Leptir Analyzer, which I will run on 64 cores.

Ipman recently tested this engine and emailed me the results yesterday:
---------------------------------

First test using TC: 10s+1s

My Tournament  2023
                                 
1   Stockfish 15.1 avx512     +5  +46/=923/-31 50.75%  507.5/1000
2   Leptir Analyzer avx512    -5  +31/=923/-46 49.25%  492.5/1000

Second TC: 5m+0s

My Tournament  2023
                                 
1   Stockfish 15.1 avx512     +1  +22/=960/-18 50.20%  502.0/1000
2   Leptir Analyzer avx512    -1  +18/=960/-22 49.80%  498.0/1000

----------------------------------

Leptir Analyzer has the slowest search of any Stockfish clone engine I know. It is intended for fast computers and long analyses. Such a result at such short time controls is fantastic.

As a second engine, I would initially use Blue Marlin 15.6. If the position is positional, I would switch to Corchess. For positions with an endgame character (with access to Syzygy) I would take Leptir Speed and finish the game with it.

DrEinstein · Post by **DrEinstein** » Wed Mar 08, 2023 11:24 am

Eduard wrote: ↑Wed Mar 08, 2023 8:09 am On my homepage are now 4 of my favorite engines that I would use in tournaments, of course with engine updates (e.g. Freestyle, I hope there will be another tournament). My main engine will be Leptir Analyzer, which I will run on 64 cores.

Ipman recently tested this engine and emailed me the results yesterday:
---------------------------------
First test using TC: 10s+1s

My Tournament  2023
                                 
1   Stockfish 15.1 avx512     +5  +46/=923/-31 50.75%  507.5/1000
2   Leptir Analyzer avx512    -5  +31/=923/-46 49.25%  492.5/1000

Second TC: 5m+0s

My Tournament  2023
                                 
1   Stockfish 15.1 avx512     +1  +22/=960/-18 50.20%  502.0/1000
2   Leptir Analyzer avx512    -1  +18/=960/-22 49.80%  498.0/1000
----------------------------------

Leptir Analyzer has the slowest search of any Stockfish clone engine I know. It is intended for fast computers and long analyses. Such a result at such short time controls is fantastic.

As a second engine, I would initially use Blue Marlin 15.6. If the position is positional, I would switch to Corchess. For positions with an endgame character (with access to Syzygy) I would take Leptir Speed and finish the game with it.

That doesn't look bad for Leptir. However, I have some questions and remarks.
Why Stockfish 15.1 and not Stockfish dev (on which Leptir is based)?
For both tournaments the error bars are larger than the gap. On your hardware, many more games (e.g. at STC) could be played in a reasonable time!
What do you mean by "slowest search"? The lowest nps value or the longest time to (a certain) depth?
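
To put rough numbers on the error-bar remark, here is a minimal Python sketch. It assumes a plain normal approximation on the per-game score and the usual logistic Elo model; the function name is made up for illustration, and this is not necessarily how the tournament tool computed its own figures.

```python
import math

def elo_with_interval(wins, draws, losses, z=1.96):
    """Return (elo, elo_low, elo_high) for a match result.

    Normal approximation on the per-game score; z=1.96 gives roughly
    a 95% interval. A simplification, not any particular tool's method.
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win=1, draw=0.5, loss=0).
    var = (wins + 0.25 * draws) / n - score ** 2
    se = math.sqrt(var / n)

    def to_elo(s):
        # Logistic Elo model: s = 1 / (1 + 10^(-elo/400))
        return 400 * math.log10(s / (1 - s))

    return to_elo(score), to_elo(score - z * se), to_elo(score + z * se)

# The two matches quoted above, from Stockfish 15.1's point of view.
for label, wdl in [("10s+1s", (46, 923, 31)), ("5m+0s", (22, 960, 18))]:
    elo, low, high = elo_with_interval(*wdl)
    print(f"{label}: {elo:+.1f} Elo, interval [{low:+.1f}, {high:+.1f}]")
```

Under these assumptions both intervals straddle zero, which is what "error bars larger than the gap" means here.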

Eduard · Post by **Eduard** » Wed Mar 08, 2023 1:01 pm

I didn't know that he tested my engine and why in this way and not otherwise.

Here is his homepage:
https://ipmanchess.yolasite.com/

The search means how quickly the engine reaches a certain search depth.

By the way, the day before yesterday, Leptir Analyzer took first place in the Blitz tournament on PlayChess, by points, with only 4 cores and about 3000 kns. 17 rounds without a loss, one win.

DrEinstein · Post by **DrEinstein** » Wed Mar 08, 2023 1:48 pm

Eduard wrote: ↑Wed Mar 08, 2023 1:01 pm I didn't know that he tested my engine and why in this way and not otherwise.

Here is his homepage:
https://ipmanchess.yolasite.com/

The search means how quickly the engine reaches a certain search depth.

By the way, the day before yesterday, Leptir Analyzer took first place in the Blitz tournament on PlayChess, by points, with only 4 cores and about 3000 kns. 17 rounds without a loss, one win.

It's obvious that Leptir has a wide search and is thus good for analysis, where time doesn't play a major role, and for solving test suites. The longer time to depth, however, should probably result in worse game play. The two tournaments above have error bars that are too large, so they don't tell us much!

Again, why don't you run an STC tournament with 15 cores and stop it when the LOS is very close to 1? I would really like to know how much Elo Leptir is losing in a head-to-head match vs. Stockfish dev! If it's 5 to 10 Elo, who cares.
I'm afraid that no one will ever run this test. Both sides seem to be afraid that a statistically sound result will not meet their expectations. And I mean the Elo difference; that LOS=1 in favour of SFdev should be clear to everyone, I hope.
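
For reference, LOS (likelihood of superiority) is commonly approximated from the decisive games alone. A minimal sketch, assuming the usual normal approximation LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))); the function name is only for illustration:

```python
import math

def los(wins, losses):
    """Approximate likelihood of superiority; draws do not enter the formula."""
    decisive = wins + losses
    if decisive == 0:
        # No decisive games: nothing to tell the engines apart.
        return 0.5
    return 0.5 * (1 + math.erf((wins - losses) / math.sqrt(2 * decisive)))

# The two matches quoted above, from Stockfish 15.1's point of view.
print(f"10s+1s: LOS = {los(46, 31):.3f}")
print(f"5m+0s:  LOS = {los(22, 18):.3f}")
```

On those 1000-game matches neither value gets close to 1, so a match stopped on LOS would have to run considerably longer.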

Werewolf · Post by **Werewolf** » Wed Mar 08, 2023 6:59 pm

Magnum wrote: ↑Mon Mar 06, 2023 10:59 am
Eduard wrote: ↑Fri Feb 10, 2023 1:53 pm I'm disappointed with Stockfish dev!
Donate Frontier.
https://www.top500.org/
It should speed up the Stockfish development a little bit.
Probably 500 elo after 1 day.

Nothing like that. There isn't 500 Elo left in chess.
All you'd get is a lot of self-play games giving a low error margin. But improvement doesn't come from Frontier directly but from good patches. It would take many, many months to reach 500 Elo, if there were that much headroom left.
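
As a rough illustration of how the error margin shrinks with more games, here is a back-of-the-envelope Python sketch. It assumes a normal approximation and a per-game score standard deviation of about 0.1385, which is what the 10s+1s result quoted earlier in the thread (+46/=923/-31) implies; the function name is hypothetical.

```python
import math

def games_needed(elo_margin, per_game_sd, z=1.96):
    """Games needed so the ~95% interval is about +/- elo_margin Elo.

    Valid only near a 50% score, where one Elo point corresponds to
    roughly ln(10)/1600 of score.
    """
    score_margin = elo_margin * math.log(10) / 1600
    return math.ceil((z * per_game_sd / score_margin) ** 2)

for margin in (6, 2, 1):
    print(f"+/-{margin} Elo: about {games_needed(margin, 0.1385)} games")
```

The count grows with the inverse square of the margin, so halving the error bar takes about four times as many games.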

Eduard · Post by **Eduard** » Wed Mar 08, 2023 7:58 pm

DrEinstein wrote: ↑Wed Mar 08, 2023 1:48 pm
Eduard wrote: ↑Wed Mar 08, 2023 1:01 pm I didn't know that he tested my engine and why in this way and not otherwise.

Here is his homepage:
https://ipmanchess.yolasite.com/

The search means how quickly the engine reaches a certain search depth.

By the way, the day before yesterday, Leptir Analyzer took first place in the Blitz tournament on PlayChess, by points, with only 4 cores and about 3000 kns. 17 rounds without a loss, one win.
It's obvious that Leptir has a wide search and is thus good for analysis, where time doesn't play a major role, and for solving test suites. The longer time to depth, however, should probably result in worse game play. The two tournaments above have error bars that are too large, so they don't tell us much!

Again, why don't you run an STC tournament with 15 cores and stop it when the LOS is very close to 1? I would really like to know how much Elo Leptir is losing in a head-to-head match vs. Stockfish dev! If it's 5 to 10 Elo, who cares.
I'm afraid that no one will ever run this test. Both sides seem to be afraid that a statistically sound result will not meet their expectations. And I mean the Elo difference; that LOS=1 in favour of SFdev should be clear to everyone, I hope.

You live in a world where only statistics count. Can you still enjoy individual nice games and nice analyses?

The Ipman test is enough for me. It shows that the engine can play well even with short time controls.
In my EN test 2022, it is the best engine.

It's good on the server too, almost unbelievable! The day before yesterday a shared first place, and today the sole winner (the engine Piranha was eaten!)


The hardware of SUPERCOMPUTER (from a game against me):
EMAN 8.70 CLUSTER 64-bit AVX2 1,016,613kN/s 12 x AMD EPYC 7B12+7662 64-Core Processor 2250MHz, (768 cores, 1536 threads)

Leptir Analyzer hardware (from a game against me, he is a friend of mine and plays with my books):
Leptir Analyzer-avx2 (4 cores): 32.3 plies; 2,723kN/s Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz 2594MHz, (4 cores, 8 threads)

The best 10:

CornfedForever · Post by **CornfedForever** » Wed Mar 08, 2023 8:55 pm

Eduard wrote: ↑Wed Mar 08, 2023 6:51 am How should I explain this?

For analysis I use several machines. In Freestyle, for example, I used a friend's engine to analyze with 64 cores. The engine was connected to my PC via the ChessBase Cloud. One or more engines were running in parallel on my PC. Which engine I use where depends on the type of position that has arisen (tactical/positional). For endgames with access to databases, I use a fast-searching engine, because such an engine is better in endgames. It's good to have some favorite engines. Stockfish dev is not one of them for me.

But I did not ask about 'freestyle' (which is playing, if I understand right). I was just talking about 'for analysis'... post-opening, pre-tablebase endings.

syzygy · Post by **syzygy** » Wed Mar 08, 2023 9:34 pm

CornfedForever wrote: ↑Tue Mar 07, 2023 4:02 am
Eduard wrote: ↑Mon Mar 06, 2023 12:25 pm

A total of 73 parameters were changed here. Known parameters that are constantly changing. Let's see when one of these parameters will be changed again? It won't take too long.

And they wonder why I question how they can know which changes actually resulted in a positive change and which result in a negative change.

No, they see Dunning-Kruger at work.

CornfedForever · Post by **CornfedForever** » Wed Mar 08, 2023 10:06 pm

syzygy wrote: ↑Wed Mar 08, 2023 9:34 pm
CornfedForever wrote: ↑Tue Mar 07, 2023 4:02 am
Eduard wrote: ↑Mon Mar 06, 2023 12:25 pm

A total of 73 parameters were changed here. Known parameters that are constantly changing. Let's see when one of these parameters will be changed again? It won't take too long.

And they wonder why I question how they can know which changes actually resulted in a positive change and which result in a negative change.
No, they see Dunning-Kruger at work.

Enough with what is essentially name-calling rather than an argument. I'm talking about the data, and about not knowing with any real certainty how you get to a +Elo or a -Elo (outside of tolerance) because so much is tested together. I mean... if every patch were a positive, SF would be increasing in strength every week. It is not.
