Chess Engine Leptir

gordonr · Post by **gordonr** » Sat Mar 18, 2023 10:45 pm

Eduard wrote: ↑Sat Mar 18, 2023 2:05 pm Unbelievable!

I added a position to my test set that was criticized here.

This (position 110):

1r3rk1/1bqnbpp1/p2ppn1B/1p6/4PP1Q/PNNB4/1PP3PP/1K1R3R b - - 0 16

I just received a game played by Eman 8.81 on a supercomputer with 1500 threads! And what happened? It's hard to believe, but Eman played the immediately losing move 18...Nxf4 {3.38/33 33} after 33s.

[pgn][Event "Rated game, 5 min"]
[Site "Engine Room"]
[Date "2023.03.18"]
[Round "?"]
[White "K-999, Stockfish dev-20230"]
[Black "CLUSTER, EMAN 8.81 CLUSTER 6"]
[Result "1-0"]
[ECO "B97"]
[WhiteElo "2942"]
[BlackElo "2941"]
[Annotator "5.27;0.01"]
[PlyCount "147"]
[EventDate "2023.03.18"]
[SourceTitle "playchess.com"]
[TimeControl "300"]

{Stockfish dev-20230314-f0556dcb (36 cores): 37.5 plies; 39,297kN/s Intel(R) Xeon(R) CPU E5-2686 v3 @ 2.00GHz 1995MHz, (36 cores, 72 threads), Shortbook 1.2.ctg, 0 MB} 1. e4 {B 1} c5 {B 0} 2. Nf3 {B 0} d6 {B 0} 3. d4 {B 0} cxd4 {B 0} 4. Nxd4 {B 0} Nf6 {B 0} 5. Nc3 {B 0} a6 {B 0} 6. Bg5 {B 0} e6 {B 0} 7. f4 {B 0} Qb6 {B 0} 8. Nb3 {B 0} Nbd7 {0.01/33 6} 9. Qe2 {B 0 (Qd2)} Qc7 {-0.09/39 5} 10. O-O-O {B 0 (a3)} b5 {0.00/45 28} 11. a3 {B 0} Be7 {0.01/39 18} 12. Kb1 {B 0} Rb8 {0.00/36 3} 13. Qe1 {B 0} Bb7 {0.00/46 17} 14. Bd3 {B 0} O-O {0.00/40 2} 15. Qh4 {B 0 (Rf1)} h6 {0.00/42 4} 16. Bxh6 {B 0} gxh6 {0.00/46 3} 17. g4 {B 0 (Qxh6)} Nd5 {-1.81/29 6} 18. Qxh6 {B 0 (g5)} Nxf4 {3.38/33 33} 19. e5 {B 0} Nxd3 {3.48/31 8} 20. Rxd3 {B 0} Bg2 {3.54/30 9} 21. exd6 {B 0} Bxd6 {3.59/33 9} 22. Qg5+ {B 0} Kh7 {3.77/28 11} 23. Rhd1 {B 0} b4 {3.70/28 4} 24. Rxd6 {B 0} bxc3 {3.61/25 4} 25. Rxd7 {B 0} Qxh2 {3.71/27 7} 26. Qf6 {B 0} Kg8 {3.67/26 7} 27. Qxc3 {B 0} Qh8 {3.92/27 14} 28. Qe3 {B 0 (R7d4)} Rb5 {3.27/23 2} 29. Rg1 {B 0} Re5 {3.91/27 10} 30. Qf2 {B 0} Bd5 {4.47/27 7} 31. Nc5 {B 0} Qh3 {4.53/28 6} 32. Rd6 {B 0} Qe3 {4.67/27 4} 33. Qxe3 {B 0} Rxe3 {4.54/23 1} 34. Nd7 {B 0} Rc8 {4.70/27 3} 35. Nf6+ {B 0} Kg7 {4.81/20 0} 36. Nxd5 {B 0} exd5 {4.80/22 0} 37. Rg2 {B 0} a5 {4.73/31 2} 38. Rxd5 {5.27/39 22} a4 {5.70/35 0} 39. Rd4 {5.49/33 9} Ra8 {6.14/31 0} 40. Rf2 {5.60/35 9} Re7 {6.19/31 0} 41. Rff4 {5.66/37 7} Rea7 {6.19/28 0} 42. Rb4 {5.70/45 9 (Ka2)} Kg6 {6.19/19 1} 43. Rb6+ {5.78/31 7} Kg7 {6.52/29 0} 44. c3 {5.80/61 15} Kg8 {6.40/34 0} 45. Ka2 {6.00/35 26} Kf8 {6.57/31 0} 46. Rb5 {6.13/29 6 (Rd4)} Re8 {6.44/21 1} 47. Rbb4 {6.25/28 5} Rea8 {6.32/24 0} 48. c4 {6.59/29 13 (Rf5)} Rc8 {7.06/20 1} 49. Rb5 {6.76/26 5} Ke7 {7.11/26 0} 50. c5 {6.95/26 5 (Rbf5)} f6 {6.60/18 1} 51. Rc4 {7.25/26 5 (Rbb4)} Kd7 {5.95/14 0} 52. Rb6 {7.52/25 5 (Rbb4)} Rc6 {5.81/12 0} 53. Rd4+ {198.57/25 23 (Rbb4)} Kc7 {5.88/13 0} 54. Rxc6+ {199.15/30 4 (Rf4)} Kxc6 {4.99/14 0} 55. 
Rd6+ {199.47/33 7} Kxc5 {7.43/30 0} 56. Rxf6 {199.57/33 3} Rg7 {8.32/37 0} 57. Rf5+ {199.65/34 3 (Rf4)} Kb6 {7.41/14 0} 58. g5 {199.72/48 3 (Rf4)} Kc7 {4.62/9 0} 59. Rf4 {199.84/74 5 (Ra5)} Rxg5 {9.45/13 0} 60. Rxa4 {199.99/78 24} Kb6 {13.75/44 0} 61. Rc4 {200.00/75 6 (Rb4+)} Re5 {9.51/12 0} 62. Rb4+ {200.00/71 2 (a4)} Kc5 {8.50/12 0} 63. Rf4 {200.00/77 1} Re1 {13.75/36 0} 64. a4 {200.00/41 2} Re3 {13.75/36 0} 65. b3 {200.00/65 1} Kb6 {13.75/39 0} 66. Ka3 {200.00/74 1 (Rb4+)} Rxb3+ {7.79/15 0} 67. Kxb3 {#10/72 2} Ka6 {87.95/18 0} 68. Rb4 {#7/245 1 (Rf6+)} Ka7 {15.17/17 0} 69. Kc4 {#6/245 0} Ka6 {#5/48 0} 70. Kc5 {#5/245 0} Ka7 {#4/248 0} 71. Rb6 {#4/245 0} Ka8 {#3/248 0} 72. Kc6 {#3/245 0} Ka7 {#2/248 0} 73. Kc7 {#2/245 0} Ka8 {#1/248 0} 74. Ra6# {#1/245 0} 1-0[/pgn]

Things like this happen on the server again and again.

However, I believe that Eman Cluster was compiled or configured incorrectly. Playing something like this with 1500 threads is amazingly bad.

The useful test position is not the one you gave, i.e.

[d] 1r3rk1/1bqnbpp1/p2ppn1B/1p6/4PP1Q/PNNB4/1PP3PP/1K1R3R b - - 0 16

since gxh6 is not the losing move.

But based on your game, this is the relevant test position where Black goes wrong:

[d] 1r3rk1/1bqnbp2/p2ppn1p/1p6/4PPPQ/PNNB4/1PP4P/1K1R3R b - g3 0 17

Test that the losing game move 17...Nd5?? is not played. Something like Rfd8 should be chosen instead (there may be other candidates).
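An "avoid move" test like this is usually written as an EPD record with the `am` opcode. Here is a minimal sketch in plain Python, assuming the engine's reply is available as a SAN string; the helper names and the `id` text are made up for illustration, and the parser is deliberately naive (it would break on a semicolon inside a quoted operand).

```python
# Position after 17. g4 from the game, Black to move.
FEN = "1r3rk1/1bqnbp2/p2ppn1p/1p6/4PPPQ/PNNB4/1PP4P/1K1R3R b - g3 0 17"


def fen_to_epd(fen: str, avoid_moves: list[str], test_id: str) -> str:
    """Build an EPD record with an 'am' (avoid move) operation from a FEN."""
    fields = fen.split()
    if len(fields) != 6:
        raise ValueError("expected a 6-field FEN")
    # EPD keeps only the first four FEN fields:
    # piece placement, side to move, castling rights, en passant square.
    position = " ".join(fields[:4])
    return f'{position} am {" ".join(avoid_moves)}; id "{test_id}";'


def parse_avoid_moves(epd: str) -> set[str]:
    """Collect the moves listed under 'am' (naive split on ';')."""
    parts = epd.split(None, 4)  # four position fields, then the operations
    operations = parts[4] if len(parts) > 4 else ""
    avoid: set[str] = set()
    for op in operations.split(";"):
        tokens = op.split()
        if tokens and tokens[0] == "am":
            avoid.update(tokens[1:])
    return avoid


def passes_avoid_test(epd: str, engine_move: str) -> bool:
    """True if the engine's SAN move is not one of the moves to avoid."""
    san = engine_move.rstrip("+#!?")  # drop check marks and annotations
    return san not in parse_avoid_moves(epd)


epd = fen_to_epd(FEN, ["Nd5"], "after 17.g4")
print(epd)
print(passes_avoid_test(epd, "Rfd8"))   # a move other than Nd5
print(passes_avoid_test(epd, "Nd5??"))  # the game move
```

The test only checks that Nd5 is avoided; it says nothing about which of the remaining moves is best.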

Eduard · Post by **Eduard** » Sun Mar 19, 2023 2:40 pm

Hello folks,
As you know, I don't tune my engine for Bum Bum Bullet, nor for Blitz. Now that I've done some bullet-level testing of my own, I am even more convinced that I should stay on this course. However, I have decided not to offer any more engines for public download.

Reason: Blitz and bullet results give the impression that the engine with the better bullet and blitz results is the one that plays better.

My engine Leptir Analyzer plays weaker than Stockfish dev in Blitz, but only by about 2 to 5 Elo. This is a great result, because in position tests this engine is clearly better than Stockfish dev. However, many critical practical positions are only solved by Leptir Analyzer after more than 30s. If I only tested with 10s/move (that's about the average time per move in blitz games), then Leptir Analyzer wouldn't be any better than other engines; if anything, it would be weaker. Blitz games show that too!

Anyone who tests this engine in blitz games will not see its real strength.
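To put a gap of 2 to 5 Elo into perspective, here is a small sketch using the standard logistic Elo formula. It is only the textbook conversion between rating difference and expected score, not the calculation any particular GUI performs; the function names are my own.

```python
import math


def expected_score(elo_diff: float) -> float:
    """Expected score (0..1) for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score: float) -> float:
    """Inverse of expected_score: rating difference implied by a match score."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


for deficit in (2, 5):
    pct = 100.0 * expected_score(-deficit)
    print(f"{deficit} Elo weaker -> expected score {pct:.2f}%")
```

A 5 Elo deficit corresponds to an expected score of roughly 49.3%, i.e. well under one point per hundred games.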

I have now carried out another experiment. Leptir N1 (see test results above) has been reprogrammed. I made the pruning a bit sharper and changed the piece values. The rest is identical to Leptir N1. The result is Charisma 180323. This engine does not use the green NNUE network but "nn-c232c4319bdd.nnue", which is tactically better in my opinion. In the position test, the engine solves 96 positions; Stockfish dev solves 91.

Then I started a quick test against Stockfish dev 140323.

Ryzen 3900X
3 cores/engine, 7500 kns in the start position
GUI Powerfritz 18
Hash 128 MB
all 3-4-5-6-men Syzygy, GUI and engine
Ponder ON
Time control 60s + 0.1s
Noomen 3move book

The result after 282 games:

Download all games:
https://pixeldrain.com/u/kUAWKpvT

Such tunings are still useless to me. Anyone who believes that engine A is better than engine B only because it won a bullet test is welcome to use such engines. But I'm absolutely sure that it doesn't matter which engine you use in server blitz games. Whether you use Leptir Analyzer or Charisma or Stockfish dev, it won't matter. The opening theory or the book you use is more important!

During the tests with the 3move book, I noticed that it is always the same variations that lose. These are variations of the French, Caro-Kann, Pirc, Modern and King's Indian.

This is also well known from server play.

Here is the losing variation in which, in the Charisma vs Stockfish match, both engines lost with the black pieces:

N2k1bnr/pp3ppp/2n1p3/3pPb2/q1pP4/4BN2/P1P1BPPP/1R2QRK1 b - - 0 13

I have seen this position on the board many times myself, in my own games or in my friends' games. I have added 7 games to my little match book:

And who plays this variation? Stockfish or the clones, with both white and black. And what's the point? The point is that in blitz and bullet games it is sometimes just a coincidence that exactly this line is played. Sometimes a different line is chosen, but only by chance and not because it was recognized as better! And as I said before: it's always the same variations. Everything else ends in a draw!

jdart · Post by **jdart** » Sun Mar 19, 2023 3:16 pm

I find it ridiculous that operators of online computer accounts care about getting and keeping a high rating.
For example, there's a high-rated account now on FICS that will not play accounts rated over 2600. It only plays lower rated accounts.
The ratings are meaningless, because accounts are playing against a small pool of players, and can be manipulated by restricting that pool even further. They also are not comparable to FIDE ratings or comparable across servers. It's just a number.

Eelco de Groot · Post by **Eelco de Groot** » Sun Mar 19, 2023 3:30 pm

jdart wrote: ↑Sun Mar 19, 2023 3:16 pm I find it ridiculous that operators of online computer accounts care about getting and keeping a high rating.
For example, there's a high-rated account now on FICS that will not play accounts rated over 2600. It only plays lower rated accounts.
The ratings are meaningless, because accounts are playing against a small pool of players, and can be manipulated by restricting that pool even further. They also are not comparable to FIDE ratings or comparable across servers. It's just a number.

I was just thinking that, to sum up Eduard's argument, it seems to me he is just saying that Elo is meaningless under certain circumstances. And this is just another example of that. I have not followed any of the discussion so far, so this is just a remark from the sidelines. Thanks, as Chessfun would say!

Eduard · Post by **Eduard** » Sun Mar 19, 2023 4:34 pm

The new star on the PlayChess server is currently Engine Dark SisTer 4.6. A player who has 3 accounts has been playing with it for a few days. Is he playing? Of course not. With 3 accounts that won a total of 7 times against a Raspberry Pi 3, he was pushed above 3000 Elo in Blitz. He hasn't played blitz for days (only 16 min games now) and is at the top of the table, basking in the sun.

I wouldn't say anything, since that's normal on this server, but is it normal when all 3 of these accounts belong to a well-known correspondence chess grandmaster? Yes, you read that correctly: a correspondence chess grandmaster who boasts of having beaten a 50 kns machine (Raspberry Pi 3 with Stockfish) 7 times; I counted all his recent wins. Now even correspondence chess GMs are crazy about online blitz Elo. I was not expecting that, oh no!

CornfedForever · Post by **CornfedForever** » Sun Mar 19, 2023 5:25 pm

Eduard wrote: ↑Sun Mar 19, 2023 4:34 pm The new star on the PlayChess server is currently Engine Dark SisTer 4.6. A player who has 3 accounts has been playing with it for a few days. Is he playing? Of course not. With 3 accounts that won a total of 7 times against a Raspberry Pi 3, he was pushed above 3000 Elo in Blitz. He hasn't played blitz for days (only 16 min games now) and is at the top of the table, basking in the sun.

I wouldn't say anything, since that's normal on this server, but is it normal when all 3 of these accounts belong to a well-known correspondence chess grandmaster? Yes, you read that correctly: a correspondence chess grandmaster who boasts of having beaten a 50 kns machine (Raspberry Pi 3 with Stockfish) 7 times; I counted all his recent wins. Now even correspondence chess GMs are crazy about online blitz Elo. I was not expecting that, oh no!

The phrase 'Correspondence GM' is an anachronism - has been for...20 or more years. It's time we put up a marker and threw some dirt over it.

In its place is an ego-driven exercise pumped up by technology that can easily boost anyone's 'rating' beyond the ICCF 2398 rating I achieved some 25 yrs ago. People...seem to need to feel good about themselves and are perfectly okay with letting technology do it for them. Welcome to 2023.

Eduard · Post by **Eduard** » Sun Mar 19, 2023 6:18 pm

The GM I'm talking about isn't just anyone. He is Vice European Champion, and more.

CornfedForever · Post by **CornfedForever** » Mon Mar 20, 2023 1:33 am

Eduard wrote: ↑Sun Mar 19, 2023 6:18 pm The GM I'm talking about isn't just anyone. He is Vice European Champion, and more.

Sarana? I didn't know he played 'correspondence'.

Graham Banks · Post by **Graham Banks** » Mon Mar 20, 2023 1:43 am

CornfedForever wrote: ↑Sun Mar 19, 2023 5:25 pm The phrase 'Correspondence GM' is an anachronism - has been for...20 or more years. It's time we put up a marker and threw some dirt over it.

In its place is an ego-driven exercise pumped up by technology that can easily boost anyone's 'rating' beyond the ICCF 2398 rating I achieved some 25 yrs ago. People...seem to need to feel good about themselves and are perfectly okay with letting technology do it for them. Welcome to 2023.

Indeed.
I'm glad that I stopped playing correspondence chess in 1996.
I just can't understand what personal satisfaction one could derive from using chess engines to help them.

Eduard · Post by **Eduard** » Tue Mar 21, 2023 5:49 pm

I will now test my engines in three ways.

1. With my position test 2023 (I may add more positions during the year, as well as corrections).
2. On the server with live games.
3. With offline games.

For my offline games I have selected 100 variations that also occur in practice:

Download EN-Tournament (PGN and CBH):
https://pixeldrain.com/u/L8P38FTU

These are variations that fit quite well. If someone sees it differently, it would be nice if you said why. If anyone has other variations that are good for ENG matches and are used in practice, it would be nice if you posted them here. Thanks.

I will have the future clones that I use on the server play these 100 variations against Stockfish dev.

I chose a time control of 120s + 0.1s. With Ponder ON (important for me) and 7500 kns/engine, one run takes about 10 hours. With longer games it takes much longer, and with shorter times I could include more variations, but less than 120s is of no interest to me.
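As a rough plausibility check on the 10-hour figure, here is a back-of-the-envelope sketch. The inputs are assumptions, not measurements from the match: each of the 100 variations is played with both colors (200 games), games are played one at a time, the average game uses about 180s of total clock time, and a long game lasts 60 moves per side.

```python
def max_game_seconds(base_s: float, inc_s: float, moves_per_side: int) -> float:
    """Upper bound on total clock time for one game (both sides use all their time)."""
    return 2 * (base_s + inc_s * moves_per_side)


def run_hours(games: int, avg_game_seconds: float) -> float:
    """Duration of a run in hours when games are played one after another."""
    return games * avg_game_seconds / 3600


GAMES = 200          # assumed: 100 variations, both colors
AVG_GAME_S = 180     # assumed average total clock time per game
LONG_GAME_MOVES = 60 # assumed length of a long game, moves per side

cap = max_game_seconds(120, 0.1, LONG_GAME_MOVES)
print(f"cap per 60-move game: {cap:.0f}s")
print(f"run at {AVG_GAME_S}s/game: {run_hours(GAMES, AVG_GAME_S):.1f} h")
print(f"run if every game hit the cap: {run_hours(GAMES, cap):.1f} h")
```

Under these assumptions the run comes to 10 hours at the average and 14 hours if every game used its full time, so the stated duration is consistent with games that finish somewhat before the clocks run out.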

Ryzen 3900X
3 cores/engine, 7500 kns in the start position
GUI Powerfritz 18
Hash 128 MB (or 256 MB)
all 3-4-5-6-men Syzygy, GUI and engine
Ponder ON
Time control 120s + 0.1s
EN-Tournament (100 Variants)

There is currently a match going on between Charisma 180323 and Stockfish dev 140323, and after 120 games the score is level. Charisma 180323 took 3rd place in my position test (unofficially) with 96 solved positions. My goal is to create a clone that is better than Stockfish dev in position tests and can hold its own against it in the ENG match!
