Chess Engine Leptir

gordonr
Posts: 223
Joined: Thu Aug 06, 2009 8:04 pm
Location: UK

Re: Chess Engine Leptir

Post by gordonr »

Eduard wrote: Sat Mar 18, 2023 2:05 pm Unbelievable!

I implemented a POSITION in my test set that was criticized here.

This (position 110):

1r3rk1/1bqnbpp1/p2ppn1B/1p6/4PP1Q/PNNB4/1PP3PP/1K1R3R b - - 0 16

I just got a game played with Eman 8.81 from a supercomputer with 1500 threads! And what happened? It's hard to believe, but Eman played the immediate losing move 18...Nxf4 {3.38/33 33} after 33s.

[pgn][Event "Rated game, 5 min"]
[Site "Engine Room"]
[Date "2023.03.18"]
[Round "?"]
[White "K-999, Stockfish dev-20230"]
[Black "CLUSTER, EMAN 8.81 CLUSTER 6"]
[Result "1-0"]
[ECO "B97"]
[WhiteElo "2942"]
[BlackElo "2941"]
[Annotator "5.27;0.01"]
[PlyCount "147"]
[EventDate "2023.03.18"]
[SourceTitle "playchess.com"]
[TimeControl "300"]

{Stockfish dev-20230314-f0556dcb (36 cores): 37.5 plies; 39,297kN/s Intel(R) Xeon(R) CPU E5-2686 v3 @ 2.00GHz 1995MHz, (36 cores, 72 threads), Shortbook 1.2.ctg, 0 MB} 1. e4 {B 1} c5 {B 0} 2. Nf3 {B 0} d6 {B 0} 3. d4 {B 0} cxd4 {B 0} 4. Nxd4 {B 0} Nf6 {B 0} 5. Nc3 {B 0} a6 {B 0} 6. Bg5 {B 0} e6 {B 0} 7. f4 {B 0} Qb6 {B 0} 8. Nb3 {B 0} Nbd7 {0.01/33 6} 9. Qe2 {B 0 (Qd2)} Qc7 {-0.09/39 5} 10. O-O-O {B 0 (a3)} b5 {0.00/45 28} 11. a3 {B 0} Be7 {0.01/39 18} 12. Kb1 {B 0} Rb8 {0.00/36 3} 13. Qe1 {B 0} Bb7 {0.00/46 17} 14. Bd3 {B 0} O-O {0.00/40 2} 15. Qh4 {B 0 (Rf1)} h6 {0.00/42 4} 16. Bxh6 {B 0} gxh6 {0.00/46 3} 17. g4 {B 0 (Qxh6)} Nd5 {-1.81/29 6} 18. Qxh6 {B 0 (g5)} Nxf4 {3.38/33 33} 19. e5 {B 0} Nxd3 {3.48/31 8} 20. Rxd3 {B 0} Bg2 {3.54/30 9} 21. exd6 {B 0} Bxd6 {3.59/33 9} 22. Qg5+ {B 0} Kh7 {3.77/28 11} 23. Rhd1 {B 0} b4 {3.70/28 4} 24. Rxd6 {B 0} bxc3 {3.61/25 4} 25. Rxd7 {B 0} Qxh2 {3.71/27 7} 26. Qf6 {B 0} Kg8 {3.67/26 7} 27. Qxc3 {B 0} Qh8 {3.92/27 14} 28. Qe3 {B 0 (R7d4)} Rb5 {3.27/23 2} 29. Rg1 {B 0} Re5 {3.91/27 10} 30. Qf2 {B 0} Bd5 {4.47/27 7} 31. Nc5 {B 0} Qh3 {4.53/28 6} 32. Rd6 {B 0} Qe3 {4.67/27 4} 33. Qxe3 {B 0} Rxe3 {4.54/23 1} 34. Nd7 {B 0} Rc8 {4.70/27 3} 35. Nf6+ {B 0} Kg7 {4.81/20 0} 36. Nxd5 {B 0} exd5 {4.80/22 0} 37. Rg2 {B 0} a5 {4.73/31 2} 38. Rxd5 {5.27/39 22} a4 {5.70/35 0} 39. Rd4 {5.49/33 9} Ra8 {6.14/31 0} 40. Rf2 {5.60/35 9} Re7 {6.19/31 0} 41. Rff4 {5.66/37 7} Rea7 {6.19/28 0} 42. Rb4 {5.70/45 9 (Ka2)} Kg6 {6.19/19 1} 43. Rb6+ {5.78/31 7} Kg7 {6.52/29 0} 44. c3 {5.80/61 15} Kg8 {6.40/34 0} 45. Ka2 {6.00/35 26} Kf8 {6.57/31 0} 46. Rb5 {6.13/29 6 (Rd4)} Re8 {6.44/21 1} 47. Rbb4 {6.25/28 5} Rea8 {6.32/24 0} 48. c4 {6.59/29 13 (Rf5)} Rc8 {7.06/20 1} 49. Rb5 {6.76/26 5} Ke7 {7.11/26 0} 50. c5 {6.95/26 5 (Rbf5)} f6 {6.60/18 1} 51. Rc4 {7.25/26 5 (Rbb4)} Kd7 {5.95/14 0} 52. Rb6 {7.52/25 5 (Rbb4)} Rc6 {5.81/12 0} 53. Rd4+ {198.57/25 23 (Rbb4)} Kc7 {5.88/13 0} 54. Rxc6+ {199.15/30 4 (Rf4)} Kxc6 {4.99/14 0} 55. 
Rd6+ {199.47/33 7} Kxc5 {7.43/30 0} 56. Rxf6 {199.57/33 3} Rg7 {8.32/37 0} 57. Rf5+ {199.65/34 3 (Rf4)} Kb6 {7.41/14 0} 58. g5 {199.72/48 3 (Rf4)} Kc7 {4.62/9 0} 59. Rf4 {199.84/74 5 (Ra5)} Rxg5 {9.45/13 0} 60. Rxa4 {199.99/78 24} Kb6 {13.75/44 0} 61. Rc4 {200.00/75 6 (Rb4+)} Re5 {9.51/12 0} 62. Rb4+ {200.00/71 2 (a4)} Kc5 {8.50/12 0} 63. Rf4 {200.00/77 1} Re1 {13.75/36 0} 64. a4 {200.00/41 2} Re3 {13.75/36 0} 65. b3 {200.00/65 1} Kb6 {13.75/39 0} 66. Ka3 {200.00/74 1 (Rb4+)} Rxb3+ {7.79/15 0} 67. Kxb3 {#10/72 2} Ka6 {87.95/18 0} 68. Rb4 {#7/245 1 (Rf6+)} Ka7 {15.17/17 0} 69. Kc4 {#6/245 0} Ka6 {#5/48 0} 70. Kc5 {#5/245 0} Ka7 {#4/248 0} 71. Rb6 {#4/245 0} Ka8 {#3/248 0} 72. Kc6 {#3/245 0} Ka7 {#2/248 0} 73. Kc7 {#2/245 0} Ka8 {#1/248 0} 74. Ra6# {#1/245 0} 1-0[/pgn]

Things like this happen on the server again and again. :-)

However, I believe that Eman Cluster was compiled or configured incorrectly. Playing something like this with 1500 threads is amazingly bad.
The useful test position is not the one you give, i.e.

[d] 1r3rk1/1bqnbpp1/p2ppn1B/1p6/4PP1Q/PNNB4/1PP3PP/1K1R3R b - - 0 16

since gxh6 is not the losing move.

But based on your game, this is the relevant test position where Black goes wrong:

[d] 1r3rk1/1bqnbp2/p2ppn1p/1p6/4PPPQ/PNNB4/1PP4P/1K1R3R b - g3 0 17

Test that the losing game move 17...Nd5?? is not played. Something like Rfd8 should be chosen instead (there may be other candidates).
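
A check like this is easy to automate. Here is a minimal sketch in Python (standard library only) of an avoid-move test. The `SuitePosition` class and the `passes` helper are names made up for illustration, and the engine's reply is passed in as a plain SAN string rather than fetched from a real engine:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuitePosition:
    """One test position: a FEN plus moves to avoid and/or moves to find."""
    fen: str
    avoid_moves: frozenset = frozenset()  # moves that count as a failure
    best_moves: frozenset = frozenset()   # optional: moves that count as a pass


def normalize(san: str) -> str:
    """Drop check, mate and annotation suffixes so 'Nd5??' compares equal to 'Nd5'."""
    return san.strip().rstrip("+#!?")


def passes(position: SuitePosition, engine_move: str) -> bool:
    """Return True if the engine's move is acceptable for this position."""
    move = normalize(engine_move)
    if move in {normalize(m) for m in position.avoid_moves}:
        return False
    if position.best_moves:
        return move in {normalize(m) for m in position.best_moves}
    # Only avoid-moves were given: anything else is accepted.
    return True


# The position after 17. g4 from the game above, with Nd5 as the move to avoid.
position_17 = SuitePosition(
    fen="1r3rk1/1bqnbp2/p2ppn1p/1p6/4PPPQ/PNNB4/1PP4P/1K1R3R b - g3 0 17",
    avoid_moves=frozenset({"Nd5"}),
)

if __name__ == "__main__":
    for candidate in ("Nd5", "Rfd8"):
        verdict = "pass" if passes(position_17, candidate) else "fail"
        print(candidate, verdict)
```

Leaving `best_moves` empty means any move other than Nd5 passes, which matches the idea that there may be several acceptable candidates.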
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Chess Engine Leptir

Post by Eduard »

Hello folks,
As you know, I don't tune my engine for Bum Bum Bullet, nor for Blitz. Now that I've done some bullet-level testing of my own, I feel even more confirmed in carrying on this way. However, I have decided not to offer any more engines for public download.

Reason: Blitz and Bullet results create the impression that whichever engine scores better in Bullet and Blitz is the engine that plays better.

My engine Leptir Analyzer plays weaker than Stockfish dev in Blitz, but only by about 2 to 5 Elo. This is a great result, because in position tests this engine is clearly better than Stockfish dev. However, many critical practical positions are only solved by Leptir Analyzer after more than 30s. If I only tested with 10s/move (that's about the average time per move in blitz games), Leptir Analyzer wouldn't be any better than other engines; it would be rather weaker, and blitz games show that too!

Anyone who tests this engine in blitz games will not discover its real strength.

I have now carried out another experiment. Leptir N1 (see test results above) has been reprogrammed: I made the pruning a bit sharper and changed the piece values. The rest is identical to Leptir N1. The result is Charisma 180323. This engine does not use the green NNUE network but "nn-c232c4319bdd.nnue", which is tactically better in my opinion. In the position test, the engine solves 96 positions, Stockfish dev 91.

Then I started a quick test against Stockfish dev 140323.

Ryzen 3900X
3 cores/engine, 7500 kN/s in the start position
GUI: Powerfritz 18
Hash: 128 MB
All 3-4-5-6-men Syzygy tablebases for GUI and engine
Ponder: ON
Time control: 60s + 0.1s
Noomen 3-move book

The result after 282 games:

Image

Download all games:
https://pixeldrain.com/u/kUAWKpvT
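
To put a tally like this into perspective, here is a minimal sketch in Python of how a win/draw/loss count translates into an Elo difference with a rough 95% error margin. It assumes the usual logistic Elo model and a normal approximation for the score; the function names and the example tally are hypothetical and are not the actual match result:

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model."""
    # Clamp so that 0% or 100% scores do not cause a division by zero.
    score = min(max(score, 1e-9), 1.0 - 1e-9)
    return -400.0 * math.log10(1.0 / score - 1.0)


def match_elo(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) for a match, from the first engine's view.

    z = 1.96 corresponds to an approximate 95% interval.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    std_err = math.sqrt(variance / games)
    return (
        elo_diff(score),
        elo_diff(score - z * std_err),
        elo_diff(score + z * std_err),
    )


if __name__ == "__main__":
    # Hypothetical tally adding up to 282 games; NOT the real result.
    elo, low, high = match_elo(wins=30, draws=222, losses=30)
    print(f"Elo {elo:+.1f}  (95% interval {low:+.1f} .. {high:+.1f})")
```

For a drawish, perfectly level tally like the hypothetical one, the interval still comes out at a little under 20 Elo either side, so under these assumptions a match of this length cannot resolve a difference of 2 to 5 Elo.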

Such tunings are still useless for me. Anyone who believes that engine A is better than engine B only because it won a bullet test should go ahead and use such engines. But I'm absolutely sure that it doesn't matter which engine you use in server blitz games. Whether you use Leptir Analyzer or Charisma or Stockfish dev, it won't matter. The opening theory or the book you use is more important!

During the tests with the 3-move book, I noticed that it is always the same variations that lose. These are variations of the French, Caro-Kann, Pirc, Modern Defence and King's Indian.

This is also well known from server play.

Here is the losing variation; in the Charisma vs Stockfish match, both engines lost it with the black pieces:

Image
N2k1bnr/pp3ppp/2n1p3/3pPb2/q1pP4/4BN2/P1P1BPPP/1R2QRK1 b - - 0 13

I have seen this position on the board many times myself, in my own games or in my friends' games. I have added 7 games to my little match book:

Image

And who plays this variation? Stockfish or the clones, with both White and Black. And what's the point? The point is that in blitz and bullet games it is sometimes a coincidence that exactly this line is played. Sometimes a different line is chosen, but only by chance and not because it was recognized as better! And as I said before: it's always the same variations. Everything else ends in a draw!
jdart
Posts: 4405
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Chess Engine Leptir

Post by jdart »

I find it ridiculous that operators of online computer accounts care about getting and keeping a high rating.
For example, there's a high-rated account now on FICS that will not play accounts rated over 2600. It only plays lower rated accounts.
The ratings are meaningless, because accounts are playing against a small pool of players, and can be manipulated by restricting that pool even further. They also are not comparable to FIDE ratings or comparable across servers. It's just a number.
Eelco de Groot
Posts: 4666
Joined: Sun Mar 12, 2006 2:40 am
Full name:   Eelco de Groot

Re: Chess Engine Leptir

Post by Eelco de Groot »

jdart wrote: Sun Mar 19, 2023 3:16 pm I find it ridiculous that operators of online computer accounts care about getting and keeping a high rating.
For example, there's a high-rated account now on FICS that will not play accounts rated over 2600. It only plays lower rated accounts.
The ratings are meaningless, because accounts are playing against a small pool of players, and can be manipulated by restricting that pool even further. They also are not comparable to FIDE ratings or comparable across servers. It's just a number.
I was just thinking that, to sum up Eduard's argument, he seems to be saying that Elo is meaningless under certain circumstances, and this is just another example of that. I have not followed the discussion so far, so this is just a remark from the sidelines. Thanks, as Chessfun would say!
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Chess Engine Leptir

Post by Eduard »

The new star on the PlayChess server is currently Engine Dark SisTer 4.6. A player who has 3 accounts has been playing with it for a few days. Is he really playing? Of course not. With 3 accounts that won a total of 7 times against a Raspberry Pi 3, he was pushed over Elo 3000 in Blitz. He hasn't played blitz for days (only 16 min games now) and sits at the top of the table, basking in the sun.

I wouldn't say anything, since that's normal on this server, except: is it normal if all 3 of these accounts belong to a known correspondence chess grandmaster? Yes, you read correctly: a correspondence chess grandmaster boasts that he has won 7 times against 50 kN/s (a Raspberry Pi 3 with Stockfish); I counted all his recent wins. Now correspondence chess GMs are crazy about online blitz Elos. I was not expecting that, oh no! :shock:
CornfedForever
Posts: 648
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: Chess Engine Leptir

Post by CornfedForever »

Eduard wrote: Sun Mar 19, 2023 4:34 pm The new star on the PlayChess server is currently Engine Dark SisTer 4.6. A player who has 3 accounts has been playing with it for a few days. Is he really playing? Of course not. With 3 accounts that won a total of 7 times against a Raspberry Pi 3, he was pushed over Elo 3000 in Blitz. He hasn't played blitz for days (only 16 min games now) and sits at the top of the table, basking in the sun.

I wouldn't say anything, since that's normal on this server, except: is it normal if all 3 of these accounts belong to a known correspondence chess grandmaster? Yes, you read correctly: a correspondence chess grandmaster boasts that he has won 7 times against 50 kN/s (a Raspberry Pi 3 with Stockfish); I counted all his recent wins. Now correspondence chess GMs are crazy about online blitz Elos. I was not expecting that, oh no! :shock:
The phrase 'Correspondence GM' is an anachronism - has been for...20 or more years. It's time we put up a marker and threw some dirt over it.

In its place is an ego-driven exercise pumped up by technology that can easily boost anyone's 'rating' beyond the ICCF 2398 rating I achieved some 25 yrs ago. People...seem to need to feel good about themselves and are perfectly okay with letting technology do it for them. Welcome to 2023.
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Chess Engine Leptir

Post by Eduard »

The GM I'm talking about isn't just anyone. He is Vice European Champion, and more.
CornfedForever
Posts: 648
Joined: Mon Jun 20, 2022 4:08 am
Full name: Brian D. Smith

Re: Chess Engine Leptir

Post by CornfedForever »

Eduard wrote: Sun Mar 19, 2023 6:18 pm The GM I'm talking about isn't just anyone. He is Vice European Champion, and more.
Sarana? I didn't know he played 'correspondence'.
Graham Banks
Posts: 44375
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Chess Engine Leptir

Post by Graham Banks »

CornfedForever wrote: Sun Mar 19, 2023 5:25 pm The phrase 'Correspondence GM' is an anachronism - has been for...20 or more years. It's time we put up a marker and threw some dirt over it.

In its place is an ego-driven exercise pumped up by technology that can easily boost anyone's 'rating' beyond the ICCF 2398 rating I achieved some 25 yrs ago. People...seem to need to feel good about themselves and are perfectly okay with letting technology do it for them. Welcome to 2023.
Indeed.
I'm glad that I stopped playing correspondence chess in 1996.
I just can't understand what personal satisfaction one could derive from using chess engines to help them.
gbanksnz at gmail.com
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Chess Engine Leptir

Post by Eduard »

I will now test my engines in three ways.

1. With my position test 2023 (I may add more positions during the year, as well as corrections)
2. On the server with live games
3. With offline games

For my offline games I have selected 100 variations that also occur in practice:

Download EN-Tournament (PGN and CBH):
https://pixeldrain.com/u/L8P38FTU

These are variations that fit quite well. If someone sees it differently, it would be nice if you said why. If anyone has other variations that are good for ENG matches and are used in practice, it would be nice if you posted them here. Thanks.

I will let the future clones that I use on the server play these 100 variations against Stockfish dev.

I chose the time control 120s + 0.1s. With Ponder ON (important for me) and 7500 kN/s per engine, one run takes about 10 hours. With longer games it takes much longer, and with shorter times I could include more variations, but less than 120s is of no interest to me.

Ryzen 3900X
3 cores/engine, 7500 kN/s in the start position
GUI: Powerfritz 18
Hash: 128 MB (or 256 MB)
All 3-4-5-6-men Syzygy tablebases for GUI and engine
Ponder: ON
Time control: 120s + 0.1s
EN-Tournament (100 Variants)

There is currently a match going on between Charisma 180323 and Stockfish dev 140323, and after 120 games the score is level. Charisma 180323 took 3rd place in my position test (unofficially) with 96 solved positions. My goal is to create a clone that is better than Stockfish dev in position tests and can hold its own against it in the ENG match!