Kayra Tests

mehmet123
Posts: 692
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: Kayra Tests

Post by mehmet123 »

Eduard wrote: Tue Nov 15, 2022 3:03 pm
Woblis Rating List:

Rank  Name                   Elo    +    -   games  score  oppo.  draws
1     Kayra 1.5 avx2         3503   12   12  2200   71%    3372   57%
2     Kayra 1.6 avx2         3495   11   11  2500   62%    3422   73%
3     Kayra 1.4 avx2         3490   12   12  1900   64%    3406   70%
4     CorChess 3 171022      3488   16   15  1300   74%    3344   52%
5     Swordfish 15.3a-avx2   3485   21   21  500    51%    3481   90%

When I see this, I have to wonder why no one plays Kayra on PlayChess today. Most people use Stockfish dev. :roll:
And these are people who play day and night and know every engine! Another aspect: engines that can learn are more popular than other Stockfish clones. I understand that. If you play day and night, over many thousands of games, you have a large learning file. Many of the players even use this file as a book. It's difficult to win against such players.
It is not surprising that people choose Stockfish dev, since many people assume that the latest Stockfish version is the most powerful chess engine. Even leaving aside Kayra and some other Stockfish derivatives, the latest Stockfish version is not the strongest. Although many green patches were added to Stockfish afterwards, Stockfish 05/10/22 is slightly stronger than Stockfish 30/10/22 in VLTC tests. At VLTC (8 threads / 30 sec + 0.3 sec), the progress over the last 6 months is less than 3 Elo, and over the last 7 months less than 6 Elo. This time control is close to 4.5 min/game. But what about matches at even longer time controls? The progress over the last 7 months, since Stockfish 15, is probably in the range of 0 to 5 Elo.

At Fishtest, tests are run at 10 sec + 0.1 sec and 60 sec + 0.6 sec. 90% of the patches that pass the 10 sec + 0.1 sec tests fail the 60 sec + 0.6 sec tests. But what about longer time controls? How many of the patches that pass the 60 sec + 0.6 sec tests would also pass tests at 5 min, 10 min, 30 min or longer? I don't think this rate would be very high.
https://github.com/glinscott/fishtest/w ... sion-Tests
Eduard
Posts: 1439
Joined: Sat Oct 27, 2018 12:58 am
Location: Germany
Full name: N.N.

Re: Kayra Tests

Post by Eduard »

That's one reason why I'm also building my own engine. The Stockfish developers only test at bullet time controls. What else are they supposed to do? You need a lot of games to see whether a change brings progress. I look at a lot of Fishcooking code and parameters, then test the changes that look promising myself. However, most of these changes give poorer results for me in analysis mode. If you analyze a lot, as I do, and have experience, you can see that. But some of the code is good, and I can always improve something. In the third step, I test on PlayChess.com. I am not alone in this: I found three friends who test with me on Playchess. Yesterday we were all in the top 5. Analysis mode is the most important thing to me. Leptir 4 is impressively good for analysis. When we also do well live on the server, so much the better.

Example: In Leptir 4 I have implemented this code; the values are my own. You cannot adopt these values for every engine, because Leptir is less selective than Stockfish and differs from Stockfish dev in many other pieces of code and parameters.

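// Search one ply deeper if the reduced search beat alpha by a margin that grows with the reduction (newDepth - d).
const bool doDeeperSearch = value > (alpha + 78 + 11 * (newDepth - d));
// Search one ply shallower if the result is less than 12 above the best value found so far.
const bool doShallowerSearch = value < bestValue + 12;
value = -search<NonPV>(pos, ss+1, -(alpha+1), -alpha, std::max(1, newDepth + doDeeperSearch - doShallowerSearch), !cutNode);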
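
To make the effect of these three lines easier to follow outside the engine, here is a minimal standalone sketch. The function name researchDepth and the plain int parameters are assumptions for illustration only; in the engine the expression is evaluated inline inside the search, and the margins 78, 11 and 12 are the values quoted above.

```cpp
#include <algorithm>

// Hypothetical helper (not taken from Leptir or Stockfish): returns the depth
// that the re-search would use after a reduced search at depth d.
// The margins 78, 11 and 12 are the values quoted in the post.
int researchDepth(int value, int alpha, int bestValue, int newDepth, int d) {
    // Go one ply deeper when the reduced search beat alpha by a margin
    // that grows with the size of the reduction (newDepth - d).
    const bool doDeeperSearch = value > (alpha + 78 + 11 * (newDepth - d));
    // Go one ply shallower when the result is less than 12 above bestValue.
    const bool doShallowerSearch = value < bestValue + 12;
    // Never search at a depth below 1.
    return std::max(1, newDepth + doDeeperSearch - doShallowerSearch);
}
```

For example, with alpha = 50, bestValue = 60, newDepth = 10 and d = 7, a value of 200 clears the deeper-search margin (50 + 78 + 33 = 161) and the re-search runs at depth 11, while a value of 55 falls below bestValue + 12 and the re-search runs at depth 9.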

Here is a nice win against Lc0 at a 16-minute time control. Such games are my fun! :))

[pgn][Event "Rated game, 16 min"]
[Site "Engine Room"]
[Date "2022.11.15"]
[Round "?"]
[White "Solista, Leptir 4"]
[Black "Lc0 v0.29.0-rc0"]
[Result "1-0"]
[ECO "A28"]
[WhiteElo "2425"]
[BlackElo "2458"]
[Annotator "0.14;0.08"]
[PlyCount "80"]
[EventDate "2022.11.15"]
[SourceTitle "playchess.com"]
[TimeControl "960"]
{Lc0 v0.29.0-rc0 (36 threads): 13.9 plies; 21kN/s Intel(R) Core(TM) i9-7980XE CPU @ 2.60GHz 2592MHz, (18 cores, 36 threads), Solista Attack v3.2.ctg, 2048 MB} 1. c4 {B 0} e5 {B 0} 2. Nc3 {B 0} Nf6 {B 0} 3. Nf3 {B 0} Nc6 {B 0} 4. e4 {B 0} Bb4 {B 0} 5. d3 {B 0} d6 {B 0} 6. a3 {B 0} Bc5 {B 0} 7. Be3 {B 0} Bb6 {B 0} 8. Be2 {0.14/46 61} Nd4 {0.08/12 80 (Bg4)} 9. Bxd4 {0.20/33 8} Bxd4 {0.09/15 6} 10. Nxd4 {0.14/36 11} exd4 {0.09/15 2} 11. Nd5 {0.16/35 10} Nxd5 {0.09/14 47 (Nd7)} 12. cxd5 {0.15/38 13} c6 {0.08/15 3 (Bd7)} 13. Qa4 {0.18/33 9} Qb6 {0.09/15 5} 14. Qb4 {0.13/31 7} c5 {0.09/14 19 (Ke7)} 15. Qd2 {0.12/33 22} a5 {0.08/14 7 (0-0)} 16. f4 {0.14/36 56} Bd7 {0.08/13 26} 17. O-O {0.07/34 0} O-O {0.09/13 7} 18. Rae1 {0.18/32 4} Rac8 {0.10/13 52 (c4)} 19. Bd1 {0.23/34 32} f6 {0.10/11 1} 20. b3 {0.16/39 7} Ra8 {0.09/12 69 (Qa6)} 21. h3 {0.18/37 52 (h4)} h6 {0.09/12 1 (a4)} 22. a4 {0.39/32 14 (h4)} Qb4 {0.14/14 20 (Qa6)} 23. Qe2 {0.32/35 11} Rae8 {0.17/17 24} 24. Qf3 {0.22/42 14 (Qf2)} b5 {0.01/16 18} 25. axb5 {0.19/37 0 (Qg3)} Bxb5 {0.02/13 18 (Qxb5)} 26. Qg3 {0.50/34 12} Re7 {0.02/12 1 (Qc3)} 27. e5 {0.85/33 14 (Bg4)} Qd2 {0.40/21 41} 28. exf6 {0.99/37 0} Ref7 {0.39/23 7} 29. fxg7 {0.94/38 24} Rxg7 {0.47/28 0} 30. Bg4 {0.71/40 40} Qxd3 {0.44/22 1 (h5)} 31. Rf3 {1.01/33 12 (Qh4)} Qd2 {0.24/18 43} 32. Qh4 {1.24/37 7} Bd7 {0.49/31 48 (Be2)} 33. Re6 {2.37/32 14} Bxe6 {0.57/24 1} 34. dxe6 {2.92/30 29} Qc2 {0.49/39 5 (Qd1+)} 35. f5 {3.60/31 17} Qc1+ {0.48/40 3} 36. Kh2 {3.84/34 11} Qg5 {0.46/39 10} 37. Qxg5 {3.94/32 21} Rxg5 {0.46/39 3} 38. f6 {4.18/31 10} Rb8 {0.45/37 4} 39. Bf5 {4.35/31 8 (Rf5)} a4 {1.44/20 67 (c4)} 40. bxa4 {5.45/29 13} c4 {1.84/21 6 (d3) xxxxx,Lc0 v0.29.0-rc0 resigns (Lag: Av=0.22s, max=0.7s)} 1-0[/pgn]

Re: Kayra Tests

Post by mehmet123 »

Engine Tournament:

   Program                     Elo    +   -   Games   Score    Av.Op.  Draws

 1 Kayra 1.7 bmi2            : 2401   5   4   792     50.2 %   2400    96.6 %
 2 Genko 1.0a bmi2           : 2400   4   4   792     49.9 %   2400    97.1 %
 3 Stockfish 15.1 x64 bmi2   : 2399   5   5   792     49.9 %   2400    96.2 %

Individual statistics:

1 Kayra 1.7 bmi2 : 2401 792 (+ 15,=765,- 12), 50.2 %

Genko 1.0a bmi2 : 396 (+ 6,=386,- 4), 50.3 %
Stockfish 15.1 x64 bmi2 : 396 (+ 9,=379,- 8), 50.1 %

2 Genko 1.0a bmi2 : 2400 792 (+ 11,=769,- 12), 49.9 %

Kayra 1.7 bmi2 : 396 (+ 4,=386,- 6), 49.7 %
Stockfish 15.1 x64 bmi2 : 396 (+ 7,=383,- 6), 50.1 %

3 Stockfish 15.1 x64 bmi2 : 2399 792 (+ 14,=762,- 16), 49.9 %

Kayra 1.7 bmi2 : 396 (+ 8,=379,- 9), 49.9 %
Genko 1.0a bmi2 : 396 (+ 6,=383,- 7), 49.9 %


Game conditions: Cutechess GUI, 1 core, Core i7-12700H (14 cores / 20 threads), concurrency: 18, 5 min + 1 sec TC, Balsa 5-move opening book, 512 MB hash, ponder off
http://www.mediafire.com/file/281frfl1a8wuwti/c29.pgn
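
As a rough cross-check of the table above, the Elo difference implied by a result can be estimated from the score fraction with the usual logistic formula. The sketch below is only an illustration: the function names are made up here, and the rating tool that produced the table may use a different model, so small deviations are expected.

```cpp
#include <cmath>

// Hypothetical helper names; the logistic Elo model is an assumption here.

// Score fraction from wins/draws/losses (a draw counts as half a point).
double scoreFraction(int wins, int draws, int losses) {
    const double games = wins + draws + losses;
    return (wins + 0.5 * draws) / games;
}

// Elo difference implied by a score fraction strictly between 0 and 1.
double eloFromScore(double score) {
    return -400.0 * std::log10(1.0 / score - 1.0);
}

// Approximate 95% margin in Elo, derived from the per-game score variance.
double eloMargin95(int wins, int draws, int losses) {
    const double n = wins + draws + losses;
    const double s = scoreFraction(wins, draws, losses);
    const double variance = (wins * (1.0 - s) * (1.0 - s)
                           + draws * (0.5 - s) * (0.5 - s)
                           + losses * s * s) / n;
    const double stdErrScore = std::sqrt(variance / n);
    // Half the width of the Elo interval spanned by score +/- 1.96 standard errors.
    return (eloFromScore(s + 1.96 * stdErrScore)
          - eloFromScore(s - 1.96 * stdErrScore)) / 2.0;
}
```

For Kayra 1.7 (+15 =765 -12), scoreFraction gives about 0.502, which eloFromScore turns into roughly +1.3 Elo against the field, with a 95% margin of roughly 4.5 Elo. That is in line with the 2401 (+5/-4) reported above.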


Even though Kayra 1.7 won this tournament, its performance in longer games was far below my expectations.
On the Woblis Rating List (4 cores / 4 min + 2 sec), the performance of Kayra 1.7 disappointed me. In this rating list, Kayra 1.7 unfortunately ranks below the other Kayra versions released in May 2022 (Kayra 1.4), July 2022 (Kayra 1.5) and September 2022 (Kayra 1.6).

Rank  Name                   Elo    +    -   games  score  oppo.  draws
1     Kayra 1.5 avx2         3501   11   11  2400   69%    3381   61%
2     Kayra 1.6 avx2         3494   10   10  2600   62%    3424   74%
3     Kayra 1.4 avx2         3489   11   11  2100   63%    3413   73%
4     CorChess 3 171022      3486   14   14  1500   71%    3362   58%
5     Kayra 1.7 avx2         3485   13   13  1500   52%    3475   95%
.
.
https://www.mediafire.com/folder/6b585g ... /Documents
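
One simple way to read the list above is to check whether the error bars of two entries overlap. This is only a rule of thumb, not a formal significance test, and the Rating struct and intervalsOverlap function below are hypothetical names used for illustration.

```cpp
// Hypothetical types for illustration: a rating with its +/- error bars.
struct Rating {
    int elo;
    int plus;
    int minus;
};

// True when the intervals [elo - minus, elo + plus] of a and b overlap.
bool intervalsOverlap(const Rating& a, const Rating& b) {
    return a.elo + a.plus >= b.elo - b.minus
        && b.elo + b.plus >= a.elo - a.minus;
}
```

With Kayra 1.5 (3501, +11/-11) and Kayra 1.7 (3485, +13/-13), the intervals are [3490, 3512] and [3472, 3498], so they overlap and the function returns true.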


Failure is not a step backward; it’s an excellent stepping stone to success.

Re: Kayra Tests

Post by mehmet123 »

Kayra 1.5 vs. Stockfish 15.1:

   Program                     Elo    +   -   Games   Score    Av.Op.  Draws

 1 Kayra 1.5 bmi2            : 2401   5   5   440     50.3 %   2399    97.5 %
 2 Stockfish 15.1 x64 bmi2   : 2399   5   5   440     49.7 %   2401    97.5 %

Individual statistics:

1 Kayra 1.5 bmi2 : 2401 440 (+ 7,=429,- 4), 50.3 %

Stockfish 15.1 x64 bmi2 : 440 (+ 7,=429,- 4), 50.3 %

2 Stockfish 15.1 x64 bmi2 : 2399 440 (+ 4,=429,- 7), 49.7 %

Kayra 1.5 bmi2 : 440 (+ 4,=429,- 7), 49.7 %

Game conditions: Cutechess GUI, 1 core, Core i7-12700H (14 cores / 20 threads), concurrency: 14, 5 min + 1 sec TC, Balsa 5-move opening book, 512 MB hash, ponder off
https://www.mediafire.com/file/9alnu69o ... 2.pgn/file

Kayra 1.5, which was released in July 2022, was not beaten by Stockfish 15.1 in this match and finished narrowly ahead (+7 =429 -4).
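
How much a +7 -4 result says about relative strength can be estimated with the likelihood of superiority (LOS), a common approximation that uses only the decisive games. The sketch below assumes that approximation; the function name is chosen here for illustration, and at least one decisive game is required.

```cpp
#include <cmath>

// Likelihood of superiority from decisive games only. In this approximation
// draws carry no information about which side is stronger.
// Precondition: wins + losses > 0.
double likelihoodOfSuperiority(int wins, int losses) {
    const double decisive = wins + losses;
    return 0.5 * (1.0 + std::erf((wins - losses) / std::sqrt(2.0 * decisive)));
}
```

For 7 wins and 4 losses this gives a LOS of roughly 0.82, so the result points towards Kayra 1.5 but is far from conclusive with this few decisive games.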