Komodo 13.01 MCTS - chess960 results and site updated

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by Laskos »

lkaufman wrote: Tue May 14, 2019 4:30 am
pohl4711 wrote: Mon May 13, 2019 11:09 am
Modern Times wrote: Sun May 12, 2019 6:35 am Yes FRC is the prime candidate.

The 50 positions were from the Rybka forum more than 10 years ago:

"Stefan Pohl has created a set of 50 shuffle chess positions, which can be used as an opening database for testing. There are NO castlings possible, and the set does NOT contain any Chess960 positions. 50 positions allow engine matches with 100 games (with switching sides per position)."

Anyway I stuck with chess960.
How about trying my Drawkiller openings?!

https://www.sp-cc.de/drawkiller-openings.htm

They look like this: kings on opposite sides of the board, queens not on the same file, and all non-pawns still on rows 1/8, with only 5 of the 16 non-pawns away from their normal positions.
That gives very, very low draw rates (nearly halved compared to classical opening sets); take a look at the testing results on my Drawkiller site. So Drawkiller is much better than FRC, and Drawkiller has many more different positions than FRC.

knbrqbnr/ppp2pp1/4p2p/3p4/3P4/4PP2/PPP3PP/RNBQRBNK w - - 0 0
The reason I don't like your Drawkiller openings (either for testing or to play myself) is that opposite-side castling is not the norm in chess. I'm sure they make for interesting games, but the emphasis on pawn-storms and direct attacks on the king makes it too different from normal chess for me. It's just not representative of standard chess, where positional play is critical. So far my tests indicate that the 18 positions from chess960 that allow normal castling on both sides are also much less drawish than normal chess, without the above objection. But, only 18 positions, too few!
Well, not that few. That's interesting, actually. From 18, using Lc0 at temperature say 0.3 or 0.4, and regular Komodo (with variety) on say 4 threads (both engines randomize well and consistently), a 2-mover (4 plies) book can contain in excess of 1,000 unique, reasonable, fairly balanced positions, and that's not bad as an opening set.
jp
Posts: 1470
Joined: Mon Apr 23, 2018 7:54 am

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by jp »

lkaufman wrote: Mon May 13, 2019 4:37 am This is the chess18 I mentioned in an earlier post. I've already run some engine matches with it and noticed a low draw percentage.
What win percentages did you get for White & Black? I guess you probably didn't run enough games for us to draw firm conclusions, though.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by Laskos »

I am attaching the EPD file of 1,000+ unique, reasonable 2-mover openings built with Komodo from these 18 positions. I didn't use Lc0 because I am not sure what temperature to set to stay safe from bad moves while still getting diversity. Komodo on 4 threads at 0.5s/move doesn't play stupid 2-movers, but has enough diversity.

First, could someone check the 18 openings? I put them into an EPD file by hand, and I hope I didn't do something stupid:

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
rbnqknbr/pppppppp/8/8/8/8/PPPPPPPP/RBNQKNBR w KQkq - 0 1
rbbqknnr/pppppppp/8/8/8/8/PPPPPPPP/RBBQKNNR w KQkq - 0 1
rnnqkbbr/pppppppp/8/8/8/8/PPPPPPPP/RNNQKBBR w KQkq - 0 1
rnbbkqnr/pppppppp/8/8/8/8/PPPPPPPP/RNBBKQNR w KQkq - 0 1
rbbnkqnr/pppppppp/8/8/8/8/PPPPPPPP/RBBNKQNR w KQkq - 0 1
rnnbkqbr/pppppppp/8/8/8/8/PPPPPPPP/RNNBKQBR w KQkq - 0 1
rbnnkqbr/pppppppp/8/8/8/8/PPPPPPPP/RBNNKQBR w KQkq - 0 1
rnbnkbqr/pppppppp/8/8/8/8/PPPPPPPP/RNBNKBQR w KQkq - 0 1
rnbbknqr/pppppppp/8/8/8/8/PPPPPPPP/RNBBKNQR w KQkq - 0 1
rbbnknqr/pppppppp/8/8/8/8/PPPPPPPP/RBBNKNQR w KQkq - 0 1
rnqnkbbr/pppppppp/8/8/8/8/PPPPPPPP/RNQNKBBR w KQkq - 0 1
rbqnknbr/pppppppp/8/8/8/8/PPPPPPPP/RBQNKNBR w KQkq - 0 1
rnqbknbr/pppppppp/8/8/8/8/PPPPPPPP/RNQBKNBR w KQkq - 0 1
rqbnkbnr/pppppppp/8/8/8/8/PPPPPPPP/RQBNKBNR w KQkq - 0 1
rqnbknbr/pppppppp/8/8/8/8/PPPPPPPP/RQNBKNBR w KQkq - 0 1
rqnnkbbr/pppppppp/8/8/8/8/PPPPPPPP/RQNNKBBR w KQkq - 0 1
rqbbknnr/pppppppp/8/8/8/8/PPPPPPPP/RQBBKNNR w KQkq - 0 1
Here is the EPD 2-mover opening file of 1,000+ reasonable positions from those 18. I will check the draw rates and such later.
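
For anyone who wants to check the hand-typed list mechanically, here is a minimal Python sketch. It assumes "Chess18" means the chess960 back ranks with rooks on a/h, the king on e, and the bishops on opposite-coloured squares, which is my reading of "normal castling on both sides". The names `POSTED`, `chess18_back_ranks` and `to_fen` are made up for illustration.

```python
from itertools import permutations

# White back ranks copied from the 18 EPD lines posted above.
POSTED = """
RNBQKBNR RBNQKNBR RBBQKNNR RNNQKBBR RNBBKQNR RBBNKQNR
RNNBKQBR RBNNKQBR RNBNKBQR RNBBKNQR RBBNKNQR RNQNKBBR
RBQNKNBR RNQBKNBR RQBNKBNR RQNBKNBR RQNNKBBR RQBBKNNR
""".split()


def chess18_back_ranks():
    """Enumerate back ranks with R on a/h, K on e, bishops on opposite colours."""
    free_files = [1, 2, 3, 5, 6]  # b, c, d, f, g as 0-based file indices
    ranks = set()
    for perm in set(permutations("QBBNN")):
        rank = list("R...K..R")
        for file_idx, piece in zip(free_files, perm):
            rank[file_idx] = piece
        bishops = [i for i, p in enumerate(rank) if p == "B"]
        # Squares on the same rank differ in colour iff their file parity differs.
        if (bishops[0] + bishops[1]) % 2 == 1:
            ranks.add("".join(rank))
    return ranks


def to_fen(rank):
    """Build the start-position line in the same format as the posted list."""
    return f"{rank.lower()}/pppppppp/8/8/8/8/PPPPPPPP/{rank} w KQkq - 0 1"


generated = chess18_back_ranks()
print(len(generated), "positions generated")
print("matches posted list:", generated == set(POSTED))
print("missing from post:", sorted(generated - set(POSTED)))
print("unexpected in post:", sorted(set(POSTED) - generated))
```

The count follows from 6 bishop placements (one of b/d/f times one of c/g) times 3 queen squares. The check only covers White's back rank as typed; the black rank is rebuilt by lowercasing in `to_fen`.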
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by lkaufman »

jp wrote: Tue May 14, 2019 7:49 am
lkaufman wrote: Mon May 13, 2019 4:37 am This is the chess18 I mentioned in an earlier post. I've already run some engine matches with it and noticed a low draw percentage.
What win percentages did you get for White & Black? I guess you probably didn't run enough games for us to draw firm conclusions, though.
No, not enough games, but the initial evals showed that the White advantage averaged about the same as in normal chess, with a fairly wide range from best to worst.
Komodo rules!
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by lkaufman »

Thanks, Kai. I am pleasantly surprised that you can get that many positions just using Komodo MP 4 and a small variety setting for two moves. That means that every move in the book has a real chance of being the "best" move, so White should retain roughly his initial advantage in all the lines. If this really cuts the draw percentage significantly compared to normal books, we might start using it. The reason it should help is that the first few moves of a game are rather critical, and if the engine has to make them rather than a book there is more chance of the game approaching the win/draw line in the opening.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by Laskos »

I measured the signal-to-noise ratio (S/N) of the opening books using my usual metric. The higher the S/N ratio, the better (more sensitive) the opening suite. There is a separate discussion about pentanomial variance and errors with paired openings (white-black), but here let's use the usual trinomial errors, as the suites (books) are fairly regular and balanced.

Stockfish_dev at 6''+0.06'' vs Stockfish_dev at 3''+0.03''

Error in S/N: roughly 0.5 (1 standard deviation).
The number of positions in each suite is given in parentheses.



2moves_v1 (40454)
Score of SF_dev2 vs SF_dev1: 1141 - 129 - 730 [0.753] 2000
Elo difference: 193.64 +/- 12.59
Finished match
S/N: 15.4


Chess18_Openings (1077)
Score of SF_dev2 vs SF_dev1: 1139 - 134 - 727 [0.751] 2000
Elo difference: 192.01 +/- 12.61
S/N: 15.2


4moves_GM (2668)
Score of SF_dev2 vs SF_dev1: 1089 - 143 - 768 [0.736] 2000
Elo difference: 178.53 +/- 12.28
Finished match
S/N: 14.5


3moves_Elo2200 (6533)
Score of SF_dev2 vs SF_dev1: 1090 - 126 - 784 [0.741] 2000
Elo difference: 182.61 +/- 12.15
Finished match
S/N: 15.0


Drawkiller tournament (6848)
Score of SF_dev2 vs SF_dev1: 1220 - 218 - 562 [0.750] 2000
Elo difference: 191.31 +/- 13.90
Finished match
S/N: 13.8


Drawkiller balanced small500 (500)
Score of SF_dev2 vs SF_dev1: 1228 - 199 - 573 [0.757] 2000
Elo difference: 197.63 +/- 13.87
S/N: 14.2


8moves_GM (48491)
Score of SF_dev2 vs SF_dev1: 1017 - 156 - 827 [0.715] 2000
Elo difference: 159.82 +/- 11.82
Finished match
S/N: 13.5


2moves_v1 is the SF testing framework file, and its shortcoming is that most of the positions are weird, being random. The Drawkiller books do indeed come with a low draw ratio, but a low S/N too, and they are also a bit artificial. 8moves_GM is a typical opening file that people often use, and it has a significantly lower S/N ratio.
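
For readers who want to recompute the figures: every S/N value listed above agrees with the Elo difference divided by the reported +/- margin (for example 193.64 / 12.59 = 15.4). The Python sketch below rebuilds the numbers from the win-loss-draw counts. It assumes the logistic Elo formula and an ordinary trinomial standard error with z = 1.96; that is my guess at what the match tool does, not something confirmed in this thread, so the margins may differ slightly in the second decimal. The function names are made up for illustration.

```python
import math


def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def trinomial_summary(wins, losses, draws, z=1.96):
    """Score, draw rate, Elo, approximate 95% margin and S/N from W-L-D counts."""
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    score = w + d / 2.0
    # Per-game variance of the outcome (1, 0.5 or 0) around the mean score.
    variance = w * (1.0 - score) ** 2 + l * score ** 2 + d * (0.5 - score) ** 2
    std_err = math.sqrt(variance / n)
    elo = elo_from_score(score)
    # Half-width of the Elo interval obtained by shifting the score by z standard errors.
    margin = (elo_from_score(score + z * std_err)
              - elo_from_score(score - z * std_err)) / 2.0
    return {"score": score, "draw_rate": d, "elo": elo,
            "margin": margin, "sn": elo / margin}


# W-L-D counts copied from the match results listed above.
RESULTS = {
    "2moves_v1": (1141, 129, 730),
    "Chess18_Openings": (1139, 134, 727),
    "4moves_GM": (1089, 143, 768),
    "3moves_Elo2200": (1090, 126, 784),
    "Drawkiller tournament": (1220, 218, 562),
    "Drawkiller balanced small500": (1228, 199, 573),
    "8moves_GM": (1017, 156, 827),
}

for name, (wins, losses, draws) in RESULTS.items():
    r = trinomial_summary(wins, losses, draws)
    print(f"{name:30s} draws {r['draw_rate']:.1%}  "
          f"Elo {r['elo']:7.2f} +/- {r['margin']:5.2f}  S/N {r['sn']:.1f}")
```

A lower draw rate widens the margin (more variance per game), which is why the Drawkiller suites can show similar Elo differences but a lower S/N.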

The competition seems to be between 3moves_Elo2200, built from human games with both players rated above FIDE Elo 2200, and the Chess18 openings. The positions in these suites are not random or stupid, and they are representative and fairly balanced, although Chess18 is not really chess. I can share 3moves_Elo2200, as I built it myself:
3moves_Elo2200.zip



I will further compare these two suites for "sensitivity" under somewhat different conditions.
Nordlandia
Posts: 2821
Joined: Fri Sep 25, 2015 9:38 pm
Location: Sortland, Norway

Re: Komodo 13.01 MCTS - chess960 results and site updated

Post by Nordlandia »

Based on my testing so far, White appears to have a considerable advantage in the Drawkiller opening position.

The set is certainly living up to its name.

i7-5960X 4.5GHz | 8-Core | 4096 Mb per engine | 5-men probing + automatic 6-men adjudication

[pgn]
[Event "i7-5960X 4.5GHz"]
[Site "i7-5960X 4.5GHz"]
[Date "2019.05.17"]
[Round "?"]
[White "stockfish_10_x64_bmi2"]
[Black "stockfish_x64_bmi2"]
[Result "1-0"]
[SetUp "1"]
[FEN "knbrqbnr/ppp1pppp/3p4/8/8/4P3/PPPP1PPP/RNBQRBNK w - - 0 11"]
[PlyCount "142"]
[EventDate "2019.??.??"]
[TimeControl "1800+30"]

11. e4 {0.74/33 88s} e5 {0.00/38 397s}
12. d4 {0.71/35 42} f5 {-0.02/32 31}
13. exf5 {0.73/33 53} Nf6 {-0.53/33 93s}
14. Nc3 {0.76/35 138s} Bxf5 {-0.11/31 33}
15. Nf3 {0.56/33 57} Be7 {-0.13/33 90s}
16. h3 {0.93/30 47} h6 {-0.20/31 24}
17. Bb5 {0.44/34 178s} Qf7 {0.00/36 71s}
18. Bd3 {0.45/35 56} Bxd3 {0.00/36 27}
19. cxd3 {0.48/35 64s} exd4 {0.00/36 34}
20. Nxd4 {0.42/36 58} Rd7 {0.00/35 31}
21. Qa4 {0.67/35 89s} Nd5 {-0.18/35 42}
22. Nxd5 {0.52/36 37} Qxd5 {0.00/36 34}
23. Ne6 {0.64/37 71s} Nc6 {0.00/38 33}
24. Rb1 {0.39/39 145s} Bg5 {0.00/37 37}
25. Nxg5 {0.62/38 43} hxg5 {-0.08/36 35}
26. Kg1 {0.67/36 22} Rf7 {0.00/38 37}
27. Qc4 {0.50/36 60s} Qf5 {0.00/41 37}
28. f3 {0.48/38 68s} a6 {0.00/42 55}
29. Qe6 {0.53/42 258s} Rhf8 {0.00/40 44}
30. Qxf5 {0.51/41 52} Rxf5 {0.00/42 56}
31. Re4 {0.59/39 65s} Ra5 {0.00/42 42}
32. Ra1 {0.32/34 46} Rd5 {0.00/43 75s}
33. Be3 {0.42/37 106s} Rxd3 {0.00/42 49}
34. Bxg5 {0.49/34 28} Rd5 {0.00/41 134s}
35. h4 {0.70/35 32} b5 {0.00/41 92s}
36. Rf4 {0.87/39 194s} Rxf4 { -0.73/41 222s}
37. Bxf4 {0.95/33 23} Kb7 {-0.37/38 75s}
38. g4 {1.01/34 22} Nb4 {-0.50/37 47}
39. h5 {1.01/39 209s} Kc8 {-0.74/42 363s}
40. Kf1 {1.04/34 15} Kd7 {-0.16/33 16}
41. Ke2 {1.17/33 25} Ke6 {-0.40/36 59}
42. a3 {0.95/38 44} Nd3 {-0.83/38 62s}
43. Bd2 {1.23/37 21} Kf7 {-0.69/36 49}
44. Bc3 {1.24/38 32} Nc5 {-0.94/37 28}
45. Rc1 {1.01/38 46} Na4 {-0.92/37 32}
46. Bd2 {1.00/39 38} Re5+ {-0.87/35 38}
47. Be3 {1.17/38 36} Re7 {-1.00/41 152s}
48. b3 {1.49/35 38} Nb6 {-1.07/36 16}
49. Rd1 {1.62/35 29} c6 {-1.07/37 38}
50. Kf2 {1.59/34 33} Nd5 {-1.15/37 29}
51. Bd2 {1.81/32 45} Rb7 {-0.95/36 42}
52. f4 {1.94/33 39} a5 {-1.62/37 89s}
53. Bxa5 {2.27/37 27} Ra7 {-1.72/37 18}
54. Bb4 {2.53/36 21} Nf6 {-1.53/42 14}
55. g5 {2.49/38 29} Nxh5 {-2.33/42 83s}
56. Rxd6 {2.54/37 17} Nxf4 {-2.40/35 22}
57. Ke3 {2.64/36 22} Nd5+ {-2.22/39 32}
58. Kd4 {2.75/39 37} Ne7 {-2.22/42 27}
59. Ke4 {2.96/36 18} Rc7 {-2.22/41 18}
60. Bc5 {3.00/37 38} Ng6 {-2.22/43 21}
61. a4 {3.21/36 20} bxa4 {-2.40/43 46}
62. bxa4 {3.30/39 26} Re7+ {-2.22/44 20}
63. Kd4 {3.38/36 24} Re1 {-3.12/35 39}
64. Rxc6 {3.50/37 42} Rd1+ {-3.64/37 58}
65. Kc3 {3.52/40 21} Ra1 {-3.81/39 31}
66. Kb4 {3.54/44 26} Rb1+ {-3.81/39 13}
67. Ka5 {3.62/46 52} Ne5 {-4.12/42 39}
68. Rc7+ {3.72/46 104s} Kg6 {-4.12/40 13}
69. Bd4 {3.90/42 19} Nf7 {-4.31/43 31}
70. Rc6+ { 4.35/37 36} Kf5 {-4.31/44 19}
71. g6 {4.72/36 19} Nh8 {-4.78/45 62s}
72. Bxg7 { 4.93/36 18} Nxg6 {-5.31/42 33}
73. Rf6+ {5.22/34 19} Kg5 {-5.31/42 12}
74. Rb6 {5.68/34 30} Rd1 {-5.36/39 18}
75. Kb5 {5.96/37 23} Kf5 {-5.97/43 58}
76. a5 { 7.31/38 58} Rd5+ {-5.97/40 11}
77. Ka4 {7.50/40 17} Rd7 {-6.25/47 43}
78. Bb2 { 7.94/39 31} Ne7 {-6.63/47 37}
79. a6 {8.56/34 21} Nc8 {-7.29/39 30}
80. Rf6+ { 9.80/34 36} Ke4 {-7.95/36 26}
81. Kb5 {11.63/33 50} Ra7 {adjudication -8.87/34 34s, White wins by adjudication}
1-0
[/pgn]


Brief lichess server analysis - https://lichess.org/9RtHt2XL#20