Komodo-Dragon-2 vs Stockfish 14 at knight odds

lkaufman
Posts: 6281
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by lkaufman »

Chessqueen wrote: Wed Sep 22, 2021 5:38 pm
lkaufman wrote: Wed Sep 22, 2021 7:31 am
Rebel wrote: Wed Sep 22, 2021 7:15 am
lkaufman wrote: Wed Sep 22, 2021 2:43 am So bishops are indeed worth more than knights (at least when bishop pair is broken for the side losing the bishop), no surprise there. But it is interesting that Stockfish lost much more than Komodo from this, SF score was nearly cut in half going from knight odds to bishop odds! Regarding rook odds, it is roughly a class (200 elo) larger handicap than knight odds, so a field in the 2500 to 2530 range for opponents might be more balanced, but anyway it will be interesting.
At the moment I am doing queen odds, just to be complete.

The rook epd is not good, see:

rnbqkb1r/ppp1pppp/3p4/3nP3/3P4/8/PPP2PPP/1NBQKBNR w KQkq - 0 4; v=-526
r1bqkb1r/pppnpppp/3p1n2/8/2PP4/2N5/PP2PPPP/2BQKBNR w KQkq - 2 4; v=-529
rnbqk1nr/ppp1ppbp/3p2p1/8/3PP3/5N2/PPP2PPP/1NBQKB1R w KQkq - 0 4; v=-536
r1bqkb1r/ppp1pppp/2n2n2/3p4/2PP4/4P3/PP3PPP/1NBQKBNR w KQkq - 1 4; v=-538
Castling flags are wrong and positions are ignored by cute.

Does somebody have a good rook odds epd of (at least) 100 positions?
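The castling problem in the positions above can be checked mechanically: White's queenside rook is missing from a1, yet the castling field still says KQkq. Below is a minimal Python sketch of such a check, assuming standard (non-Chess960) castling where each flag requires the king and the matching rook on their home squares. The function names are hypothetical and not part of any GUI or tool mentioned in this thread.

```python
# Squares that must hold the given pieces for each castling flag to be legal
# (standard chess only; Chess960 is deliberately not handled).
REQUIRED = {
    "K": (("e1", "K"), ("h1", "R")),
    "Q": (("e1", "K"), ("a1", "R")),
    "k": (("e8", "k"), ("h8", "r")),
    "q": (("e8", "k"), ("a8", "r")),
}


def expand_rank(rank):
    """Turn one FEN rank such as '1NBQKBNR' into a list of 8 squares."""
    squares = []
    for ch in rank:
        if ch.isdigit():
            squares.extend([None] * int(ch))
        else:
            squares.append(ch)
    return squares


def piece_at(board, square):
    """Look up a square like 'a1'; board[0] is rank 8, board[7] is rank 1."""
    file_index = ord(square[0]) - ord("a")
    rank_number = int(square[1])
    return board[8 - rank_number][file_index]


def invalid_castling_flags(fen):
    """Return the castling flags that contradict the piece placement."""
    fields = fen.split()
    placement, castling = fields[0], fields[2]
    if castling == "-":
        return []
    board = [expand_rank(rank) for rank in placement.split("/")]
    bad = []
    for flag in castling:
        needed = REQUIRED.get(flag)
        if needed is None or any(piece_at(board, sq) != p for sq, p in needed):
            bad.append(flag)
    return bad


def repair_castling(fen):
    """Drop the contradictory flags and return the corrected FEN string."""
    fields = fen.split()
    bad = set(invalid_castling_flags(fen))
    kept = "".join(flag for flag in fields[2] if flag not in bad)
    fields[2] = kept or "-"
    return " ".join(fields)
```

For an EPD line with a trailing operation such as "; v=-526", pass only the part before the semicolon, for example line.split(";")[0]. Run on the first position above, the check flags only Q, and the repaired castling field becomes Kkq.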
At queen odds, if you keep the same field, I don't think that either Dragon or Stockfish will get more than a few draws in 700 games, maybe not even that. Probably you need engines about a thousand elo lower than these for a reasonably close match at queen odds even at this bullet tc.
To get an even score against Dragon2 at queen odds, I tested more than 1000 games at an average of 1 second per move with this field:

462      Casper rev4 64-bit   1579  +24  −24  50.2%    −2.6  23.9%   624  60.5%
463      PolarChess 1.3       1574  +25  −25  49.3%    +3.5  16.9%   629  56.3%
464‑465  Darky 0.5d           1571  +24  −24  43.5%   +56.2  20.8%   677  49.7%
464‑465  Storm 0.6            1571  +21  −21  38.3%   +95.3  15.0%   925  74.3%
466      Damas 9              1560  +25  −25  47.8%   +17.1  17.6%   626  76.1%
467      IQ23.003             1547  +21  −21  33.6%  +118.6  32.6%   854  72.6%
468      Cicada 0.1 64-bit    1536  +24  −25  44.1%   +48.6  19.7%   636  65.6%


To get an even score against Dragon2 at rook odds, I tested more than 1000 games at an average of 1 second per move with this field:

357  Kurt 0.9.2.2 64-bit  2166  +21  −21  47.7%  +18.0  24.2%   828  56.9%
358  ProChess 1.02AD      2164  +19  −19  48.4%   +6.3  22.1%  1043  51.8%
359  Chesley r323 64-bit  2163  +22  −22  48.1%  +14.4  20.8%   804  51.3%
360  Micah 1.0 64-bit     2162  +25  −25  44.0%  +45.9  25.4%   566  73.2%
361  KnockOut 0.7.1       2153  +15  −15  51.1%   −8.0  26.6%  1574  81.4%
Something seems wrong with your rook odds results. You are saying that engines with CCRL blitz ratings in the 2160s scored about 47% with Dragon2 at rook odds at 40 moves in 40 seconds or something similar? Is this on one thread? This seems totally out of line with the performance ratings Ed reports at rook odds for Dragon2. He was getting 2545 at rook odds, you are getting under 2200. This is crazy.
Komodo rules!
Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Chessqueen »

lkaufman wrote: Wed Sep 22, 2021 7:45 pm
Something seems wrong with your rook odds results. You are saying that engines with CCRL blitz ratings in the 2160s scored about 47% with Dragon2 at rook odds at 40 moves in 40 seconds or something similar? Is this on one thread? This seems totally out of line with the performance ratings Ed reports at rook odds for Dragon2. He was getting 2545 at rook odds, you are getting under 2200. This is crazy.
Yes, one thread, since all the weaker engines are also using 1 thread. What is Ed getting for knight odds? It should be at least 100 points more than with rook odds.

Most of the games against Knockout were like this one:

[pgn][Event "Rook Odds"]
[Site "MININT-UB2PIMJ"]
[Date "2021.09.22"]
[Round "?"]
[White "Dragon-2-64bit-avx2"]
[Black "Knockout"]
[Result "0-1"]
[BlackElo "2153"]
[Time "13:29:19"]
[WhiteElo "3575"]
[TimeControl "0+1"]
[SetUp "1"]
[FEN "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/1NBQKBNR w Kkq - 0 1"]
[Termination "normal"]
[PlyCount "171"]
[WhiteType "program"]
[BlackType "program"]

1. Nf3 Nc6 2. e4 Nf6 3. e5 Nd5 4. c4 Nb6 5. d4 d6 6. e6 Bxe6 7. d5 Bf5 8.
dxc6 Bxb1 9. cxb7 Rb8 10. c5 Nd7 11. c6 Nb6 12. Bg5 Be4 13. Nd4 h6 14. Bh4
Bd5 15. f3 g5 16. Bf2 e5 17. Bb5 Bxa2 18. b3 exd4 19. Bxd4 Rg8 20. O-O Bg7
21. Bxg7 Rxg7 22. Qc2 Bxb3 23. Qxb3 g4 24. Qa3 gxf3 25. Rxf3 Rg5 26. Qa1
Re5 27. Qa2 Re1+ 28. Bf1 d5 29. Qc2 Qd6 30. g3 Qe6 31. Qc5 h5 32. Qa5 a6
33. Kf2 Rxf1+ 34. Kxf1 Qxc6 35. Qe1+ Kf8 36. Qe5 Kg8 37. Qxh5 Qg6 38. Qh3
Rxb7 39. Rf4 Qd3+ 40. Kg1 Qe3+ 41. Kf1 Qc1+ 42. Kg2 Qd2+ 43. Rf2 Qd3 44.
Rf4 Qe2+ 45. Kg1 Nd7 46. Qf5 Qe3+ 47. Kf1 Qe6 48. Qg5+ Kf8 49. Qd8+ Qe8 50.
Qxe8+ Kxe8 51. h4 Rb1+ 52. Kg2 Nc5 53. h5 Ne4 54. Rf5 c6 55. h6 Rb2+ 56.
Kh3 Ke7 57. g4 Rb3+ 58. Kh2 Rb8 59. Kg2 Rh8 60. g5 Rg8 61. h7 Rh8 62. Kf3
Rxh7 63. Re5+ Kd6 64. Re8 Nxg5+ 65. Ke2 Rh1 66. Rg8 Ne6 67. Rg3 Kc7 68. Kd2
Kb8 69. Rb3+ Ka7 70. Rf3 Ng5 71. Rf4 Kb6 72. Ke3 Rh2 73. Rb4+ Kc7 74. Rb1
Ne6 75. Rd1 Kb8 76. Kd3 Ka7 77. Kc3 a5 78. Rf1 Ng5 79. Rg1 Rh3+ 80. Kc2 Ne6
81. Ra1 Kb6 82. Rb1+ Ka6 83. Rg1 Kb7 84. Ra1 Rh2+ 85. Kc3 Ka6 86. Kd3
{White resigns} 0-1[/pgn]
Last edited by Chessqueen on Wed Sep 22, 2021 8:48 pm, edited 1 time in total.
Rebel
Posts: 7468
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

KNIGHT ODDS - 2500 elo engines

Stockfish 14

KNIGHT odds match Stockfish vs a pool of 2500 elo rated engines
Time Control : 40/40
Games        : 600

Results from file all.pgn:
 
No. Name           Win Draw Loss Unf.  Score Games       %
----------------------------------------------------------
  1 Stockfish 14  +432  =32 -136   *0  448.0   600   74.7%
  2 Foxsee 7.20.1  +30   =6  -64   *0   33.0   100   33.0%
  3 Monolith 0.3   +26   =5  -69   *0   28.5   100   28.5%
  4 Loki 3.5.0     +24   =8  -68   *0   28.0   100   28.0%
  5 Nalwald 1.8.1  +26   =3  -71   *0   27.5   100   27.5%
  6 Marvin 2.0     +17   =5  -78   *0   19.5   100   19.5%
  7 CT800 1.43     +13   =5  -82   *0   15.5   100   15.5%

Total Games:     600
White Wins:      132 (22.0%)
Black Wins:      436 (72.7%)
Draws:            32 (5.3%)
Unfinished:        0 (0.0%)

Estimated ratings for this elo 2500 pool

   # PLAYER           :  RATING  POINTS  PLAYED   (%)
   1 Stockfish 14     :  2666.6   448.0     600    75
   2 Foxsee 7.20.1    :  2542.5    33.0     100    33
   3 Monolith 0.3     :  2505.4    28.5     100    29
   4 Loki 3.5.0       :  2501.1    28.0     100    28
   5 Nalwald 1.8.1    :  2496.7    27.5     100    28
   6 Marvin 2.0       :  2418.2    19.5     100    20
   7 CT800 1.43       :  2369.4    15.5     100    16
Komodo-Dragon 2

KNIGHT odds match Komodo Dragon 2 vs a pool of 2500 elo rated engines
Time Control : 40/40
Games        : 600

Results from file all.pgn:

No. Name             Win Draw Loss Unf.  Score Games       %
------------------------------------------------------------
  1 Komodo-Dragon 2 +505  =44  -51   *0  527.0   600   87.8%
  2 Loki 3.5.0       +13   =9  -78   *0   17.5   100   17.5%
  3 Nalwald 1.8.1    +12   =9  -79   *0   16.5   100   16.5%
  4 Foxsee 7.20.1    +11   =5  -84   *0   13.5   100   13.5%
  5 Monolith 0.3      +5  =13  -82   *0   11.5   100   11.5%
  6 Marvin 2.0        +8   =5  -87   *0   10.5   100   10.5%
  7 CT800 1.43        +2   =3  -95   *0    3.5   100    3.5%

Total Games:     600
White Wins:      205 (34.2%)
Black Wins:      351 (58.5%)
Draws:            44 (7.3%)
Unfinished:        0 (0.0%)

Estimated ratings for this elo 2500 pool

   # PLAYER             :  RATING  POINTS  PLAYED   (%)
   1 Komodo-Dragon 2    :  2813.7   527.0     600    88
   2 Loki 3.5.0         :  2541.9    17.5     100    18
   3 Nalwald 1.8.1      :  2529.5    16.5     100    17
   4 Foxsee 7.20.1      :  2488.2    13.5     100    14
   5 Monolith 0.3       :  2456.1    11.5     100    12
   6 Marvin 2.0         :  2438.2    10.5     100    11
   7 CT800 1.43         :  2232.4     3.5     100     4
Komodo : 87.8%
Stockfish : 74.7%
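
For readers who want to put these two percentages on the Elo scale, here is a minimal Python sketch using the standard logistic Elo expectation formula. It treats the whole opponent pool as a single opponent, which is a simplifying assumption, so it will not reproduce the per-engine ratings in the tables above (those appear to come from a rating tool that fits all players jointly). The function name elo_diff is hypothetical.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction strictly between 0 and 1.

    Inverts the logistic expectation E = 1 / (1 + 10 ** (-d / 400)).
    """
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Score fractions taken from the two knight odds matches above.
komodo_margin = elo_diff(0.878)
stockfish_margin = elo_diff(0.747)
```

Under that assumption the two scores work out to roughly 343 and 188 Elo above the pool average, so the gap between the engines at knight odds is on the order of 150 Elo in this test.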


Next, Bishop odds, same URL - http://rebel13.nl/a/grl.htm
90% of coding is debugging, the other 10% is writing bugs.
Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Chessqueen »

lkaufman wrote: Wed Sep 22, 2021 7:45 pm
Something seems wrong with your rook odds results. You are saying that engines with CCRL blitz ratings in the 2160s scored about 47% with Dragon2 at rook odds at 40 moves in 40 seconds or something similar? Is this on one thread? This seems totally out of line with the performance ratings Ed reports at rook odds for Dragon2. He was getting 2545 at rook odds, you are getting under 2200. This is crazy.
And Knockout beat Stockfish 14 much faster in the majority of games:

[pgn][Event "Computer chess game"]
[Site "MININT-UB2PIMJ"]
[Date "2021.09.22"]
[Round "?"]
[White "Stockfish_14_x64_bmi2"]
[Black "Knockout"]
[Result "0-1"]
[BlackElo "2153"]
[Time "13:46:34"]
[WhiteElo "3600"]
[TimeControl "0+1"]
[SetUp "1"]
[FEN "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/1NBQKBNR w Kkq - 0 1"]
[Termination "normal"]
[PlyCount "79"]
[WhiteType "program"]
[BlackType "program"]

1. d4 d5 2. Nf3 e6 3. e3 Nf6 4. c3 Bd6 5. Be2 O-O 6. O-O Nbd7 7. Nbd2 c6 8.
b4 e5 9. dxe5 Nxe5 10. Bb2 Nxf3+ 11. Nxf3 Qe7 12. c4 dxc4 13. Bxc4 Bf5 14.
a3 Rfd8 15. Qb3 a5 16. Rc1 Ne4 17. h4 axb4 18. axb4 b5 19. Bd3 Bxb4 20. Qc2
Ra2 21. Rb1 Rxd3 22. Qxd3 Ng3 23. Qd1 Ne2+ 24. Kf1 Nc3 25. Bxc3 Bxc3 26.
Rc1 b4 27. Qb3 Be6 28. Qb1 h6 29. Kg1 Qf6 30. Qd3 c5 31. e4 c4 32. Qe3 Bb2
33. Re1 b3 34. g3 Be5 35. Rb1 c3 36. Kg2 b2 37. Nxe5 Qxe5 38. Qd3 Ra1 39.
Qc2 Qd4 40. h5 {White resigns} 0-1[/pgn]
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Guenther »

Chessqueen wrote: Wed Sep 22, 2021 8:52 pm
[TimeControl "0+1"]
It's all because of your exotic time control, which was clear when you posted your first game with that tc,
but it seems no one else noticed...

(and now you have polluted Ed's thread totally)
https://rwbc-chess.de

[Trolls don't exist...]
Uri Blass
Posts: 11148
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Uri Blass »

lkaufman wrote: Wed Sep 22, 2021 7:45 pm
Something seems wrong with your rook odds results. You are saying that engines with CCRL blitz ratings in the 2160s scored about 47% with Dragon2 at rook odds at 40 moves in 40 seconds or something similar? Is this on one thread? This seems totally out of line with the performance ratings Ed reports at rook odds for Dragon2. He was getting 2545 at rook odds, you are getting under 2200. This is crazy.
Something seems to be wrong with Ed's results: looking at the games, Fruit 2.1 clearly reaches greater depths in the same time on my hardware, and my hardware is not fast.

I looked not only at games that Fruit lost but also at games that Fruit won.
Uri Blass
Posts: 11148
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Uri Blass »

Uri Blass wrote: Wed Sep 22, 2021 9:14 pm
Something seems to be wrong with Ed's results: looking at the games, Fruit 2.1 clearly reaches greater depths in the same time on my hardware, and my hardware is not fast.

I looked not only at games that Fruit lost but also at games that Fruit won.
I see that it is probably my mistake: Arena simply showed the wrong times when I copied a game into Arena to look at it.
Rebel
Posts: 7468
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

Guenther wrote: Wed Sep 22, 2021 9:00 pm
Chessqueen wrote: Wed Sep 22, 2021 8:52 pm
[TimeControl "0+1"]
It's all because of your exotic time control, which was clear when you posted your first game with that tc,
but it seems no one else noticed...

(and now you have polluted Ed's thread totally)
Indeed.

I keep copies here.
90% of coding is debugging, the other 10% is writing bugs.
Chessqueen
Posts: 5685
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Chessqueen »

Uri Blass wrote: Wed Sep 22, 2021 9:28 pm
I see that it is probably my mistake: Arena simply showed the wrong times when I copied a game into Arena to look at it.
It could also be that Ed is using more than 1 thread, whereas I am using only 1 thread to make it fair to the weaker engines competing against either Komodo Dragon or Stockfish 14. Also, I use a time per move of 1 second, and that could yield a different result from a 40/40 time control, even though that also averages 1 second per move.
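
One way to see how the two settings differ is to read both strings as PGN TimeControl descriptors: "40/40" is the moves/seconds form and "0+1" is the base+increment form. The Python sketch below parses a single descriptor and computes the largest clock an engine could have in front of it at a given move. It assumes that no time has been spent so far, that unused time carries over between periods, and that the increment is credited after each completed move. How Arena actually applies "0+1" is not stated in this thread, and the function names are hypothetical.

```python
def parse_time_control(tc):
    """Parse one PGN TimeControl descriptor (no ':'-separated periods)."""
    if tc == "?":
        return {"kind": "unknown"}
    if tc == "-":
        return {"kind": "none"}
    if tc.startswith("*"):
        return {"kind": "sandclock", "seconds": int(tc[1:])}
    if "/" in tc:
        moves, seconds = tc.split("/")
        return {"kind": "moves_in_time", "moves": int(moves), "seconds": int(seconds)}
    if "+" in tc:
        base, increment = tc.split("+")
        return {"kind": "increment", "base": int(base), "increment": int(increment)}
    return {"kind": "sudden_death", "seconds": int(tc)}


def max_clock_before_move(tc, move_number):
    """Largest possible clock (seconds) before the given move, 1-indexed.

    Assumes zero time spent so far; see the assumptions in the lead-in.
    """
    parsed = parse_time_control(tc)
    if parsed["kind"] == "moves_in_time":
        periods_started = (move_number - 1) // parsed["moves"] + 1
        return parsed["seconds"] * periods_started
    if parsed["kind"] == "increment":
        return parsed["base"] + parsed["increment"] * (move_number - 1)
    if parsed["kind"] == "sudden_death":
        return parsed["seconds"]
    raise ValueError("no clock defined for time control: " + tc)
```

Under these assumptions, 40/40 puts the whole 40-second budget on the clock at move 1, so an engine can spend several seconds on a critical early move, while 0+1 never allows more than the increments accumulated so far. Both average about one second per move, but the time can be distributed very differently.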
cc2150dx
Posts: 438
Joined: Sat Nov 30, 2013 9:51 am
Full name: Jason Coombs

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by cc2150dx »

@Rebel, sorry for polluting the thread but I must say you're doing a fantastic job with this. With all the time/resources it takes, keep up the good work :)
Play + Study + Think + Learn + Analyze = Chess!!