Komodo-Dragon-2 vs Stockfish 14 at knight odds

Rebel
Posts: 7459
Joined: Thu Aug 18, 2011 12:04 pm
Full name: Ed Schröder

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

Code:

Knight odds        Pool    Pool    Pool
Engine             2700    2500    2300
Komodo Dragon 2.5  61.4    
Komodo Dragon 2    55.6    73.8    89.9
Dragon 2.5 scores almost 6 percentage points more than Dragon 2 against the 2700 pool, roughly 40 Elo. I will compile a stronger pool to get the desired ~50%.
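For readers who want to check the "~40 elo" figure, here is a minimal sketch, assuming the usual logistic Elo model. The function name elo_diff is illustrative and not taken from any tool used in this thread. It converts the two score percentages against the 2700 pool into Elo differences and takes the gap.

```python
import math


def elo_diff(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


# Scores against the 2700 pool, taken from the table above.
dragon_2 = elo_diff(0.556)
dragon_2_5 = elo_diff(0.614)
gap = dragon_2_5 - dragon_2

print(f"Dragon 2   : {dragon_2:+.1f} Elo vs pool")
print(f"Dragon 2.5 : {dragon_2_5:+.1f} Elo vs pool")
print(f"Gap        : {gap:.1f} Elo")
```

The gap comes out at roughly 40 Elo, in line with the estimate in the post.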

Code:

KNIGHT odds match Komodo Dragon 2.5 vs a pool of 2700 elo rated engines
Time Control : Time control : 40/40
Games        : 700

Results from file all.pgn:

No. Name               Win Draw Loss Unf.  Score Games       %
--------------------------------------------------------------
  1 Komodo-Dragon 2.5 +388  =84 -228   *0  430.0   700   61.4%
  2 ProDeo 2.2         +48   =9  -43   *0   52.5   100   52.5%
  3 Benjamin 1.0       +40  =12  -48   *0   46.0   100   46.0%
  4 k2 099             +33  =12  -55   *0   39.0   100   39.0%
  5 Velvet 1.2.0       +32  =12  -56   *0   38.0   100   38.0%
  6 Dumb 1.8           +29  =11  -60   *0   34.5   100   34.5%
  7 Zahak 5.0          +28   =8  -64   *0   32.0   100   32.0%
  8 Fruit 2.1          +18  =20  -62   *0   28.0   100   28.0%
90% of coding is debugging, the other 10% is writing bugs.

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

Elo pool : 2842

http://rebel13.nl/a/grl.htm

It's a bit of a gamble.
lkaufman
Posts: 6279
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by lkaufman »

Rebel wrote: Fri Sep 24, 2021 11:24 pm Elo pool : 2842

http://rebel13.nl/a/grl.htm

It's a bit of a gamble.
I was going to say I think you went too far, it would be tough for Dragon to approach 50%, but as I write this it is dead even!
Komodo rules!

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

lkaufman wrote: Fri Sep 24, 2021 11:32 pm
I was going to say I think you went too far, it would be tough for Dragon to approach 50%, but as I write this it is dead even!
Before I start a gauntlet match I always let it run for 10-15 minutes at 40/10 just to check that everything runs as it should. Then I look at the 40/10 results, which give a rough impression of whether the pool is balanced enough. In this case Dragon 2.5 scored 56% at 40/10, which gave me confidence, knowing that a time control of 40/40 would favor the lower rated engines. Meanwhile the score for Dragon 2.5 has shrunk to 46%.

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by lkaufman »

Rebel wrote: Sat Sep 25, 2021 1:26 am
Before I start a gauntlet match I always let it run for 10-15 minutes at 40/10 just to check that everything runs as it should. Then I look at the 40/10 results, which give a rough impression of whether the pool is balanced enough. In this case Dragon 2.5 scored 56% at 40/10, which gave me confidence, knowing that a time control of 40/40 would favor the lower rated engines. Meanwhile the score for Dragon 2.5 has shrunk to 46%.
Actually this is a good result: its performance rating vs. the 2700-2730 field was about 2795, and as of now it is running at about 2820 against the 2842 field. So overall it is performing a bit above the 2800 level giving knight odds. Probably this would be about 2700 at "true" knight odds, without the advantage of the relatively favorable first 100 positions from the set. I would estimate that the GM-level CCRL engines play about as well at your bullet time control as comparably rated humans (comparing CCRL Blitz ratings to human FIDE ratings) would at an average blitz time control like 3' + 2". So perhaps that means that Dragon 2.5 could give knight odds in blitz to human GMs rated around 2700. That's probably about right; Dragon 2 was a bit plus at about that time control against GM Alex Lenderman, who was about 2632 FIDE standard and about 2650 FIDE blitz at the time.
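The performance figures quoted here can be approximated as the pool average plus the Elo difference implied by the score. What follows is a rough sketch, assuming the logistic Elo model and pool averages of 2715 and 2842 as mentioned in this thread. The helper performance_rating is hypothetical, and a rating tool that fits each opponent separately will report slightly different numbers.

```python
import math


def performance_rating(pool_avg, wins, draws, losses):
    """Approximate performance rating: pool average plus logistic Elo offset."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if not 0.0 < score < 1.0:
        raise ValueError("performance rating is undefined for 0% or 100% scores")
    return pool_avg + 400.0 * math.log10(score / (1.0 - score))


# Finished 40/40 run against the weaker pool (assumed average 2715).
vs_2715 = performance_rating(2715, wins=388, draws=84, losses=228)

# Final 40/40 result against the 2842 pool, as posted later in the thread.
vs_2842 = performance_rating(2842, wins=296, draws=63, losses=341)

print(f"vs 2715 pool: {vs_2715:.0f}")
print(f"vs 2842 pool: {vs_2842:.0f}")
```

Both values land close to the "about 2795" and "about 2820" figures given in the post.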

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

KNIGHT odds match Komodo Dragon 2.5 vs a pool of 2842 elo rated engines

Dragon 2.5 - 2821

Code:

KNIGHT odds match Komodo Dragon 2.5 vs a pool of 2842 elo rated engines
Time Control : Time control : 40/40
Games        : 700

Results from file all.pgn:

No. Name               Win Draw Loss Unf.  Score Games       %
--------------------------------------------------------------
  1 Komodo-Dragon 2.5 +296  =63 -341   *0  327.5   700   46.8%
  2 Asymptote 0.8      +62  =12  -26   *0   68.0   100   68.0%
  3 Velvet 2.0.0       +56   =8  -36   *0   60.0   100   60.0%
  4 Drofa 3.0.0        +53  =10  -37   *0   58.0   100   58.0%
  5 Nemo 1.01          +44  =10  -46   *0   49.0   100   49.0%
  6 Cheese 2.2         +44   =9  -47   *0   48.5   100   48.5%
  7 Olithink 5.9.8     +46   =5  -49   *0   48.5   100   48.5%
  8 Nalwald 1.14       +36   =9  -55   *0   40.5   100   40.5%

Total Games:     700
White Wins:      296 (42.3%)
Black Wins:      341 (48.7%)
Draws:            63 (9.0%)
Unfinished:        0 (0.0%)

Estimated ratings for this elo 2842 pool

   # PLAYER               :  RATING  POINTS  PLAYED   (%)
   1 Asymptote 0.8        :  2953.6    68.0     100    68
   2 Velvet 2.0.0         :  2892.5    60.0     100    60
   3 Drofa 3.0.0          :  2878.0    58.0     100    58
   4 Komodo-Dragon 2.5    :  2821.5   327.5     700    47
   5 Nemo 1.01            :  2814.5    49.0     100    49
   6 Cheese 2.2           :  2811.0    48.5     100    49
   7 Olithink 5.9.8       :  2811.0    48.5     100    49
   8 Nalwald 1.14         :  2754.1    40.5     100    41

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

One thing I still want to know: what happens if we increase the time control from 40/40 to 40/120, i.e. CCRL 40/2?

Previous Dragon 2.5 result vs elo 2715 pool = 61.4% at 40/40

http://rebel13.nl/a/grl.htm

Same run but now at 40/120

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Rebel »

Rebel wrote: Sat Sep 25, 2021 8:40 am One thing I still want to know: what happens if we increase the time control from 40/40 to 40/120, i.e. CCRL 40/2?

Previous Dragon 2.5 result vs elo 2715 pool = 61.4% at 40/40

http://rebel13.nl/a/grl.htm

Same run but now at 40/120
Reply to self.

It seems time control matters a lot more than I initially thought.

Code:

KNIGHT odds match Komodo Dragon 2.5 vs a pool of 2715 elo rated engines
Time Control : Time control : 40/120
Games        : 700

Results from file all.pgn:

No. Name               Win Draw Loss Unf.  Score Games       %
--------------------------------------------------------------
  1 Komodo-Dragon 2.5 +324 =106 -270   *0  377.0   700   53.9%
  2 Benjamin 1.0       +48  =23  -29   *0   59.5   100   59.5%
  3 ProDeo 2.2         +45  =21  -34   *0   55.5   100   55.5%
  4 k2 099             +45  =11  -44   *0   50.5   100   50.5%
  5 Zahak 5.0          +42   =4  -54   *0   44.0   100   44.0%
  6 Dumb 1.8           +36  =15  -49   *0   43.5   100   43.5%
  7 Velvet 1.2.0       +29  =19  -52   *0   38.5   100   38.5%
  8 Fruit 2.1          +25  =13  -62   *0   31.5   100   31.5%

Total Games:     700
White Wins:      324 (46.3%)
Black Wins:      270 (38.6%)
Draws:           106 (15.1%)
Unfinished:        0 (0.0%)

Estimated ratings for this elo 2715 pool

   # PLAYER               :  RATING  POINTS  PLAYED   (%)
   1 Benjamin 1.0         :  2807.0    59.5     100    60
   2 ProDeo 2.2           :  2778.3    55.5     100    56
   3 k2 099               :  2743.1    50.5     100    51
   4 Komodo-Dragon 2.5    :  2739.6   377.0     700    54
   5 Zahak 5.0            :  2697.3    44.0     100    44
   6 Dumb 1.8             :  2693.8    43.5     100    44
   7 Velvet 1.2.0         :  2657.5    38.5     100    39
   8 Fruit 2.1            :  2603.4    31.5     100    32
40/40 : 61.4%
40/120 : 53.9%
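
Whether a drop from 61.4% to 53.9% is larger than the noise of two 700-game runs can be checked with a simple z-test on the score fractions. The sketch below assumes that the games are independent, which is only approximately true when the same opening positions are reused. The helper score_and_stderr is illustrative and not part of any tool used here.

```python
import math


def score_and_stderr(wins, draws, losses):
    """Mean score per game and its standard error for a W/D/L record."""
    games = wins + draws + losses
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of a result that takes the values 1, 0.5 and 0.
    variance = (
        wins * (1.0 - mean) ** 2
        + draws * (0.5 - mean) ** 2
        + losses * (0.0 - mean) ** 2
    ) / games
    return mean, math.sqrt(variance / games)


# Dragon 2.5 records from the two runs against the same pool.
fast_mean, fast_se = score_and_stderr(388, 84, 228)   # 40/40
slow_mean, slow_se = score_and_stderr(324, 106, 270)  # 40/120

z = (fast_mean - slow_mean) / math.hypot(fast_se, slow_se)

print(f"40/40  : {fast_mean:.3f} +/- {fast_se:.3f}")
print(f"40/120 : {slow_mean:.3f} +/- {slow_se:.3f}")
print(f"z-score of the difference: {z:.2f}")
```

A z-score well above 2 suggests the difference between the two time controls is unlikely to be pure sampling noise, under the independence assumption stated above.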


Larry, at which time control do you usually play GMs?

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by lkaufman »

Rebel wrote: Sat Sep 25, 2021 3:52 pm
Larry, at which time control do you usually play GMs?
When we play "par" GMs at knight odds (around the 2500 FIDE rating required for the title normally) we use the now standard FIDE Rapid time control of 15' + 10" increment. When we play top GMs we use that TC for smaller handicaps, but for knight odds it would be blitz, most likely 5' + 1". What I've discovered with Dragon giving knight odds is that Dragon doesn't benefit at all from thinking longer than what it could do at 2' + 1" blitz, whereas the opponent benefits greatly. With too much time to think Dragon just sees losses everywhere and doesn't know what to do. This seems to be true against engines and against humans. So knight odds performance will really start to drop if you go from blitz out to Rapid.
Uri Blass
Posts: 11144
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Komodo-Dragon-2 vs Stockfish 14 at knight odds

Post by Uri Blass »

lkaufman wrote: Sat Sep 25, 2021 6:05 pm
When we play "par" GMs at knight odds (around the 2500 FIDE rating required for the title normally) we use the now standard FIDE Rapid time control of 15' + 10" increment. When we play top GMs we use that TC for smaller handicaps, but for knight odds it would be blitz, most likely 5' + 1". What I've discovered with Dragon giving knight odds is that Dragon doesn't benefit at all from thinking longer than what it could do at 2' + 1" blitz, whereas the opponent benefits greatly. With too much time to think Dragon just sees losses everywhere and doesn't know what to do. This seems to be true against engines and against humans. So knight odds performance will really start to drop if you go from blitz out to Rapid.
Is there an interface I can use to run a match with a different time control for each of the 2 engines, and how exactly do I set it up?

I remember that in the past I used Arena for unequal time control matches, and also WinBoard, but I cannot find how to do it in the Arena version I have now. I also do not see a user-friendly WinBoard interface for this purpose, and I would rather not look up again how to do it there.