NEBB-Rankinglists: Komodo 4

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

pohl4711
Posts: 2816
Joined: Sat Sep 03, 2011 7:25 am
Location: Berlin, Germany
Full name: Stefan Pohl

NEBB-Rankinglists: Komodo 4

Post by pohl4711 »

The NEBB-Rankinglists (Naked Engine Bullet and Blitz), now with Komodo 4:

Intel Q9550 2.83GHz Quad (no SSE4.2 support), LittleBlitzerGUI, 256 MB hash, 1 core per engine, no ponder, no tablebases, no resign. 50 super-short test positions (1.a3 a6, 1.a3 b6, 1.a3 c6 ... 1.h3 g6, 1.h3 h6) = naked engines: no openings (neither a book nor long test positions such as Noomen), no endgame databases, only engine thinking from move 2 until mate or draw.
The two lists use exactly the same conditions except for the thinking time, which makes it possible to see which engines score better or worse with more or less thinking time.

Blitzlist (4’+2’’)

Rank Name                       Elo    +    - games score oppo. draws 
   1 Houdini 2.0c x64          3102   19   19   700   61%  3034   38% 
   2 Houdini 1.5a x64          3099   19   19   700   60%  3034   40% (best freeware)
   3 Komodo 4 x64              3072   19   19   700   54%  3048   42% (singlecore)
   4 Critter 1.2 64-bit        3054   17   17   800   51%  3048   45% 
   5 Ivanhoe B46fa x64         3040   17   17   800   49%  3050   51% 
   6 Komodo 3 x64              3030   19   19   700   47%  3048   46% (singlecore) 
   7 Rybka 4.1 x64             3026   17   17   800   46%  3051   47% 
   8 RobboLito 0.09 x64        3015   17   17   800   44%  3053   50% (singlecore) 
   9 Stockfish 2.1.1 JA 64bit  3000   18   18   800   42%  3055   41%

Bulletlist (1’+500 ms)

Rank Name                       Elo    +    - games score oppo. draws 
   1 Houdini 2.0c x64          3123   20   19   700   64%  3031   35% 
   2 Houdini 1.5a x64          3098   19   19   700   60%  3031   36% (best freeware) 
   3 Critter 1.2 64-bit        3067   18   17   800   53%  3047   43% 
   4 Komodo 4 x64              3055   19   19   700   51%  3052   38% (singlecore) 
   5 Ivanhoe B46fa x64         3044   17   17   800   49%  3049   47% 
   6 Komodo 3 x64              3023   19   19   700   46%  3052   37% (singlecore) 
   7 Rybka 4.1 x64             3021   18   18   800   45%  3052   39% 
   8 RobboLito 0.09 x64        3009   17   18   800   43%  3054   45% (singlecore) 
   9 Stockfish 2.1.1 JA 64bit  3000   18   18   800   42%  3055   38%
Greetings – Stefan
MM
Posts: 766
Joined: Sun Oct 16, 2011 11:25 am

Re: NEBB-Rankinglists: Komodo 4

Post by MM »

I like your conditions, especially the use of only one book move. I test engines in a very similar way. Keep going; I believe it is a good method.

Thanks

Regards
MM
Uri Blass
Posts: 10908
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: NEBB-Rankinglists: Komodo 4

Post by Uri Blass »

I do not see a difference between the two time controls.
I get the following table for the rating difference between 4+2 and 1+0.5 (blitz Elo minus bullet Elo):

1. Komodo 4 x64 +17
2. Komodo 3 x64 +7
3. RobboLito 0.09 x64 +6
4. Rybka 4.1 x64 +5
5. Houdini 1.5a x64 +1
6. Stockfish 2.1.1 JA 64bit +0
7. Ivanhoe B46fa x64 -4
8. Critter 1.2 64-bit -13
9. Houdini 2.0c x64 -21

When the possible error in each rating is 17-19 Elo, and the error of a difference between two such ratings is larger still, these differences are within the statistical error, and I cannot reject the conjecture that there is no difference.
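
To make that comparison concrete, here is a minimal Python sketch that recomputes the differences from the two lists and sets each one against a combined error margin. It assumes the two margins are independent and roughly normal, so that they add in quadrature; the thread does not say how the rating tool derives its margins, so treat this as a rough check only. The names `ratings` and `compare` are made up for this illustration.

```python
import math

# (blitz Elo, blitz margin, bullet Elo, bullet margin), copied from the two lists.
# The margin used is the larger of the "+" and "-" columns.
ratings = {
    "Houdini 2.0c x64": (3102, 19, 3123, 20),
    "Houdini 1.5a x64": (3099, 19, 3098, 19),
    "Komodo 4 x64": (3072, 19, 3055, 19),
    "Critter 1.2 64-bit": (3054, 17, 3067, 18),
    "Ivanhoe B46fa x64": (3040, 17, 3044, 17),
    "Komodo 3 x64": (3030, 19, 3023, 19),
    "Rybka 4.1 x64": (3026, 17, 3021, 18),
    "RobboLito 0.09 x64": (3015, 17, 3009, 18),
    "Stockfish 2.1.1 JA 64bit": (3000, 18, 3000, 18),
}


def compare(ratings):
    """Return (name, blitz - bullet, combined margin, exceeds margin?) rows."""
    rows = []
    for name, (blitz, m_blitz, bullet, m_bullet) in ratings.items():
        diff = blitz - bullet
        # Assumption: independent errors, so the margins combine in quadrature.
        combined = math.hypot(m_blitz, m_bullet)
        rows.append((name, diff, combined, abs(diff) > combined))
    # Largest gain from the longer time control first.
    return sorted(rows, key=lambda r: r[1], reverse=True)


rows = compare(ratings)
for name, diff, combined, exceeds in rows:
    flag = "yes" if exceeds else "no"
    print(f"{name:26s} {diff:+4d}  (+/-{combined:4.1f})  exceeds margin: {flag}")
```

Under these assumptions no engine's difference exceeds its combined margin, which is the point made above.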

I believe that there is a difference, but you clearly need many thousands of games for every program to prove it. 700-800 games per program are not enough. If you could use 500 short openings for every program instead of only 50, maybe we could learn something about which programs gain from a longer time control.
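
As a rough illustration of the "many thousands of games" estimate, the sketch below approximates the 95% Elo margin from the number of games and the draw ratio, then inverts the formula. It assumes a score close to 50%, independent games, a normal approximation and the logistic Elo curve; the rating tool behind the lists may compute its margins differently. The function names `elo_margin` and `games_needed` are hypothetical.

```python
import math

# Slope of the logistic Elo curve at a 50% score: d(Elo)/d(score) = 1600 / ln(10).
ELO_PER_SCORE = 1600 / math.log(10)


def elo_margin(games, draw_ratio, z=1.96):
    """Approximate Elo margin (z=1.96 for ~95%) for a score near 50%."""
    # Per-game score variance at a 50% score is (1 - draw_ratio) / 4.
    se_score = math.sqrt((1 - draw_ratio) / (4 * games))
    return z * ELO_PER_SCORE * se_score


def games_needed(margin, draw_ratio, z=1.96):
    """Games required to shrink the Elo margin to the given value."""
    return math.ceil((1 - draw_ratio) / 4 * (z * ELO_PER_SCORE / margin) ** 2)


# 800 games with 45% draws, similar to the blitz list.
print(round(elo_margin(800, 0.45), 1))
# Games needed for a margin of 5 Elo at the same draw ratio.
print(games_needed(5, 0.45))
```

With 800 games and 45% draws the approximation gives a margin close to the 17-18 Elo shown in the lists, and bringing the margin down to about 5 Elo takes on the order of ten thousand games per program.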
lkaufman
Posts: 6259
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: NEBB-Rankinglists: Komodo 4

Post by lkaufman »

The Komodos average +12, Stockfish and Rybka average +2.5, and the Ippo-related programs (I include Critter, as it uses almost all Ippo ideas) average -6. I think this supports, though not conclusively, the view that Ippo-related programs are better at bullet chess but scale worse than the rest of us.
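
For anyone who wants to check those averages, here is a small Python sketch using the differences from Uri's table. The Ippo-related grouping (RobboLito, Ivanhoe, both Houdinis and Critter) is an assumption made for this example; only Critter's inclusion is stated explicitly above, but this grouping does reproduce the quoted figure.

```python
from statistics import mean

# Blitz Elo minus bullet Elo, from the table earlier in the thread.
diffs = {
    "Komodo 4 x64": 17,
    "Komodo 3 x64": 7,
    "RobboLito 0.09 x64": 6,
    "Rybka 4.1 x64": 5,
    "Houdini 1.5a x64": 1,
    "Stockfish 2.1.1 JA 64bit": 0,
    "Ivanhoe B46fa x64": -4,
    "Critter 1.2 64-bit": -13,
    "Houdini 2.0c x64": -21,
}

# Assumed grouping; only Critter's membership is spelled out in the post.
groups = {
    "Komodo": ["Komodo 4 x64", "Komodo 3 x64"],
    "Stockfish/Rybka": ["Stockfish 2.1.1 JA 64bit", "Rybka 4.1 x64"],
    "Ippo-related": [
        "RobboLito 0.09 x64",
        "Houdini 1.5a x64",
        "Ivanhoe B46fa x64",
        "Critter 1.2 64-bit",
        "Houdini 2.0c x64",
    ],
}

# Average difference per group.
averages = {g: mean(diffs[name] for name in names) for g, names in groups.items()}
for group, avg in averages.items():
    print(f"{group:16s} {avg:+.1f}")
```

With this grouping the Ippo-related average comes out at -6.2, which rounds to the -6 quoted above.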
Uri Blass
Posts: 10908
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: NEBB-Rankinglists: Komodo 4

Post by Uri Blass »

I think that you cannot draw correct conclusions about slower time controls from a comparison between 1+0.5 and 4+2.

I suspect that Komodo 4 scales worse than Komodo 3, based on the CCRL 40/40 list, where Komodo 3 has the higher rating and Komodo 4 has already played some hundreds of games. Even if Komodo 4 turns out to be 10 Elo better than Komodo 3 at 40/40, that is less than we could expect.

Note that I am not sure about this; it is also possible that with enough games Komodo 4 will show an advantage of at least 20 Elo. But the picture does not look good for Komodo 4 at long time controls.

http://computerchess.org.uk/ccrl/4040.l ... +opponents