Relative strengths of engines in nTCEC Season 2

Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Relative strengths of engines in nTCEC Season 2

Post by Milos »

Sorry Adam, but all these "predictions" are pure BS and just show a total lack of understanding of Bayesian statistics.
So far the engines have played only a handful of games in TCEC (43 at best). With that few games you have error bars. Strictly speaking, you can model each engine's performance as a normal distribution with a relative mean (the Elo difference between engines) and a variance.
From the rating lists you have another (absolute) performance estimate, with an absolute mean (in Elo) and a variance. As the hardware (and software) used here is not the same as in the rating lists, you have to increase the initial variance (the one you get from the rating lists).
Then you can combine these 2 distributions as a multivariate normal distribution and calculate the new expected mean and variance.
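
One way to read the combination step above is as a precision-weighted merge of two independent normal estimates of the same Elo difference: the rating-list estimate (with its variance inflated for the different hardware) and the estimate from the TCEC games. The sketch below is only an illustration of that idea. The function name `combine_normals` and every number in it are hypothetical, and it treats a single Elo difference rather than the full multivariate case.

```python
import math


def combine_normals(mean_a, var_a, mean_b, var_b):
    """Merge two independent normal estimates of the same quantity.

    Each estimate is weighted by its precision (1 / variance), which is
    the standard result for multiplying two Gaussian densities.
    """
    precision = 1.0 / var_a + 1.0 / var_b
    mean = (mean_a / var_a + mean_b / var_b) / precision
    return mean, 1.0 / precision


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
list_mean = 20.0    # Elo difference taken from a rating list (assumed)
list_sd = 30.0      # its standard deviation (assumed)
hardware_sd = 40.0  # extra uncertainty for different hardware (assumed)
list_var = list_sd ** 2 + hardware_sd ** 2  # inflated rating-list variance

tcec_mean = 60.0    # Elo difference seen in the TCEC games (assumed)
tcec_sd = 100.0     # wide, because only a handful of games were played

post_mean, post_var = combine_normals(list_mean, list_var,
                                      tcec_mean, tcec_sd ** 2)
print(f"combined: {post_mean:.1f} Elo, sd {math.sqrt(post_var):.1f}")
```

With these made-up inputs the combined mean stays much closer to the rating-list value than to the TCEC value, because the small TCEC sample has the larger variance.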

Starting from the initial estimate and then combining it with performance ratings, as is done in FIDE ratings, is just plain wrong.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update (through game 71 of Stage 3, Season 2):

Current Versions (engines using fewer than 16 cores are labeled)

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Komodo 1121.05          : 3239.7       9.0      14   64.3%
   2 Stockfish 151013        : 3219.8       7.5      14   53.6%
   3 Houdini 9601            : 3212.9       8.5      14   60.7%
   4 Bouquet 1.8b            : 3187.4       8.0      14   57.1%
   5 Critter 1.6a            : 3159.4      20.0      37   54.1%
   6 Gull 2.3                : 3157.7       7.0      14   50.0%
   7 Rybka 4.1               : 3147.9      39.5      70   56.4%
   8 Vitruvius 1.19          : 3112.0       5.0      12   41.7%
   9 Hiarcs 14               : 3080.1      27.0      51   52.9%
  10 Naum 4.6                : 3074.8       7.0      14   50.0%
  11 Equinox 2b (8CPU)       : 3053.5      12.5      25   50.0%
  12 Shredder 12             : 3010.4      12.5      25   50.0%
  13 Hannibal 220813         : 2986.1       3.5       7   50.0%
  14 Junior 13.3             : 2982.6      17.0      40   42.5%
  15 Spike 1.4 (12CPU)       : 2974.5      12.0      24   50.0%
  16 Jonny 6                 : 2965.9      12.0      25   48.0%
  17 Spark 1                 : 2927.7       9.5      25   38.0%
  18 Onno 1.27 (8CPU)        : 2856.7       8.5      24   35.4%
  19 Toga II 140913          : 2849.9       3.5      18   19.4%
  20 Minkochess 1.3          : 2843.4       3.5       7   50.0%
  21 Tornado 5               : 2838.6       4.5      18   25.0%
  22 Exchess 7.15b           : 2827.3       6.5      24   27.1%
  23 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  24 The Baron 3.35a         : 2824.9       3.5       7   50.0%
  25 Gaviota 0.87a8          : 2807.7       3.5       7   50.0%
  26 Scorpio 2.76            : 2800.6       3.0       7   42.9%
  27 Crafty 23.6             : 2790.3       3.0       7   42.9%
  28 Quazar 0.4 (1CPU)       : 2772.9       2.0      12   16.7%
  29 Octochess 5178          : 2702.4       1.5       6   25.0%
  30 Arasan 16               : 2672.9       1.5       6   25.0%
  31 Redqueen 1.14           : 2617.7       1.0       6   16.7%
  32 Nebula 2                : 2601.4       1.0       6   16.7%
  33 Hamsters 0.71 (8CPU)    : 2593.1       3.5       7   50.0%
  34 Arminius 100813         : 2589.2       2.0       7   28.6%
  35 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  36 Delphil 3 (8CPU)        : 2504.1       2.0       6   33.3%
  37 Firefly 2.6             : 2170.3       0.0       6    0.0%
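
As a rough illustration of how much uncertainty a short run of games carries, the sketch below takes one row of the table above (Komodo 1121.05, 9.0 points from 14 games), recomputes the (%) column as points divided by games, and converts the score into an Elo difference against the average opposition using the usual logistic Elo formula. This is not how the ratings in the table were computed. The helper names are hypothetical, and the interval uses a simple binomial approximation that ignores draws, so it somewhat overstates the spread.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def score_interval(points, games, z=1.96):
    """Score fraction with a rough normal-approximation interval.

    Uses the binomial variance p * (1 - p) / n, which is an upper bound
    on the per-game variance when draws are possible.
    """
    p = points / games
    se = math.sqrt(p * (1.0 - p) / games)
    low = max(p - z * se, 1e-6)         # clamp so the Elo formula stays finite
    high = min(p + z * se, 1.0 - 1e-6)
    return p, low, high


# Row taken from the table: 9.0 points from 14 games.
p, low, high = score_interval(9.0, 14)
print(f"score: {100 * p:.1f}%")
print(f"Elo vs. opposition: {elo_from_score(p):+.0f} "
      f"(rough range {elo_from_score(low):+.0f} to {elo_from_score(high):+.0f})")
```

For a 14-game sample the rough range spans several hundred Elo and includes negative values, which is the kind of error bar the small game counts in these tables imply.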

All versions starting at Season 1, Stage 3

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
     Komodo 1092             : 3240.7      14.0      18   77.8%
   1 Komodo 1121.05          : 3239.7       9.0      14   64.3%
     Komodo 1063             : 3239.1       5.0       7   71.4%
     Stockfish 160913        : 3226.2      13.5      18   75.0%
     Stockfish 4             : 3220.8       5.0       7   71.4%
   2 Stockfish 151013        : 3219.8       7.5      14   53.6%
   3 Houdini 9601            : 3212.9       8.5      14   60.7%
     Houdini 3               : 3211.2      61.0     104   58.7%
     Bouquet 1.8             : 3189.0       5.5       7   78.6%
   4 Bouquet 1.8b            : 3187.4       8.0      14   57.1%
     Bouquet 1.8a            : 3186.5      11.5      18   63.9%
     Stockfish 250313        : 3177.7       8.0      12   66.7%
     Stockfish 250413        : 3176.5      23.0      48   47.9%
     Stockfish 120413        : 3176.4       9.5      18   52.8%
   5 Critter 1.6a            : 3159.4      20.0      37   54.1%
     Gull 2.2                : 3158.3      17.0      25   68.0%
   6 Gull 2.3                : 3157.7       7.0      14   50.0%
   7 Rybka 4.1               : 3147.9      39.5      70   56.4%
   8 Vitruvius 1.19          : 3112.0       5.0      12   41.7%
   9 Hiarcs 14               : 3080.1      27.0      51   52.9%
  10 Naum 4.6                : 3074.8       7.0      14   50.0%
     Komodo 4534 (1CPU)      : 3071.5      14.0      30   46.7%
     Naum 4.5                : 3068.3      10.0      19   52.6%
     Naum 4.2                : 3061.6       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3053.5      12.5      25   50.0%
  12 Shredder 12             : 3010.4      12.5      25   50.0%
  13 Hannibal 220813         : 2986.1       3.5       7   50.0%
  14 Junior 13.3             : 2982.6      17.0      40   42.5%
  15 Spike 1.4 (12CPU)       : 2974.5      12.0      24   50.0%
  16 Jonny 6                 : 2965.9      12.0      25   48.0%
  17 Spark 1                 : 2927.7       9.5      25   38.0%
  18 Onno 1.27 (8CPU)        : 2856.7       8.5      24   35.4%
     Toga II 280513          : 2855.9       3.5       7   50.0%
  19 Toga II 140913          : 2849.9       3.5      18   19.4%
  20 Minkochess 1.3          : 2843.4       3.5       7   50.0%
     Tornado 4.88            : 2839.1       3.5       7   50.0%
  21 Tornado 5               : 2838.6       4.5      18   25.0%
  22 Exchess 7.15b           : 2827.3       6.5      24   27.1%
  23 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  24 The Baron 3.35a         : 2824.9       3.5       7   50.0%
  25 Gaviota 0.87a8          : 2807.7       3.5       7   50.0%
  26 Scorpio 2.76            : 2800.6       3.0       7   42.9%
  27 Crafty 23.6             : 2790.3       3.0       7   42.9%
  28 Quazar 0.4 (1CPU)       : 2772.9       2.0      12   16.7%
  29 Octochess 5178          : 2702.4       1.5       6   25.0%
  30 Arasan 16               : 2672.9       1.5       6   25.0%
  31 Redqueen 1.14           : 2617.7       1.0       6   16.7%
  32 Nebula 2                : 2601.4       1.0       6   16.7%
  33 Hamsters 0.71 (8CPU)    : 2593.1       3.5       7   50.0%
  34 Arminius 100813         : 2589.2       2.0       7   28.6%
  35 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  36 Delphil 3 (8CPU)        : 2504.1       2.0       6   33.3%
  37 Firefly 2.6             : 2170.3       0.0       6    0.0%

I made one change in the CCRL information being used for these estimates. I believe that I have been making an unwarranted assumption about the strength increase from Komodo 4534 to Komodo 1063, and so I have corrected for that.
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update after the completion of Stage 3, Season 2:

Current Versions (engines using fewer than 16 cores are labeled)

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Komodo 1121.05          : 3243.4      11.5      18   63.9%
   2 Stockfish 151013        : 3227.0      10.0      18   55.6%
   3 Houdini 9601            : 3202.2      10.0      18   55.6%
   4 Bouquet 1.8b            : 3183.0      10.0      18   55.6%
   5 Gull 2.3                : 3162.6       9.5      18   52.8%
   6 Critter 1.6a            : 3151.1      21.5      41   52.4%
   7 Rybka 4.1               : 3147.2      41.0      73   56.2%
   8 Vitruvius 1.19          : 3111.3       5.0      12   41.7%
   9 Hiarcs 14               : 3080.3      28.5      55   51.8%
  10 Naum 4.6                : 3075.0       8.5      18   47.2%
  11 Equinox 2b (8CPU)       : 3053.7      12.5      25   50.0%
  12 Shredder 12             : 3010.5      12.5      25   50.0%
  13 Junior 13.3             : 2995.3      19.0      43   44.2%
  14 Hannibal 220813         : 2986.3       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2975.0      12.0      24   50.0%
  16 Jonny 6                 : 2966.5      12.0      25   48.0%
  17 Spark 1                 : 2928.4       9.5      25   38.0%
  18 Onno 1.27 (8CPU)        : 2857.1       8.5      24   35.4%
  19 Toga II 140913          : 2850.6       3.5      18   19.4%
  20 Minkochess 1.3          : 2843.5       3.5       7   50.0%
  21 Tornado 5               : 2839.4       4.5      18   25.0%
  22 Exchess 7.15b           : 2827.8       6.5      24   27.1%
  23 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  24 The Baron 3.35a         : 2825.3       3.5       7   50.0%
  25 Gaviota 0.87a8          : 2807.7       3.5       7   50.0%
  26 Scorpio 2.76            : 2800.6       3.0       7   42.9%
  27 Crafty 23.6             : 2790.4       3.0       7   42.9%
  28 Quazar 0.4 (1CPU)       : 2772.7       2.0      12   16.7%
  29 Octochess 5178          : 2702.3       1.5       6   25.0%
  30 Arasan 16               : 2672.9       1.5       6   25.0%
  31 Redqueen 1.14           : 2617.8       1.0       6   16.7%
  32 Nebula 2                : 2601.5       1.0       6   16.7%
  33 Hamsters 0.71 (8CPU)    : 2593.2       3.5       7   50.0%
  34 Arminius 100813         : 2589.8       2.0       7   28.6%
  35 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  36 Delphil 3 (8CPU)        : 2504.3       2.0       6   33.3%
  37 Firefly 2.6             : 2170.3       0.0       6    0.0%

All versions starting at Season 1, Stage 3

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
     Komodo 1092             : 3243.6      14.0      18   77.8%
   1 Komodo 1121.05          : 3243.4      11.5      18   63.9%
     Komodo 1063             : 3241.5       5.0       7   71.4%
     Stockfish 160913        : 3232.1      13.5      18   75.0%
   2 Stockfish 151013        : 3227.0      10.0      18   55.6%
     Stockfish 4             : 3222.2       5.0       7   71.4%
     Houdini 3               : 3204.3      61.0     104   58.7%
   3 Houdini 9601            : 3202.2      10.0      18   55.6%
     Bouquet 1.8             : 3185.8       5.5       7   78.6%
   4 Bouquet 1.8b            : 3183.0      10.0      18   55.6%
     Bouquet 1.8a            : 3182.8      11.5      18   63.9%
     Stockfish 250313        : 3174.9       8.0      12   66.7%
     Stockfish 120413        : 3173.3       9.5      18   52.8%
     Stockfish 250413        : 3173.1      23.0      48   47.9%
   5 Gull 2.3                : 3162.6       9.5      18   52.8%
     Gull 2.2                : 3162.2      17.0      25   68.0%
   6 Critter 1.6a            : 3151.1      21.5      41   52.4%
   7 Rybka 4.1               : 3147.2      41.0      73   56.2%
   8 Vitruvius 1.19          : 3111.3       5.0      12   41.7%
   9 Hiarcs 14               : 3080.3      28.5      55   51.8%
  10 Naum 4.6                : 3075.0       8.5      18   47.2%
     Komodo 4534 (1CPU)      : 3071.3      14.0      30   46.7%
     Naum 4.5                : 3068.5      10.0      19   52.6%
     Naum 4.2                : 3061.8       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3053.7      12.5      25   50.0%
  12 Shredder 12             : 3010.5      12.5      25   50.0%
  13 Junior 13.3             : 2995.3      19.0      43   44.2%
  14 Hannibal 220813         : 2986.3       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2975.0      12.0      24   50.0%
  16 Jonny 6                 : 2966.5      12.0      25   48.0%
  17 Spark 1                 : 2928.4       9.5      25   38.0%
  18 Onno 1.27 (8CPU)        : 2857.1       8.5      24   35.4%
     Toga II 280513          : 2856.6       3.5       7   50.0%
  19 Toga II 140913          : 2850.6       3.5      18   19.4%
  20 Minkochess 1.3          : 2843.5       3.5       7   50.0%
     Tornado 4.88            : 2839.8       3.5       7   50.0%
  21 Tornado 5               : 2839.4       4.5      18   25.0%
  22 Exchess 7.15b           : 2827.8       6.5      24   27.1%
  23 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  24 The Baron 3.35a         : 2825.3       3.5       7   50.0%
  25 Gaviota 0.87a8          : 2807.7       3.5       7   50.0%
  26 Scorpio 2.76            : 2800.6       3.0       7   42.9%
  27 Crafty 23.6             : 2790.4       3.0       7   42.9%
  28 Quazar 0.4 (1CPU)       : 2772.7       2.0      12   16.7%
  29 Octochess 5178          : 2702.3       1.5       6   25.0%
  30 Arasan 16               : 2672.9       1.5       6   25.0%
  31 Redqueen 1.14           : 2617.8       1.0       6   16.7%
  32 Nebula 2                : 2601.5       1.0       6   16.7%
  33 Hamsters 0.71 (8CPU)    : 2593.2       3.5       7   50.0%
  34 Arminius 100813         : 2589.8       2.0       7   28.6%
  35 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  36 Delphil 3 (8CPU)        : 2504.3       2.0       6   33.3%
  37 Firefly 2.6             : 2170.3       0.0       6    0.0%
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update through game 21 of Season 2, Stage 4:


Current Versions (engines using fewer than 16 cores are labeled)

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Komodo 1133             : 3246.5       4.5       7   64.3%
   2 Stockfish 021113        : 3235.2       4.5       7   64.3%
   3 Houdini 9601            : 3202.6      13.5      25   54.0%
   4 Bouquet 1.8b            : 3181.1      13.5      25   54.0%
   5 Critter 1.6a            : 3151.3      21.5      41   52.4%
   6 Gull R600               : 3149.8       2.5       7   35.7%
   7 Rybka 4.1               : 3147.5      41.0      73   56.2%
   8 Vitruvius 1.19          : 3111.5       5.0      12   41.7%
   9 Hiarcs 14               : 3080.6      28.5      55   51.8%
  10 Naum 4.6                : 3079.9      11.0      25   44.0%
  11 Equinox 2b (8CPU)       : 3053.6      12.5      25   50.0%
  12 Shredder 12             : 3010.7      12.5      25   50.0%
  13 Junior 13.3             : 2995.6      19.0      43   44.2%
  14 Hannibal 220813         : 2986.6       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2975.1      12.0      24   50.0%
  16 Jonny 6                 : 2966.9      12.0      25   48.0%
  17 Spark 1                 : 2928.6       9.5      25   38.0%
  18 Onno 1.27 (8CPU)        : 2857.0       8.5      24   35.4%
  19 Toga II 140913          : 2850.8       3.5      18   19.4%
  20 Minkochess 1.3          : 2843.6       3.5       7   50.0%
  21 Tornado 5               : 2839.7       4.5      18   25.0%
  22 Exchess 7.15b           : 2828.3       6.5      24   27.1%
  23 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  24 The Baron 3.35a         : 2825.6       3.5       7   50.0%
  25 Gaviota 0.87a8          : 2807.5       3.5       7   50.0%
  26 Scorpio 2.76            : 2800.6       3.0       7   42.9%
  27 Crafty 23.6             : 2790.4       3.0       7   42.9%
  28 Quazar 0.4 (1CPU)       : 2772.8       2.0      12   16.7%
  29 Octochess 5178          : 2702.2       1.5       6   25.0%
  30 Arasan 16               : 2672.8       1.5       6   25.0%
  31 Redqueen 1.14           : 2617.9       1.0       6   16.7%
  32 Nebula 2                : 2601.5       1.0       6   16.7%
  33 Hamsters 0.71 (8CPU)    : 2593.2       3.5       7   50.0%
  34 Arminius 100813         : 2589.9       2.0       7   28.6%
  35 Alfil 13.1 (8CPU)       : 2576.4       2.5       7   35.7%
  36 Delphil 3 (8CPU)        : 2504.5       2.0       6   33.3%
  37 Firefly 2.6             : 2170.3       0.0       6    0.0%

All versions starting at Season 1, Stage 3

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Komodo 1133             : 3246.5       4.5       7   64.3%
     Komodo 1121.05          : 3245.8      11.5      18   63.9%
     Komodo 1092             : 3245.6      14.0      18   77.8%
     Komodo 1063             : 3243.2       5.0       7   71.4%
     Stockfish 160913        : 3237.8      13.5      18   75.0%
   2 Stockfish 021113        : 3235.2       4.5       7   64.3%
     Stockfish 151013        : 3233.5      10.0      18   55.6%
     Stockfish 4             : 3225.1       5.0       7   71.4%
     Houdini 3               : 3204.8      61.0     104   58.7%
   3 Houdini 9601            : 3202.6      13.5      25   54.0%
     Bouquet 1.8             : 3184.3       5.5       7   78.6%
     Bouquet 1.8a            : 3181.2      11.5      18   63.9%
   4 Bouquet 1.8b            : 3181.1      13.5      25   54.0%
     Stockfish 250313        : 3175.7       8.0      12   66.7%
     Stockfish 120413        : 3174.2       9.5      18   52.8%
     Stockfish 250413        : 3174.2      23.0      48   47.9%
     Gull 2.2                : 3154.6      17.0      25   68.0%
     Gull 2.3                : 3153.0       9.5      18   52.8%
   5 Critter 1.6a            : 3151.3      21.5      41   52.4%
   6 Gull R600               : 3149.8       2.5       7   35.7%
   7 Rybka 4.1               : 3147.5      41.0      73   56.2%
   8 Vitruvius 1.19          : 3111.5       5.0      12   41.7%
   9 Hiarcs 14               : 3080.6      28.5      55   51.8%
  10 Naum 4.6                : 3079.9      11.0      25   44.0%
     Naum 4.5                : 3072.4      10.0      19   52.6%
     Komodo 4534 (1CPU)      : 3072.0      14.0      30   46.7%
     Naum 4.2                : 3065.2       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3053.6      12.5      25   50.0%
  12 Shredder 12             : 3010.7      12.5      25   50.0%
  13 Junior 13.3             : 2995.6      19.0      43   44.2%
  14 Hannibal 220813         : 2986.6       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2975.1      12.0      24   50.0%
  16 Jonny 6                 : 2966.9      12.0      25   48.0%
  17 Spark 1                 : 2928.6       9.5      25   38.0%
  18 Onno 1.27 (8CPU)        : 2857.0       8.5      24   35.4%
     Toga II 280513          : 2856.8       3.5       7   50.0%
  19 Toga II 140913          : 2850.8       3.5      18   19.4%
  20 Minkochess 1.3          : 2843.6       3.5       7   50.0%
     Tornado 4.88            : 2840.1       3.5       7   50.0%
  21 Tornado 5               : 2839.7       4.5      18   25.0%
  22 Exchess 7.15b           : 2828.3       6.5      24   27.1%
  23 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  24 The Baron 3.35a         : 2825.6       3.5       7   50.0%
  25 Gaviota 0.87a8          : 2807.5       3.5       7   50.0%
  26 Scorpio 2.76            : 2800.6       3.0       7   42.9%
  27 Crafty 23.6             : 2790.4       3.0       7   42.9%
  28 Quazar 0.4 (1CPU)       : 2772.8       2.0      12   16.7%
  29 Octochess 5178          : 2702.2       1.5       6   25.0%
  30 Arasan 16               : 2672.8       1.5       6   25.0%
  31 Redqueen 1.14           : 2617.9       1.0       6   16.7%
  32 Nebula 2                : 2601.5       1.0       6   16.7%
  33 Hamsters 0.71 (8CPU)    : 2593.2       3.5       7   50.0%
  34 Arminius 100813         : 2589.9       2.0       7   28.6%
  35 Alfil 13.1 (8CPU)       : 2576.4       2.5       7   35.7%
  36 Delphil 3 (8CPU)        : 2504.5       2.0       6   33.3%
  37 Firefly 2.6             : 2170.3       0.0       6    0.0%
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update through game 40 of Season 2, Stage 4:

Current Versions (engines using fewer than 16 cores are labeled)

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Stockfish 021113        : 3264.9      10.5      14   75.0%
   2 Komodo 1133             : 3256.3       8.5      13   65.4%
   3 Houdini 9601            : 3204.5      17.0      31   54.8%
   4 Bouquet 1.8b            : 3172.3      15.5      31   50.0%
   5 Critter 1.6a            : 3156.8      22.5      42   53.6%
   6 Rybka 4.1               : 3150.0      41.0      73   56.2%
   7 Gull R600               : 3141.0       4.0      13   30.8%
   8 Vitruvius 1.19          : 3112.3       5.0      12   41.7%
   9 Naum 4.6                : 3083.2      13.0      32   40.6%
  10 Hiarcs 14               : 3083.0      28.5      55   51.8%
  11 Equinox 2b (8CPU)       : 3055.6      12.5      25   50.0%
  12 Shredder 12             : 3013.3      12.5      25   50.0%
  13 Junior 13.3             : 2998.2      19.0      43   44.2%
  14 Hannibal 220813         : 2987.5       3.5       7   50.0%
  15 Spike 1.4               : 2983.9       7.5      18   41.7%
  16 Jonny 6                 : 2970.3      12.0      25   48.0%
  17 Spike 1.4 (12CPU)       : 2965.3       4.5       7   64.3%
  18 Spark 1                 : 2930.7       9.5      25   38.0%
  19 Onno 1.27               : 2891.8       5.5      18   30.6%
  20 Toga II 140913          : 2854.0       3.5      18   19.4%
  21 Minkochess 1.3          : 2842.7       3.5       7   50.0%
  22 Tornado 5               : 2842.2       4.5      18   25.0%
  23 Onno 1.27 (8CPU)        : 2836.4       3.5       7   50.0%
  24 Exchess 7.15b           : 2833.7       7.0      25   28.0%
  25 The Baron 3.35a         : 2827.0       3.5       7   50.0%
  26 Sjeng WC2008 (8CPU)     : 2825.7       3.5       7   50.0%
  27 Gaviota 0.87a8          : 2807.6       3.5       7   50.0%
  28 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  29 Crafty 23.6             : 2790.5       3.0       7   42.9%
  30 Quazar 0.4 (1CPU)       : 2773.1       2.0      12   16.7%
  31 Octochess 5178          : 2702.8       1.5       6   25.0%
  32 Arasan 16               : 2672.9       1.5       6   25.0%
  33 Redqueen 1.14           : 2617.7       1.0       6   16.7%
  34 Nebula 2                : 2601.6       1.0       6   16.7%
  35 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  36 Arminius 100813         : 2590.2       2.0       7   28.6%
  37 Alfil 13.1 (8CPU)       : 2576.6       2.5       7   35.7%
  38 Delphil 3 (8CPU)        : 2504.0       2.0       6   33.3%
  39 Firefly 2.6             : 2170.2       0.0       6    0.0%

All versions starting at Season 1, Stage 3

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Stockfish 021113        : 3264.9      10.5      14   75.0%
     Stockfish 160913        : 3259.0      13.5      18   75.0%
     Stockfish 151013        : 3257.8      10.0      18   55.6%
   2 Komodo 1133             : 3256.3       8.5      13   65.4%
     Komodo 1121.05          : 3254.1      11.5      18   63.9%
     Komodo 1092             : 3252.9      14.0      18   77.8%
     Komodo 1063             : 3249.7       5.0       7   71.4%
     Stockfish 4             : 3236.6       5.0       7   71.4%
     Houdini 3               : 3206.2      60.0     103   58.3%
   3 Houdini 9601            : 3204.5      17.0      31   54.8%
     Stockfish 250313        : 3178.9       8.0      12   66.7%
     Stockfish 250413        : 3178.0      23.0      48   47.9%
     Bouquet 1.8             : 3178.0       5.5       7   78.6%
     Stockfish 120413        : 3177.7       9.5      18   52.8%
     Bouquet 1.8a            : 3174.3      11.5      18   63.9%
   4 Bouquet 1.8b            : 3172.3      15.5      31   50.0%
   5 Critter 1.6a            : 3156.8      22.5      42   53.6%
     Gull 2.2                : 3150.3      17.0      25   68.0%
   6 Rybka 4.1               : 3150.0      41.0      73   56.2%
     Gull 2.3                : 3147.0       9.5      18   52.8%
   7 Gull R600               : 3141.0       4.0      13   30.8%
   8 Vitruvius 1.19          : 3112.3       5.0      12   41.7%
   9 Naum 4.6                : 3083.2      13.0      32   40.6%
  10 Hiarcs 14               : 3083.0      28.5      55   51.8%
     Naum 4.5                : 3076.8      10.0      18   55.6%
     Komodo 4534 (1CPU)      : 3074.6      14.0      30   46.7%
     Naum 4.2                : 3069.0       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3055.6      12.5      25   50.0%
  12 Shredder 12             : 3013.3      12.5      25   50.0%
  13 Junior 13.3             : 2998.2      19.0      43   44.2%
  14 Hannibal 220813         : 2987.5       3.5       7   50.0%
  15 Spike 1.4               : 2983.9       7.5      18   41.7%
  16 Jonny 6                 : 2970.3      12.0      25   48.0%
  17 Spike 1.4 (12CPU)       : 2965.3       4.5       7   64.3%
  18 Spark 1                 : 2930.7       9.5      25   38.0%
  19 Onno 1.27               : 2891.8       5.5      18   30.6%
     Toga II 280513          : 2859.7       3.5       7   50.0%
  20 Toga II 140913          : 2854.0       3.5      18   19.4%
  21 Minkochess 1.3          : 2842.7       3.5       7   50.0%
     Tornado 4.88            : 2842.4       3.5       7   50.0%
  22 Tornado 5               : 2842.2       4.5      18   25.0%
  23 Onno 1.27 (8CPU)        : 2836.4       3.5       7   50.0%
  24 Exchess 7.15b           : 2833.7       7.0      25   28.0%
  25 The Baron 3.35a         : 2827.0       3.5       7   50.0%
  26 Sjeng WC2008 (8CPU)     : 2825.7       3.5       7   50.0%
  27 Gaviota 0.87a8          : 2807.6       3.5       7   50.0%
  28 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  29 Crafty 23.6             : 2790.5       3.0       7   42.9%
  30 Quazar 0.4 (1CPU)       : 2773.1       2.0      12   16.7%
  31 Octochess 5178          : 2702.8       1.5       6   25.0%
  32 Arasan 16               : 2672.9       1.5       6   25.0%
  33 Redqueen 1.14           : 2617.7       1.0       6   16.7%
  34 Nebula 2                : 2601.6       1.0       6   16.7%
  35 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  36 Arminius 100813         : 2590.2       2.0       7   28.6%
  37 Alfil 13.1 (8CPU)       : 2576.6       2.5       7   35.7%
  38 Delphil 3 (8CPU)        : 2504.0       2.0       6   33.3%
  39 Firefly 2.6             : 2170.2       0.0       6    0.0%
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update through game 60 of Season 2, Stage 4:

Current Versions (engines using fewer than 16 cores are labeled)

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Stockfish 021113        : 3256.8      13.5      20   67.5%
   2 Komodo 1133             : 3252.8      12.5      20   62.5%
   3 Houdini 9601            : 3206.4      21.0      38   55.3%
   4 Bouquet 1.8b            : 3180.4      19.5      38   51.3%
   5 Critter 1.6a            : 3156.6      22.5      42   53.6%
   6 Rybka 4.1               : 3149.9      41.0      73   56.2%
   7 Gull R600               : 3132.8       6.5      20   32.5%
   8 Vitruvius 1.19          : 3112.3       5.0      12   41.7%
   9 Naum 4.6                : 3085.2      15.5      38   40.8%
  10 Hiarcs 14               : 3082.9      28.5      55   51.8%
  11 Equinox 2b (8CPU)       : 3055.7      12.5      25   50.0%
  12 Shredder 12             : 3013.3      12.5      25   50.0%
  13 Junior 13.3             : 2998.2      19.0      43   44.2%
  14 Hannibal 220813         : 2987.4       3.5       7   50.0%
  15 Spike 1.4               : 2983.7       7.5      18   41.7%
  16 Jonny 6                 : 2970.3      12.0      25   48.0%
  17 Spike 1.4 (12CPU)       : 2965.5       4.5       7   64.3%
  18 Spark 1                 : 2930.7       9.5      25   38.0%
  19 Onno 1.27               : 2891.9       5.5      18   30.6%
  20 Toga II 140913          : 2854.0       3.5      18   19.4%
  21 Minkochess 1.3          : 2842.8       3.5       7   50.0%
  22 Tornado 5               : 2842.3       4.5      18   25.0%
  23 Onno 1.27 (8CPU)        : 2836.2       3.5       7   50.0%
  24 Exchess 7.15b           : 2833.8       7.0      25   28.0%
  25 The Baron 3.35a         : 2827.3       3.5       7   50.0%
  26 Sjeng WC2008 (8CPU)     : 2825.7       3.5       7   50.0%
  27 Gaviota 0.87a8          : 2807.4       3.5       7   50.0%
  28 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  29 Crafty 23.6             : 2790.7       3.0       7   42.9%
  30 Quazar 0.4 (1CPU)       : 2773.1       2.0      12   16.7%
  31 Octochess 5178          : 2702.6       1.5       6   25.0%
  32 Arasan 16               : 2672.8       1.5       6   25.0%
  33 Redqueen 1.14           : 2617.8       1.0       6   16.7%
  34 Nebula 2                : 2601.6       1.0       6   16.7%
  35 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  36 Arminius 100813         : 2590.3       2.0       7   28.6%
  37 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  38 Delphil 3 (8CPU)        : 2504.2       2.0       6   33.3%
  39 Firefly 2.6             : 2170.2       0.0       6    0.0%

All versions starting at Season 1, Stage 3

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Stockfish 021113        : 3256.8      13.5      20   67.5%
     Stockfish 160913        : 3253.5      13.5      18   75.0%
   2 Komodo 1133             : 3252.8      12.5      20   62.5%
     Stockfish 151013        : 3251.4      10.0      18   55.6%
     Komodo 1121.05          : 3251.3      11.5      18   63.9%
     Komodo 1092             : 3250.6      14.0      18   77.8%
     Komodo 1063             : 3247.6       5.0       7   71.4%
     Stockfish 4             : 3234.0       5.0       7   71.4%
     Houdini 3               : 3207.2      60.0     103   58.3%
   3 Houdini 9601            : 3206.4      21.0      38   55.3%
     Bouquet 1.8             : 3184.0       5.5       7   78.6%
     Bouquet 1.8a            : 3181.0      11.5      18   63.9%
   4 Bouquet 1.8b            : 3180.4      19.5      38   51.3%
     Stockfish 250313        : 3178.8       8.0      12   66.7%
     Stockfish 250413        : 3177.9      23.0      48   47.9%
     Stockfish 120413        : 3177.6       9.5      18   52.8%
   5 Critter 1.6a            : 3156.6      22.5      42   53.6%
   6 Rybka 4.1               : 3149.9      41.0      73   56.2%
     Gull 2.2                : 3145.4      17.0      25   68.0%
     Gull 2.3                : 3140.9       9.5      18   52.8%
   7 Gull R600               : 3132.8       6.5      20   32.5%
   8 Vitruvius 1.19          : 3112.3       5.0      12   41.7%
   9 Naum 4.6                : 3085.2      15.5      38   40.8%
  10 Hiarcs 14               : 3082.9      28.5      55   51.8%
     Naum 4.5                : 3078.3      10.0      18   55.6%
     Komodo 4534 (1CPU)      : 3074.1      14.0      30   46.7%
     Naum 4.2                : 3070.3       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3055.7      12.5      25   50.0%
  12 Shredder 12             : 3013.3      12.5      25   50.0%
  13 Junior 13.3             : 2998.2      19.0      43   44.2%
  14 Hannibal 220813         : 2987.4       3.5       7   50.0%
  15 Spike 1.4               : 2983.7       7.5      18   41.7%
  16 Jonny 6                 : 2970.3      12.0      25   48.0%
  17 Spike 1.4 (12CPU)       : 2965.5       4.5       7   64.3%
  18 Spark 1                 : 2930.7       9.5      25   38.0%
  19 Onno 1.27               : 2891.9       5.5      18   30.6%
     Toga II 280513          : 2859.7       3.5       7   50.0%
  20 Toga II 140913          : 2854.0       3.5      18   19.4%
  21 Minkochess 1.3          : 2842.8       3.5       7   50.0%
     Tornado 4.88            : 2842.5       3.5       7   50.0%
  22 Tornado 5               : 2842.3       4.5      18   25.0%
  23 Onno 1.27 (8CPU)        : 2836.2       3.5       7   50.0%
  24 Exchess 7.15b           : 2833.8       7.0      25   28.0%
  25 The Baron 3.35a         : 2827.3       3.5       7   50.0%
  26 Sjeng WC2008 (8CPU)     : 2825.7       3.5       7   50.0%
  27 Gaviota 0.87a8          : 2807.4       3.5       7   50.0%
  28 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  29 Crafty 23.6             : 2790.7       3.0       7   42.9%
  30 Quazar 0.4 (1CPU)       : 2773.1       2.0      12   16.7%
  31 Octochess 5178          : 2702.6       1.5       6   25.0%
  32 Arasan 16               : 2672.8       1.5       6   25.0%
  33 Redqueen 1.14           : 2617.8       1.0       6   16.7%
  34 Nebula 2                : 2601.6       1.0       6   16.7%
  35 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  36 Arminius 100813         : 2590.3       2.0       7   28.6%
  37 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  38 Delphil 3 (8CPU)        : 2504.2       2.0       6   33.3%
  39 Firefly 2.6             : 2170.2       0.0       6    0.0%
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update at end of Stage 4 of Season 2:

Current Versions (engines using less than 16 cores are labeled)

Code: Select all

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Stockfish 021113        : 3265.7      20.5      30   68.3%
   2 Komodo 1133             : 3246.3      18.0      30   60.0%
   3 Houdini 9601            : 3210.6      27.0      48   56.2%
   4 Bouquet 1.8b            : 3171.7      23.5      48   49.0%
   5 Critter 1.6a            : 3156.0      22.5      42   53.6%
   6 Rybka 4.1               : 3149.9      41.0      73   56.2%
   7 Gull R600               : 3130.8      10.5      30   35.0%
   8 Vitruvius 1.19          : 3112.5       5.0      12   41.7%
   9 Naum 4.6                : 3085.5      19.0      48   39.6%
  10 Hiarcs 14               : 3082.6      28.5      55   51.8%
  11 Equinox 2b (8CPU)       : 3054.8      12.5      25   50.0%
  12 Shredder 12             : 3012.8      12.5      25   50.0%
  13 Junior 13.3             : 2997.6      19.0      43   44.2%
  14 Hannibal 220813         : 2987.3       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2972.0      12.0      25   48.0%
  16 Jonny 6                 : 2969.2      12.0      25   48.0%
  17 Spark 1                 : 2930.3       9.5      25   38.0%
  18 Onno 1.27               : 2890.3       5.5      18   30.6%
  19 Toga II 140913          : 2853.2       3.5      18   19.4%
  20 Minkochess 1.3          : 2842.8       3.5       7   50.0%
  21 Tornado 5               : 2841.7       4.5      18   25.0%
  22 Onno 1.27 (8CPU)        : 2836.4       3.5       7   50.0%
  23 Exchess 7.15b           : 2833.0       7.0      25   28.0%
  24 The Baron 3.35a         : 2826.8       3.5       7   50.0%
  25 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  26 Gaviota 0.87a8          : 2807.4       3.5       7   50.0%
  27 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  28 Crafty 23.6             : 2790.5       3.0       7   42.9%
  29 Quazar 0.4 (1CPU)       : 2773.1       2.0      12   16.7%
  30 Octochess 5178          : 2702.7       1.5       6   25.0%
  31 Arasan 16               : 2672.8       1.5       6   25.0%
  32 Redqueen 1.14           : 2617.8       1.0       6   16.7%
  33 Nebula 2                : 2601.6       1.0       6   16.7%
  34 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  35 Arminius 100813         : 2590.1       2.0       7   28.6%
  36 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  37 Delphil 3 (8CPU)        : 2504.0       2.0       6   33.3%
  38 Firefly 2.6             : 2170.2       0.0       6    0.0%

All versions starting at Season 1, Stage 3

Code: Select all

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Stockfish 021113        : 3265.7      20.5      30   68.3%
     Stockfish 160913        : 3259.6      13.5      18   75.0%
     Stockfish 151013        : 3258.4      10.0      18   55.6%
     Komodo 1092             : 3246.4      14.0      18   77.8%
   2 Komodo 1133             : 3246.3      18.0      30   60.0%
     Komodo 1121.05          : 3246.3      11.5      18   63.9%
     Komodo 1063             : 3244.0       5.0       7   71.4%
     Stockfish 4             : 3237.7       5.0       7   71.4%
   3 Houdini 9601            : 3210.6      27.0      48   56.2%
     Houdini 3               : 3210.0      60.0     103   58.3%
     Stockfish 250313        : 3180.6       8.0      12   66.7%
     Stockfish 250413        : 3180.3      23.0      48   47.9%
     Stockfish 120413        : 3179.6       9.5      18   52.8%
     Bouquet 1.8             : 3177.4       5.5       7   78.6%
     Bouquet 1.8a            : 3173.6      11.5      18   63.9%
   4 Bouquet 1.8b            : 3171.7      23.5      48   49.0%
   5 Critter 1.6a            : 3156.0      22.5      42   53.6%
   6 Rybka 4.1               : 3149.9      41.0      73   56.2%
     Gull 2.2                : 3144.0      17.0      25   68.0%
     Gull 2.3                : 3139.2       9.5      18   52.8%
   7 Gull R600               : 3130.8      10.5      30   35.0%
   8 Vitruvius 1.19          : 3112.5       5.0      12   41.7%
   9 Naum 4.6                : 3085.5      19.0      48   39.6%
  10 Hiarcs 14               : 3082.6      28.5      55   51.8%
     Naum 4.5                : 3078.5      10.0      18   55.6%
     Komodo 4534 (1CPU)      : 3073.4      14.0      30   46.7%
     Naum 4.2                : 3070.4       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3054.8      12.5      25   50.0%
  12 Shredder 12             : 3012.8      12.5      25   50.0%
  13 Junior 13.3             : 2997.6      19.0      43   44.2%
  14 Hannibal 220813         : 2987.3       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2972.0      12.0      25   48.0%
  16 Jonny 6                 : 2969.2      12.0      25   48.0%
  17 Spark 1                 : 2930.3       9.5      25   38.0%
  18 Onno 1.27               : 2890.3       5.5      18   30.6%
     Toga II 280513          : 2858.9       3.5       7   50.0%
  19 Toga II 140913          : 2853.2       3.5      18   19.4%
  20 Minkochess 1.3          : 2842.8       3.5       7   50.0%
     Tornado 4.88            : 2841.9       3.5       7   50.0%
  21 Tornado 5               : 2841.7       4.5      18   25.0%
  22 Onno 1.27 (8CPU)        : 2836.4       3.5       7   50.0%
  23 Exchess 7.15b           : 2833.0       7.0      25   28.0%
  24 The Baron 3.35a         : 2826.8       3.5       7   50.0%
  25 Sjeng WC2008 (8CPU)     : 2826.0       3.5       7   50.0%
  26 Gaviota 0.87a8          : 2807.4       3.5       7   50.0%
  27 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  28 Crafty 23.6             : 2790.5       3.0       7   42.9%
  29 Quazar 0.4 (1CPU)       : 2773.1       2.0      12   16.7%
  30 Octochess 5178          : 2702.7       1.5       6   25.0%
  31 Arasan 16               : 2672.8       1.5       6   25.0%
  32 Redqueen 1.14           : 2617.8       1.0       6   16.7%
  33 Nebula 2                : 2601.6       1.0       6   16.7%
  34 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  35 Arminius 100813         : 2590.1       2.0       7   28.6%
  36 Alfil 13.1 (8CPU)       : 2576.5       2.5       7   35.7%
  37 Delphil 3 (8CPU)        : 2504.0       2.0       6   33.3%
  38 Firefly 2.6             : 2170.2       0.0       6    0.0%
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Relative strengths of engines in nTCEC Season 2

Post by Adam Hair »

Update after game 27 of Superfinal:

Current Versions (engines using less than 16 cores are labeled)

Code: Select all

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Komodo 1142             : 3265.0      15.5      27   57.4%
   2 Stockfish 191113        : 3245.2      11.5      27   42.6%
   3 Houdini 9601            : 3210.0      27.0      48   56.2%
   4 Bouquet 1.8b            : 3171.2      23.5      48   49.0%
   5 Critter 1.6a            : 3155.9      22.5      42   53.6%
   6 Rybka 4.1               : 3149.8      41.0      73   56.2%
   7 Gull R600               : 3130.5      10.5      30   35.0%
   8 Vitruvius 1.19          : 3112.5       5.0      12   41.7%
   9 Naum 4.6                : 3085.1      19.0      48   39.6%
  10 Hiarcs 14               : 3082.4      28.5      55   51.8%
  11 Equinox 2b (8CPU)       : 3054.8      12.5      25   50.0%
  12 Shredder 12             : 3012.6      12.5      25   50.0%
  13 Junior 13.3             : 2997.3      19.0      43   44.2%
  14 Hannibal 220813         : 2987.4       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2972.1      12.0      25   48.0%
  16 Jonny 6                 : 2969.2      12.0      25   48.0%
  17 Spark 1                 : 2930.2       9.5      25   38.0%
  18 Onno 1.27               : 2890.3       5.5      18   30.6%
  19 Toga II 140913          : 2853.2       3.5      18   19.4%
  20 Minkochess 1.3          : 2842.8       3.5       7   50.0%
  21 Tornado 5               : 2841.7       4.5      18   25.0%
  22 Onno 1.27 (8CPU)        : 2836.4       3.5       7   50.0%
  23 Exchess 7.15b           : 2832.9       7.0      25   28.0%
  24 The Baron 3.35a         : 2826.9       3.5       7   50.0%
  25 Sjeng WC2008 (8CPU)     : 2825.9       3.5       7   50.0%
  26 Gaviota 0.87a8          : 2807.3       3.5       7   50.0%
  27 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  28 Crafty 23.6             : 2790.6       3.0       7   42.9%
  29 Quazar 0.4 (1CPU)       : 2773.2       2.0      12   16.7%
  30 Octochess 5178          : 2702.5       1.5       6   25.0%
  31 Arasan 16               : 2672.8       1.5       6   25.0%
  32 Redqueen 1.14           : 2617.9       1.0       6   16.7%
  33 Nebula 2                : 2601.6       1.0       6   16.7%
  34 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  35 Arminius 100813         : 2590.1       2.0       7   28.6%
  36 Alfil 13.1 (8CPU)       : 2576.4       2.5       7   35.7%
  37 Delphil 3 (8CPU)        : 2504.1       2.0       6   33.3%
  38 Firefly 2.6             : 2170.2       0.0       6    0.0%
All versions starting at Season 1, Stage 3

Code: Select all

   # PLAYER                  : RATING    POINTS  PLAYED    (%)
   1 Komodo 1142             : 3265.0      15.5      27   57.4%
     Komodo 1133             : 3259.1      18.0      30   60.0%
     Komodo 1121.05          : 3256.1      11.5      18   63.9%
     Komodo 1092             : 3254.4      14.0      18   77.8%
     Stockfish 021113        : 3251.1      20.5      30   68.3%
     Komodo 1063             : 3251.0       5.0       7   71.4%
     Stockfish 160913        : 3249.4      13.5      18   75.0%
     Stockfish 151013        : 3246.6      10.0      18   55.6%
   2 Stockfish 191113        : 3245.2      11.5      27   42.6%
     Stockfish 4             : 3232.3       5.0       7   71.4%
   3 Houdini 9601            : 3210.0      27.0      48   56.2%
     Houdini 3               : 3209.3      60.0     103   58.3%
     Stockfish 250313        : 3179.5       8.0      12   66.7%
     Stockfish 250413        : 3178.7      23.0      48   47.9%
     Stockfish 120413        : 3178.3       9.5      18   52.8%
     Bouquet 1.8             : 3176.9       5.5       7   78.6%
     Bouquet 1.8a            : 3173.1      11.5      18   63.9%
   4 Bouquet 1.8b            : 3171.2      23.5      48   49.0%
   5 Critter 1.6a            : 3155.9      22.5      42   53.6%
   6 Rybka 4.1               : 3149.8      41.0      73   56.2%
     Gull 2.2                : 3143.7      17.0      25   68.0%
     Gull 2.3                : 3138.9       9.5      18   52.8%
   7 Gull R600               : 3130.5      10.5      30   35.0%
   8 Vitruvius 1.19          : 3112.5       5.0      12   41.7%
   9 Naum 4.6                : 3085.1      19.0      48   39.6%
  10 Hiarcs 14               : 3082.4      28.5      55   51.8%
     Naum 4.5                : 3078.2      10.0      18   55.6%
     Komodo 4534 (1CPU)      : 3075.4      14.0      30   46.7%
     Naum 4.2                : 3070.1       4.5       7   64.3%
  11 Equinox 2b (8CPU)       : 3054.8      12.5      25   50.0%
  12 Shredder 12             : 3012.6      12.5      25   50.0%
  13 Junior 13.3             : 2997.3      19.0      43   44.2%
  14 Hannibal 220813         : 2987.4       3.5       7   50.0%
  15 Spike 1.4 (12CPU)       : 2972.1      12.0      25   48.0%
  16 Jonny 6                 : 2969.2      12.0      25   48.0%
  17 Spark 1                 : 2930.2       9.5      25   38.0%
  18 Onno 1.27               : 2890.3       5.5      18   30.6%
     Toga II 280513          : 2858.9       3.5       7   50.0%
  19 Toga II 140913          : 2853.2       3.5      18   19.4%
  20 Minkochess 1.3          : 2842.8       3.5       7   50.0%
     Tornado 4.88            : 2841.9       3.5       7   50.0%
  21 Tornado 5               : 2841.7       4.5      18   25.0%
  22 Onno 1.27 (8CPU)        : 2836.4       3.5       7   50.0%
  23 Exchess 7.15b           : 2832.9       7.0      25   28.0%
  24 The Baron 3.35a         : 2826.9       3.5       7   50.0%
  25 Sjeng WC2008 (8CPU)     : 2825.9       3.5       7   50.0%
  26 Gaviota 0.87a8          : 2807.3       3.5       7   50.0%
  27 Scorpio 2.76            : 2800.7       3.0       7   42.9%
  28 Crafty 23.6             : 2790.6       3.0       7   42.9%
  29 Quazar 0.4 (1CPU)       : 2773.2       2.0      12   16.7%
  30 Octochess 5178          : 2702.5       1.5       6   25.0%
  31 Arasan 16               : 2672.8       1.5       6   25.0%
  32 Redqueen 1.14           : 2617.9       1.0       6   16.7%
  33 Nebula 2                : 2601.6       1.0       6   16.7%
  34 Hamsters 0.71 (8CPU)    : 2593.3       3.5       7   50.0%
  35 Arminius 100813         : 2590.1       2.0       7   28.6%
  36 Alfil 13.1 (8CPU)       : 2576.4       2.5       7   35.7%
  37 Delphil 3 (8CPU)        : 2504.1       2.0       6   33.3%
  38 Firefly 2.6             : 2170.2       0.0       6    0.0%
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Relative strengths of engines in nTCEC Season 2

Post by michiguel »

Adam Hair wrote:Update after game 27 of Superfinal:

At this point, your approach (more influence from previous CCRL data) and the control approach below (less influence) seem to be converging on similar relative relationships among the engines (at least for the engines with the most games). That is expected, but nice to see, I think.

Miguel

Code: Select all

   # PLAYER              : RATING    POINTS  PLAYED    (%)
   1 Komodo 1142         :   3211      16.0      28   57.1%
   2 Stockfish 191113    :   3184      12.0      28   42.9%
   3 Houdini 9601        :   3165      27.0      48   56.2%
   4 Bouquet 1.8b        :   3122      23.5      48   49.0%
   5 Rybka 4.1           :   3109      41.0      73   56.2%
   6 Critter 1.6a        :   3103      22.5      42   53.6%
   7 Gull R600           :   3082      10.5      30   35.0%
   8 Hiarcs 14           :   3041      28.5      55   51.8%
   9 Naum 4.6            :   3039      19.0      48   39.6%
  10 Equinox 2b          :   3027      12.5      25   50.0%
  11 Vitruvius 1.19      :   3001       5.0      12   41.7%
  12 Shredder 12         :   2975      12.5      25   50.0%
  13 Hannibal 220813     :   2965       3.5       7   50.0%
  14 Junior 13.3         :   2959      19.0      43   44.2%
  15 Spike 1.4           :   2939      12.0      25   48.0%
  16 Jonny 6             :   2936      12.0      25   48.0%
  17 Spark 1             :   2892       9.5      25   38.0%
  18 Minkochess 1.3      :   2846       3.5       7   50.0%
  19 Onno 1.27           :   2830       9.0      25   36.0%
  20 Toga II 140913      :   2819       3.5      18   19.4%
  21 Tornado 5           :   2813       4.5      18   25.0%
  22 Sjeng WC2008        :   2810       3.5       7   50.0%
  23 Exchess 7.15b       :   2808       7.0      25   28.0%
  24 Quazar 0.4          :   2803       2.0      12   16.7%
  25 Gaviota 0.87a8      :   2786       3.5       7   50.0%
  26 Scorpio 2.76        :   2782       3.0       7   42.9%
  27 The Baron 3.35a     :   2758       3.5       7   50.0%
  28 Crafty 23.6         :   2753       3.0       7   42.9%
  29 Octochess 5178      :   2683       1.5       6   25.0%
  30 Arasan 16           :   2618       1.5       6   25.0%
  31 Redqueen 1.14       :   2614       1.0       6   16.7%
  32 Hamsters 0.71       :   2594       3.5       7   50.0%
  33 Nebula 2            :   2590       1.0       6   16.7%
  34 Alfil 13.1          :   2580       2.5       7   35.7%
  35 Arminius 100813     :   2569       2.0       7   28.6%
  36 Delphil 3           :   2439       2.0       6   33.3%
  37 Firefly 2.6         :   2170       0.0       6    0.0%
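
One rough way to look at the convergence mentioned above is to compare rating gaps between neighbouring engines instead of absolute ratings, since the two lists sit on different scales. The sketch below is only an illustration: it uses the top four entries copied from the two "Current Versions" tables (Adam's list after game 27, the control list with 28 Superfinal games counted), and the names `adam`, `miguel` and `gaps` are made up for this example.

```python
# Ratings copied from the two tables posted after game 27 of the Superfinal.
adam = {
    "Komodo 1142": 3265.0,
    "Stockfish 191113": 3245.2,
    "Houdini 9601": 3210.0,
    "Bouquet 1.8b": 3171.2,
}
miguel = {
    "Komodo 1142": 3211,
    "Stockfish 191113": 3184,
    "Houdini 9601": 3165,
    "Bouquet 1.8b": 3122,
}

def gaps(ratings, order):
    """Rating difference between each engine and the next one in `order`."""
    return [round(ratings[a] - ratings[b], 1) for a, b in zip(order, order[1:])]

order = list(adam)                 # ranking order taken from Adam's list
adam_gaps = gaps(adam, order)      # Komodo-SF, SF-Houdini, Houdini-Bouquet
miguel_gaps = gaps(miguel, order)

for name, a, m in zip(order, adam_gaps, miguel_gaps):
    print(f"{name:18s} lead over next engine: {a:5.1f} (Adam)  {m:5.1f} (control)")
```

Both lists put these four engines in the same order; the size of the individual gaps still differs between them, so this only supports "similar relative relationships" in the ranking sense.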
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Relative strengths of engines in nTCEC Season 2

Post by Milos »

Adam Hair wrote:Update after game 27 of Superfinal:

You didn't post results after game 17 of the Superfinal, but they would be pretty close to those at the end of Stage 4, so +20 Elo for Stockfish. Then, after 10 more games and 4 losses for SF, the difference is now +20 for Komodo. So in 10 games SF lost 40 Elo, i.e. 10 games in your "simulations" are worth 40 Elo.
What a joke, man. This is no better than those comedians on TCEC judging engine strength based on the 3 previous games.
Your methodology is just ridiculous, and whatever calculation formula from Miguel you are using, it is so wrong that it can't be more wrong.
But I don't blame you, this is the consequence of a chemistry guy suddenly becoming an expert in statistics... :lol:
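
The arithmetic behind the "40 Elo in 10 games" figure can be reproduced from the posted tables. The sketch below is a minimal illustration, not the method used to produce the lists: it takes the Stockfish and Komodo ratings from the end of Stage 4 and from after game 27, and, as an assumption, uses the standard logistic Elo formula to convert a score fraction into a rating difference. The thread does not say which model the rating tool actually uses, and the function name `elo_diff_from_score` is made up here.

```python
import math

# Gap at the end of Stage 4: Stockfish 021113 minus Komodo 1133.
stage4_gap = 3265.7 - 3246.3
# Gap after game 27 of the Superfinal: Stockfish 191113 minus Komodo 1142.
game27_gap = 3245.2 - 3265.0
# Total swing against Stockfish between the two snapshots.
swing = round(stage4_gap - game27_gap, 1)

def elo_diff_from_score(score):
    """Rating difference implied by a score fraction (standard logistic Elo model)."""
    return -400 * math.log10(1 / score - 1)

# Komodo's raw Superfinal score so far: 15.5 points out of 27 games.
komodo_perf = elo_diff_from_score(15.5 / 27)

print(f"swing between snapshots: {swing} Elo")
print(f"gap implied by 15.5/27 alone: {komodo_perf:.1f} Elo")
```

Note that the two snapshots list different versions of both engines (Stockfish 021113 and Komodo 1133 at the end of Stage 4, Stockfish 191113 and Komodo 1142 in the Superfinal), so the swing compares two different pairs of entries.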