The Champions 2012 4CPU

Discussion of computer chess matches and engine tournaments.

Moderator: Ras

S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: Round 30 and Final Standings

Post by S.Taylor »

But i like to see it. (My personal taste!)
I like to see the difference in how it plays, even if it is less than 1000 games (or however many).

I DO see it in the results of games 10-30.

But from games 1-9, it looks surprisingly low. (If i watched the actual games, i might have seen it differently. But judging by results alone, not).

How many did it lose? 10.5!
If the first 10 games were all losses, and the eleventh game was a draw, and the remaining games were wins, would i like such an engine which can play so badly in 10 consecutive games, to use for analysis?
OK, but i wasn't claiming it was not the highest rated engine, if tested over 1000 games.

But am i obsessed? yes! That's why i am following chess computer news so many years now.
IGarcia
Posts: 543
Joined: Mon Jul 05, 2010 10:27 pm

Re: Round 30 and Final Standings

Post by IGarcia »

Graham Banks wrote:THE CHAMPIONS 2012 4CPU

Xeon X5430x2 Octal
ChessGUI
1024mb hash
3-4-5 piece tablebases
Ponder off
WorldClass2012-2.cgb book (limited to 8 move depth)
40 moves in 29 minutes repeating (adapted for the CCRL)
All engines 64-bit 4CPU where available
2 cycles 30 rounds


Round 30

Gull II b2 64-bit 4CPU v Hiarcs 14 4CPU (draw)
Naum 4.2 64-bit 4CPU v Houdini 3 64-bit 4CPU (0-1)
Chiron 1.5 64-bit 4CPU v Critter 1.6a 64-bit 4CPU (draw)
DeepSaros 3.1a 64-bit 4CPU v Rybka 4.1 64-bit 4CPU (0-1)
Bouquet 1.5 64-bit 4CPU v Stockfish 2.3.1 64-bit 4CPU (draw)
Strelka 5.5 64-bit v Equinox 1.60 64-bit 4CPU (draw)
Komodo 5 64-bit v Sting SF 2 64-bit 4CPU (draw)
IvanHoe 9.46h 64-bit 4CPU v Vitruvius 1.11C 64-bit 4CPU (draw)


Final Standings

20.0 - Critter 1.6a 64-bit 4CPU
20.0 - Rybka 4.1 64-bit 4CPU
19.5 - Houdini 3 64-bit 4CPU
18.0 - Sting SF 2 64-bit 4CPU
18.0 - Strelka 5.5 64-bit
17.0 - Vitruvius 1.11C 64-bit 4CPU
16.0 - Bouquet 1.5 64-bit 4CPU
15.0 - Equinox 1.60 64-bit 4CPU
14.5 - IvanHoe 9.46h 64-bit 4CPU
14.0 - Komodo 5 64-bit
13.5 - Hiarcs 14 4CPU
13.5 - Stockfish 2.3.1 64-bit 4CPU
11.5 - DeepSaros 3.1a 64-bit 4CPU
11.0 - Chiron 1.5 64-bit 4CPU
10.0 - Naum 4.2 64-bit 4CPU
8.5 - Gull II b2 64-bit 4CPU


Round 30 PGN - http://kirill-kryukov.com/chess/discuss ... p?id=28077

Complete tournament PGN (zipped) - http://kirill-kryukov.com/chess/discuss ... p?id=28076
Crosstable (made with ScidvsPC 4.8)

Thanks for the tournament!

Image

lech
Posts: 1169
Joined: Sun Feb 14, 2010 10:02 pm

Re: Round 30 and Final Standings

Post by lech »

S.Taylor wrote:If you count only the last 21 games, Houdini would have had a resounding victory.
If we cut the table down to the first 8 engines only (result >= 50% = 15 points), we get:


1. Sting +4
2.Critter +2
3-4. Rybka, Vitruvius +1
5. Houdini 0
6-7. Strelka. Bouquet -2
8. Equinox -4
That also means nothing.
It is a single 2-cycle tournament, and the results of many games can be accidental (not in line with the ratings).
Maybe, I can't be friendly, but let me be useful.
velmarin
Posts: 1600
Joined: Mon Feb 21, 2011 9:48 am

Re: Round 30 and Final Standings

Post by velmarin »

The tournament has been strong, very strong.
Graham has already accustomed us to this, putting in his time, electricity, hardware and knowledge.
Thanks.

Curiosities:
On the negative side, Gull did not get a single win.
Bouquet, the king of draws, with no fewer than 24,
against only two defeats, the same as Critter, the brilliant winner.

Strelka: six wins in the last three. And well ahead of Komodo, its single-core rival.

And sure there are many more.
Thanks, Graham.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Round 30 and Final Standings

Post by Laskos »

S.Taylor wrote:But i like to see it. (My personal taste!)
I like to see the difference in how it plays, even if it is less than 1000 games (or however many).

I DO see it in the results of games 10-30.

But from games 1-9, it looks surprisingly low. (If i watched the actual games, i might have seen it differently. But judging by results alone, not).

How many did it lose? 10.5!
If the first 10 games were all losses, and the eleventh game was a draw, and the remaining games were wins, would i like such an engine which can play so badly in 10 consecutive games, to use for analysis?
OK, but i wasn't claiming it was not the highest rated engine, if tested over 1000 games.

But am i obsessed? yes! That's why i am following chess computer news so many years now.
In the first 9 games Houdini got 3.5 points out of 9, and was expected to get 5.5. Big deal? Someone like you really needs a playground with engines separated by 500 Elo points, so that even a draw is rare there, and an unexpected loss is a freak event.

Kai
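
For anyone who wants to check what an "expected 5.5 out of 9" implies, here is a minimal sketch. It assumes the standard logistic Elo curve and treats the nine opponents as a single average opponent; the post does not say which model Kai used, and the function name is just illustrative.

```python
def expected_score(elo_diff):
    """Expected score per game for a player rated elo_diff points above
    the opponent, under the logistic Elo model (an assumption here)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

GAMES = 9
# Expected points over 9 games for a few illustrative rating advantages
for diff in (0, 50, 70, 80, 100):
    points = GAMES * expected_score(diff)
    print(f"{diff:4d} Elo ahead -> {points:.2f} expected points out of {GAMES}")
```

Under these assumptions, an expectation of 5.5 out of 9 corresponds to an average edge of a little under 80 Elo.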
lech
Posts: 1169
Joined: Sun Feb 14, 2010 10:02 pm

Re: Round 30 and Final Standings

Post by lech »

velmarin wrote:The tournament has been strong, very strong.
Graham has already accustomed us to this, putting in his time, electricity, hardware and knowledge.
Well said!
Thanks Graham. :D
Maybe, I can't be friendly, but let me be useful.
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: Round 30 and Final Standings

Post by S.Taylor »

Laskos wrote:
S.Taylor wrote:But i like to see it. (My personal taste!)
I like to see the difference in how it plays, even if it is less than 1000 games (or however many).

I DO see it in the results of games 10-30.

But from games 1-9, it looks surprisingly low. (If i watched the actual games, i might have seen it differently. But judging by results alone, not).

How many did it lose? 10.5!
If the first 10 games were all losses, and the eleventh game was a draw, and the remaining games were wins, would i like such an engine which can play so badly in 10 consecutive games, to use for analysis?
OK, but i wasn't claiming it was not the highest rated engine, if tested over 1000 games.

But am i obsessed? yes! That's why i am following chess computer news so many years now.
In the first 9 games Houdini got 3.5 points out of 9 games, and was expected to get 5.5. Big deal? Someone like you really needs the playground with engines separated by 500 Elo points, so even a draw is rare there, and an unexpected loss is a freak event.

Kai
Yes. That's me. So i am used to waiting for years, and only buy occasionally. But now I'm intending to buy another 2 or 3, as i have a new computer now, and i am much happier with the program playing strength in general, now. I can rely on them to see many more things than years ago.
Lavir
Posts: 263
Joined: Sun Oct 28, 2012 11:45 am

Re: Round 30 and Final Standings

Post by Lavir »

Laskos wrote: In the first 9 games Houdini got 3.5 points out of 9 games, and was expected to get 5.5. Big deal? Someone like you really needs the playground with engines separated by 500 Elo points, so even a draw is rare there, and an unexpected loss is a freak event.

Kai
Those results (which are all well within normal statistical variation) are even more likely to happen given how wide the book is, as I've explained in a previous post.

Since the type of opening plays a big role in chess, it just takes a somewhat subpar choice to get a bad result independently of everything else. The wider a book is, the more subpar lines it will obviously have (subpar in all respects: not necessarily a "bad" line, but just a little passive, or not suited to the style of the engine, etc.).

So there's nothing uncommon here. A difference of 70 Elo leaves room for a lot of losses and draws, and a wide book means even more randomness from the lines.
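
To put a rough number on "well within statistical variation", here is a small sketch that computes the exact distribution of points over 9 independent games. The inputs are assumptions, not measurements: a 50% draw rate (Houdini's overall figure in this tournament) and a 61% expected score (Kai's 5.5 out of 9), with every game treated as if played against the same average opponent. Extra randomness from the opening book is not modelled at all.

```python
def score_distribution(p_win, p_draw, games):
    """Exact distribution of total points over independent games.
    Returns a list where index k is the probability of k half-points."""
    p_loss = 1.0 - p_win - p_draw
    dist = [1.0]  # before any game: 0 half-points with probability 1
    for _ in range(games):
        new = [0.0] * (len(dist) + 2)
        for k, p in enumerate(dist):
            new[k] += p * p_loss       # loss adds 0 half-points
            new[k + 1] += p * p_draw   # draw adds 1 half-point
            new[k + 2] += p * p_win    # win adds 2 half-points
        dist = new
    return dist

# Assumed per-game probabilities (see the text above)
p_draw = 0.50
p_win = 0.61 - p_draw / 2          # expected score = p_win + p_draw / 2
dist = score_distribution(p_win, p_draw, games=9)

# 3.5 points = 7 half-points, so sum indices 0..7
prob_at_most_3_5 = sum(dist[:8])
print(f"P(score <= 3.5 out of 9) = {prob_at_most_3_5:.3f}")
```

Changing `p_draw` shows how much the draw rate matters: fewer draws means a wider spread of results for the same expected score.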
Ajedrecista
Posts: 2128
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Round 30 and final standings.

Post by Ajedrecista »

Hello Graham!
Thanks again for all your matches: they are very amazing.

Here is a rating list based on this match alone, computed with my own Fortran programme (which does not read PGN files) and with the EloSTAT algorithm in BayesElo:


Round Robin with 16 engines and     30 games per engine.
Total number of games:       240 games.
 
 3112.65 (engine 01).
 3112.65 (engine 02).
 3100.64 (engine 03).
 3066.01 (engine 04).
 3066.01 (engine 05).
 3043.77 (engine 06).
 3021.92 (engine 07).
 3000.27 (engine 08).
 2989.46 (engine 09).
 2978.62 (engine 10).
 2967.73 (engine 11).
 2967.73 (engine 12).
 2923.19 (engine 13).
 2911.65 (engine 14).
 2887.89 (engine 15).
 2849.81 (engine 16).
 
Mean of ratings:  3000.00 Elo.


Rank Name                            Elo     Diff     +     -      Games  Score   Oppo.   Draws     Win          W-L-D
   1 Critter 1.6a 64-bit 4CPU    3113.16     0.00  92.80  78.53       30  66.67%  2992.46  53.33%  40.00%        12-2-16
   2 Rybka 4.1 64-bit 4CPU       3113.16    -0.00 102.43  85.27       30  66.67%  2992.46  46.67%  43.33%        13-3-14
   3 Houdini 3 64-bit 4CPU       3101.09   -12.07  97.08  82.87       30  65.00%  2993.26  50.00%  40.00%        12-3-15
   4 Sting SF 2 64-bit 4CPU      3066.31   -34.78  82.88  75.55       30  60.00%  2995.58  60.00%  30.00%         9-3-18
   5 Strelka 5.5 64-bit          3066.31    -0.00  82.88  75.55       30  60.00%  2995.58  60.00%  30.00%         9-3-18
   6 Vitruvius 1.11C 64-bit 4CPU 3043.96   -22.34  89.59  83.72       30  56.67%  2997.07  53.33%  30.00%         9-5-16
   7 Bouquet 1.5 64-bit 4CPU     3022.02   -21.94  56.30  55.10       30  53.33%  2998.53  80.00%  13.33%         4-2-24
   8 Equinox 1.60 64-bit 4CPU    3000.27   -21.75  86.69  86.69       30  50.00%  2999.98  53.33%  23.33%         7-7-16
   9 IvanHoe 9.46h 64-bit 4CPU   2989.41   -10.86  68.48  69.40       30  48.33%  3000.71  70.00%  13.33%         4-5-21
  10 Komodo 5 64-bit             2978.52   -10.89  78.69  81.19       30  46.67%  3001.43  60.00%  16.67%         5-7-18
  11 Hiarcs 14 4CPU              2967.59   -10.93  81.28  85.35       30  45.00%  3002.16  56.67%  16.67%         5-8-17
  12 Stockfish 2.3.1 64-bit 4CPU 2967.59    -0.00  67.16  69.90       30  45.00%  3002.16  70.00%  10.00%         3-6-21
  13 DeepSaros 3.1a 64-bit 4CPU  2922.84   -44.75  78.05  87.38       30  38.33%  3005.14  56.67%  10.00%         3-10-17
  14 Chiron 1.5 64-bit 4CPU      2911.26   -11.58  86.82 100.54       30  36.67%  3005.92  46.67%  13.33%         4-12-14
  15 Naum 4.2 64-bit 4CPU        2887.38   -23.88  78.53  92.80       30  33.33%  3007.51  53.33%   6.67%         2-12-16
  16 Gull II b2 64-bit 4CPU      2849.14   -38.25  70.24  85.46       30  28.33%  3010.06  56.67%   0.00%         0-13-17
Differences are almost negligible!


Max.(EloSTAT) - min.(EloSTAT) ~ 3113.16 - 2849.14 = 264.02 Elo.
Max.(my programme) - min.(my programme) ~ 3112.65 - 2849.81 = 262.84 Elo.

264.02/262.84 ~ 1.0045; 262.84/264.02 ~ 0.9955
Differences are less than 0.5%! This is due to the high draw ratio: lower draw ratios would bring differences of almost 1%, which I still consider very low given my simple, minimal algorithm (if it can be called an algorithm).
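
For readers who want to reproduce ratings of this kind from the final scores alone, here is a minimal sketch. It is neither the Fortran programme nor EloSTAT itself, just one way to do it under stated assumptions: a logistic Elo curve, and a complete round robin in which every engine meets each of the other 15 equally often. In that case the average opponent rating is (16 x mean - own rating) / 15, so "rating = average opponent + D(score)" reduces to "rating = mean + 15/16 x D(score)". Subtracting the mean of D to centre the list on 3000 is a choice made for this sketch.

```python
import math

# Final scores out of 30, copied from the standings in this thread
scores = {
    "Critter 1.6a": 20.0, "Rybka 4.1": 20.0, "Houdini 3": 19.5,
    "Sting SF 2": 18.0, "Strelka 5.5": 18.0, "Vitruvius 1.11C": 17.0,
    "Bouquet 1.5": 16.0, "Equinox 1.60": 15.0, "IvanHoe 9.46h": 14.5,
    "Komodo 5": 14.0, "Hiarcs 14": 13.5, "Stockfish 2.3.1": 13.5,
    "DeepSaros 3.1a": 11.5, "Chiron 1.5": 11.0, "Naum 4.2": 10.0,
    "Gull II b2": 8.5,
}
GAMES = 30       # games per engine
MEAN = 3000.0    # the list is centred on this value
N = len(scores)

def logistic_diff(points, games):
    """Rating difference implied by a score, logistic Elo model (assumed)."""
    frac = points / games
    return 400.0 * math.log10(frac / (1.0 - frac))

diffs = {name: logistic_diff(p, GAMES) for name, p in scores.items()}
mean_diff = sum(diffs.values()) / N

# Closed form for a complete round robin: rating = MEAN + (N-1)/N * (D - mean D)
ratings = {name: MEAN + (N - 1) / N * (d - mean_diff) for name, d in diffs.items()}

for name, r in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{r:8.2f}  {name}")
```

The output should land close to the figures posted above, which suggests that for a complete round robin the final scores carry almost all of the information and the individual results add very little.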

I stay tuned for the next 8 CPU Swiss tournament!

Regards from Spain.

Ajedrecista.
Carlos Ylich
Posts: 175
Joined: Wed Apr 28, 2010 9:31 pm
Location: Brazil

Re: Round 29

Post by Carlos Ylich »

Graham, you are great! I am your fan.
Thanks for the great work.
Sincerely,
Carlos Ylich :D :!:
Remember Sabra and Chatila