The Champions 2012 4CPU

Carotino
Posts: 222
Joined: Fri Jun 11, 2010 10:40 am
Location: Italy

Re: Round 21

Post by Carotino »

S.Taylor wrote:
I just now noticed this whole tournament.

Is there possibly a reason why Houdini 3 is taking it easy and not showing fully convincing results in this tournament?

I had thought that it often DOES, against these opponents.

Is there something wrong with its settings? Is it underpowered due to something technical? Or is it weaker at shorter time controls, like the ones these games are played at?

Graham Banks wrote:
Houdini 3 64-bit 4CPU is a legitimate copy and is using the correct default settings.

These things can happen in one-off tournaments, as people should be well aware by now.

That's why rating lists with hundreds or thousands of games give a more accurate indication of the relative strengths of engines.

All games of this tournament are available for scrutiny.

S.Taylor wrote:
OK, so it's a genuine one. That wasn't something I was doubting.
But it seems to be in a completely different mood this time. In almost every game, no matter who it is against, it acts as if a draw were doing us a favour, and sometimes it doesn't even manage that. And when it does win, once in a while, it looks like an accident.

Leto wrote:
While the chance of Houdini 3 performing this poorly at the start of the tournament is very low, it is still a statistical possibility. In all likelihood Houdini 3 will pull ahead shortly and win this tournament with a comfortable lead thanks to the long format.

S.Taylor wrote:
That would be reassuring to those who want to feel secure that they know which is the strongest program, and use it for that reason.

But the way it looks, it might even end up with 2 more losses, one win (perhaps), and the rest draws.

(with Rybka and Critter getting straight wins, or maybe 2 draws)

But on the other hand, I may see it differently if I actually watched the games.

OK, so it is statistically possible. But it shows there are still enough areas that need to be strengthened in its playing.
I believe this tournament is showing that the differences in strength between the various top engines are smaller than some would have us believe.
As Graham said, it takes thousands of games for the rankings to be statistically reliable, but this tournament (like other tournaments) is showing that the gap between the various engines is not that great.
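
To put rough numbers on the "thousands of games" point, here is a minimal sketch of how the error margin on a rating shrinks with the number of games. It assumes two equally strong engines, independent games, a 50% draw rate and the usual logistic Elo formula; the function names and the draw rate are illustrative assumptions, not anything taken from the tournament or from a rating list.

```python
import math

def elo_from_score(score):
    """Elo difference implied by an expected score (logistic Elo model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(games, draw_rate=0.5, z=1.96):
    """Approximate 95% half-width, in Elo, around an even (50%) score.

    Assumes equal engines, independent games and a fixed draw rate.
    """
    win_rate = (1.0 - draw_rate) / 2.0  # wins and losses split evenly
    # Per-game variance of the result (1, 0.5 or 0) around its mean of 0.5
    variance = win_rate * 1.0 + draw_rate * 0.25 - 0.25
    std_error = math.sqrt(variance / games)
    # Convert the upper end of the score interval into an Elo difference
    return elo_from_score(0.5 + z * std_error)

for n in (30, 100, 1000, 10000):
    print(f"{n:>6} games: about +/- {elo_margin(n):.0f} Elo")
```

Under these assumptions a 30-game result is only good to something like 90 Elo either way, while a thousand games narrows that to roughly 15, which is why a single tournament of this length cannot separate engines that are close in strength.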

Another interesting point: the old Rybka is proving that it still has plenty to say.
This shows that this "old clone" was not so bad!
:wink:
Roberto
Lavir
Posts: 263
Joined: Sun Oct 28, 2012 11:45 am

Re: Round 21

Post by Lavir »

S.Taylor wrote: OK, so it is statistically possible. But it shows there are still enough areas that need to be strengthened in its playing.
While multi-CPU games reach more depth and are usually of higher quality as far as the chess itself is concerned, SMP tournaments are surely not the best way to test pure engine strength, unless they have many more games or longer time controls (where enough depth counteracts the drawbacks). Even then, I personally still prefer permanent brain ON and 1/2 CPU to 4 CPU, for example.

The problem is the SMP randomness, which can make an engine choose a subpar move when there are many choices to be had. With 1 CPU this happens much less. Sometimes the engine reaches enough depth to discover that the choice was subpar, but even a single such move can change the course of a game or steer an engine toward a position that offers nothing.

In short: the more threads, the bigger the "luck" factor. Naturally, with enough games things even out, but with SMP there is a larger variability in the results.
Leto
Posts: 2071
Joined: Thu May 04, 2006 3:40 am
Location: Dune

Re: Round 21

Post by Leto »

I would think the SMP versions reduce the percentage of subpar moves being played seeing as how they are playing better chess.
Graham Banks
Posts: 44663
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Round 23

Post by Graham Banks »

THE CHAMPIONS 2012 4CPU

Xeon X5430x2 Octal
ChessGUI
1024mb hash
3-4-5 piece tablebases
Ponder off
WorldClass2012-2.cgb book (limited to 8 move depth)
40 moves in 29 minutes repeating (adapted for the CCRL)
All engines 64-bit 4CPU where available
2 cycles 30 rounds


Round 23

Gull II b2 64-bit 4CPU v IvanHoe 9.46h 64-bit 4CPU (draw)
Vitruvius 1.11C 64-bit 4CPU v Komodo 5 64-bit (0-1)
Sting SF 2 64-bit 4CPU v Strelka 5.5 64-bit (1-0)
Equinox 1.60 64-bit 4CPU v Bouquet 1.5 64-bit 4CPU (draw)
Stockfish 2.3.1 64-bit 4CPU v DeepSaros 3.1a 64-bit 4CPU (1-0)
Rybka 4.1 64-bit 4CPU v Chiron 1.5 64-bit 4CPU (1-0)
Critter 1.6a 64-bit 4CPU v Naum 4.2 64-bit 4CPU (1-0)
Houdini 3 64-bit 4CPU v Hiarcs 14 4CPU (1-0)


Standings after Round 23

16.5 - Critter 1.6a 64-bit 4CPU
15.5 - Rybka 4.1 64-bit 4CPU
14.0 - Vitruvius 1.11C 64-bit 4CPU
14.0 - Sting SF 2 64-bit 4CPU
14.0 - Houdini 3 64-bit 4CPU
13.5 - Strelka 5.5 64-bit
11.5 - Bouquet 1.5 64-bit 4CPU
11.0 - Hiarcs 14 4CPU
11.0 - Stockfish 2.3.1 64-bit 4CPU
11.0 - Equinox 1.60 64-bit 4CPU
10.0 - IvanHoe 9.46h 64-bit 4CPU
10.0 - Komodo 5 64-bit
9.5 - DeepSaros 3.1a 64-bit 4CPU
8.5 - Chiron 1.5 64-bit 4CPU
8.0 - Naum 4.2 64-bit 4CPU
6.0 - Gull II b2 64-bit 4CPU


Round 23 PGN - http://kirill-kryukov.com/chess/discuss ... p?id=28049

If you install TLCV (Tom's Live Chess Viewer) on your computer, you can watch the games live move by move. You'll also be able to chat to others following the tournament in the chatroom there.
http://home.pacific.net.au/~tommyinoz/client.zip
Host - GrahamCCRL.dyndns.org Port - 16083

There is also a live broadcast in Playchess.
gbanksnz at gmail.com
Lavir
Posts: 263
Joined: Sun Oct 28, 2012 11:45 am

Re: Round 21

Post by Lavir »

Leto wrote: I would think the SMP versions reduce the percentage of subpar moves being played seeing as how they are playing better chess.
Again, it depends. SMP introduces many other variables that are not tied to pure engine strength (they also depend on the engine's style of play).

One simple example, from the game Houdini 3 vs Deep Sjeng Cluster:
[d] 4r1k1/2p5/1pN4p/p1p1b1pq/P3Pp2/8/Q4P1B/6RK b - - 0 34

Kf8! wins; however, Deep Sjeng played Kh7 here, leading to a draw.
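
For anyone reading without a board viewer, the FEN above can be expanded into a plain-text diagram. This is only a small sketch in pure Python: the helper names `fen_to_rows` and `find_piece` are made up for this example, and it reads just the piece-placement and side-to-move fields without validating legality.

```python
FEN = "4r1k1/2p5/1pN4p/p1p1b1pq/P3Pp2/8/Q4P1B/6RK b - - 0 34"

def fen_to_rows(fen):
    """Expand the piece-placement field of a FEN string into 8 rows of 8 chars."""
    placement = fen.split()[0]
    rows = []
    for rank in placement.split("/"):
        row = ""
        for ch in rank:
            # A digit means that many empty squares; a letter is a piece
            row += "." * int(ch) if ch.isdigit() else ch
        if len(row) != 8:
            raise ValueError(f"bad rank in FEN: {rank!r}")
        rows.append(row)
    if len(rows) != 8:
        raise ValueError("FEN must describe 8 ranks")
    return rows

def find_piece(rows, piece):
    """Return the algebraic square (e.g. 'g8') of the first matching piece."""
    for r, row in enumerate(rows):
        c = row.find(piece)
        if c != -1:
            return "abcdefgh"[c] + str(8 - r)
    return None

rows = fen_to_rows(FEN)
# FEN lists ranks from 8 down to 1; uppercase is White, lowercase is Black
for rank_number, row in zip(range(8, 0, -1), rows):
    print(rank_number, " ".join(row))
print("  a b c d e f g h")
side = "Black" if FEN.split()[1] == "b" else "White"
print(f"{side} to move; black king on {find_piece(rows, 'k')}")
```

The black king sits on g8 with Black to move, so Kf8 and Kh7 are both one-square king moves from the same position.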

Now, if you analyze this position with H3 on 1 CPU, you can run the test a thousand times and it will find Kf8 in a matter of 1-2 seconds. Run the test with SMP and, if you are unlucky, the move will not be found even after 5 minutes and at much greater depth.

The same happened to Deep Sjeng in the game. After the game ended in a draw, the moves were analyzed by Suj (who runs the cluster), and Kf8 was found immediately by Deep Sjeng on this second run.

Also look at this post:
http://talkchess.com/forum/viewtopic.php?t=46062

That's how SMP randomization works, and sometimes even a single subpar choice can turn a won game into a draw, or a draw into a loss.
Leto
Posts: 2071
Joined: Thu May 04, 2006 3:40 am
Location: Dune

Re: Round 21

Post by Leto »

Yes, I've seen it myself, but I still think that SMP engines reduce the percentage of subpar moves being played.
Lavir
Posts: 263
Joined: Sun Oct 28, 2012 11:45 am

Re: Round 21

Post by Lavir »

Leto wrote: Yes, I've seen it myself, but I still think that SMP engines reduce the percentage of subpar moves being played.
That may well be true on average, but SMP introduces increased randomness in the results (for the reasons I've explained), and this has to be taken into consideration.
S.Taylor
Posts: 8514
Joined: Thu Mar 09, 2006 3:25 am
Location: Jerusalem Israel

Re: The Champions 2012 4CPU

Post by S.Taylor »

Graham Banks wrote: THE CHAMPIONS 2012 4CPU

Xeon X5430x2 Octal
ChessGUI
1024mb hash
3-4-5 piece tablebases
Ponder off
WorldClass2012-2.cgb book (limited to 8 move depth)
40 moves in 29 minutes repeating (adapted for the CCRL)
All engines 64-bit 4CPU where available
2 cycles 30 rounds


Participants

Houdini 3 64-bit 4CPU
Critter 1.6a 64-bit 4CPU
Rybka 4.1 64-bit 4CPU
Stockfish 2.3.1 64-bit 4CPU
Equinox 1.60 64-bit 4CPU (private)
Sting SF 2 64-bit 4CPU
Vitruvius 1.11C 64-bit 4CPU
IvanHoe 9.46h 64-bit 4CPU
Komodo 5 64-bit
Strelka 5.5 64-bit
Bouquet 1.5 64-bit 4CPU
DeepSaros 3.1a 64-bit 4CPU
Chiron 1.5 64-bit 4CPU
Naum 4.2 64-bit 4CPU
Hiarcs 14 4CPU
Gull II b2 64-bit 4CPU
A tournament similar to this one, but with a longer time control (or at least 3-4 times more computing power instead), would be interesting, to see how the engines compare when they have more time.
ernest
Posts: 2053
Joined: Wed Mar 08, 2006 8:30 pm

Re: The Champions 2012 4CPU

Post by ernest »

S.Taylor wrote: A tournament similar to this one, but with a longer time control (or at least 3-4 times more computing power instead), would be interesting
Just do it! :twisted:
Graham Banks
Posts: 44663
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Round 24

Post by Graham Banks »

Round 24

Gull II b2 64-bit 4CPU v Komodo 5 64-bit (draw)
IvanHoe 9.46h 64-bit 4CPU v Strelka 5.5 64-bit (draw)
Vitruvius 1.11C 64-bit 4CPU v Bouquet 1.5 64-bit 4CPU (draw)
Sting SF 2 64-bit 4CPU v DeepSaros 3.1a 64-bit 4CPU (draw)
Equinox 1.60 64-bit 4CPU v Chiron 1.5 64-bit 4CPU (1-0)
Stockfish 2.3.1 64-bit 4CPU v Naum 4.2 64-bit 4CPU (draw)
Rybka 4.1 64-bit 4CPU v Hiarcs 14 4CPU (draw)
Critter 1.6a 64-bit 4CPU v Houdini 3 64-bit 4CPU (0-1)


Standings after Round 24

16.5 - Critter 1.6a 64-bit 4CPU
16.0 - Rybka 4.1 64-bit 4CPU
15.0 - Houdini 3 64-bit 4CPU
14.5 - Vitruvius 1.11C 64-bit 4CPU
14.5 - Sting SF 2 64-bit 4CPU
14.0 - Strelka 5.5 64-bit
12.0 - Bouquet 1.5 64-bit 4CPU
12.0 - Equinox 1.60 64-bit 4CPU
11.5 - Hiarcs 14 4CPU
11.5 - Stockfish 2.3.1 64-bit 4CPU
10.5 - IvanHoe 9.46h 64-bit 4CPU
10.5 - Komodo 5 64-bit
10.0 - DeepSaros 3.1a 64-bit 4CPU
8.5 - Naum 4.2 64-bit 4CPU
8.5 - Chiron 1.5 64-bit 4CPU
6.5 - Gull II b2 64-bit 4CPU
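
As a quick sanity check, the table above can be re-derived from the Round 23 standings plus the Round 24 results. The sketch below does that arithmetic; the shortened engine names and the `apply_round` helper are conveniences introduced for this example only, and the scores are copied from the posts in this thread.

```python
# Standings after Round 23, as posted (engine names shortened for readability)
round23 = {
    "Critter": 16.5, "Rybka": 15.5, "Vitruvius": 14.0, "Sting": 14.0,
    "Houdini": 14.0, "Strelka": 13.5, "Bouquet": 11.5, "Hiarcs": 11.0,
    "Stockfish": 11.0, "Equinox": 11.0, "IvanHoe": 10.0, "Komodo": 10.0,
    "DeepSaros": 9.5, "Chiron": 8.5, "Naum": 8.0, "Gull": 6.0,
}

# Round 24 results as (white, black, points scored by white)
round24 = [
    ("Gull", "Komodo", 0.5), ("IvanHoe", "Strelka", 0.5),
    ("Vitruvius", "Bouquet", 0.5), ("Sting", "DeepSaros", 0.5),
    ("Equinox", "Chiron", 1.0), ("Stockfish", "Naum", 0.5),
    ("Rybka", "Hiarcs", 0.5), ("Critter", "Houdini", 0.0),
]

def apply_round(standings, results):
    """Return new standings after adding one round of results."""
    updated = dict(standings)
    for white, black, white_points in results:
        updated[white] += white_points
        updated[black] += 1.0 - white_points  # each game is worth 1 point in total
    return updated

round24_standings = apply_round(round23, round24)

# 8 games per round, so the total points must equal 8 x rounds played
assert sum(round23.values()) == 8 * 23
assert sum(round24_standings.values()) == 8 * 24

for name, points in sorted(round24_standings.items(), key=lambda kv: -kv[1]):
    print(f"{points:4.1f} - {name}")
```

The point totals come out the same as in the posted table, and each round's scores add up to 8 points per round played.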


Round 24 PGN - http://kirill-kryukov.com/chess/discuss ... p?id=28053
