What the CEGT Blitz rating list would look like with Houdini

Leto
Posts: 2139
Joined: Thu May 04, 2006 3:40 am
Location: Dune

What the CEGT Blitz rating list would look like with Houdini

Post by Leto »

I'm a CEGT Blitz tester and I've been wondering what the list would look like with Houdini in it, so I carried out the tests in the same manner I do for all my CEGT Blitz testing and here are the results thus far:

The fifth batch of matches is complete, so after 500 games this is what the list would look like (top 20 only, for easier viewing):

 #  Program                      Elo   +   -  Games   Score  Av.Op.   Draws
 1  Rybka 4.0 x64 4CPU          3264  14  14   1800  77.0 %    3054  33.1 %
 2  Houdini 1.02 x64 4CPU       3244  21  21    500  62.9 %    3153  51.4 %
 3  Rybka 3.0 Dynamic x64 4CPU  3235  19  19   1200  79.8 %    2997  25.9 %
 4  Rybka 3.0 x64 4CPU          3233   9   9   4350  77.6 %    3017  30.5 %
 5  Rybka 3.0 Human x64 4CPU    3231  21  20   1000  78.8 %    3003  26.8 %
 6  Rybka 4.0 x64 2CPU          3231  24  23    600  73.4 %    3054  35.2 %
 7  Stockfish 1.7.1 x64 4CPU    3200  12  12   2350  70.6 %    3049  35.0 %
 8  Rybka 3.0 x64 2CPU          3197  13  13   2512  77.0 %    2988  27.6 %
 9  Rybka 4.0 x64 1CPU          3181  11  11   2670  68.4 %    3046  38.2 %
10  Rybka 3.0 Dynamic x64 2CPU  3178  19  19   1200  79.6 %    2942  26.2 %
11  Rybka 3.0 Human x64 2CPU    3175  18  18   1350  80.0 %    2934  26.7 %
12  Naum 4.2 x64 4CPU           3156  11  11   2350  65.7 %    3043  39.1 %
13  Stockfish 1.6.3 x64 4CPU    3154  12  12   2100  68.6 %    3018  34.9 %
14  Stockfish 1.6 x64 4CPU      3153  16  16   1100  62.0 %    3068  43.0 %
15  Stockfish 1.7.1 x64 2CPU    3140  19  19    800  58.8 %    3078  37.9 %
16  Naum 4.1 x64 4CPU           3138  14  14   1350  62.1 %    3052  42.3 %
17  Rybka 4.0 w32 1CPU          3127  18  18   1000  71.0 %    2972  34.7 %
18  Naum 4.0 x64 4CPU           3121   9   9   3400  59.8 %    3052  42.4 %
19  Rybka 3.0 x64 1CPU          3120   9   8   4640  71.8 %    2958  33.6 %
20  Stockfish 1.7.1 w32 2CPU    3120  16  16   1330  70.5 %    2968  33.0 %

Match score summary:
1) Against Rybka 4 x64 4CPU Houdini scored 42%, and against Rybka 3 x64 4CPU Houdini scored 56%.

2) Houdini 1.02 x64 4CPU scored 58% against Stockfish 1.7.1 x64 4CPU and 64% against Naum 4.2 x64 4CPU. Houdini gained 4 elo, Stockfish lost 1 elo, and Naum lost 1 elo.

3) Houdini scored 63% against Stockfish 1.6 x64 4CPU and 64% against Naum 4.1 x64 4CPU, losing 3 elo.

4) Houdini 1.02 x64 4CPU scored 64% against Naum 4 x64 4CPU and 65% against Hiarcs 13.1 4CPU. Houdini lost 7 elo, Hiarcs gained 5 elo.

5) Houdini 1.02 x64 4CPU scored 75% against Deep Shredder w32 4CPU and 78% against Deep Fritz 11 4CPU. Houdini gained 6 elo, Shredder lost 1 elo, Deep Fritz 11 lost 1 elo.
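
For readers who want to turn the match percentages above into approximate rating gaps, here is a minimal sketch. It assumes the standard logistic Elo expectation formula; the function name is made up for illustration, and rating-list tools use their own fitting methods, so the results will not match the list figures exactly.

```python
import math

def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a match score in (0, 1), logistic model."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score - 1.0)

# Match scores quoted in the summary above (Houdini's percentage).
matches = [
    ("Rybka 4 x64 4CPU", 42),
    ("Rybka 3 x64 4CPU", 56),
    ("Stockfish 1.7.1 x64 4CPU", 58),
    ("Naum 4.2 x64 4CPU", 64),
]

for opponent, pct in matches:
    diff = elo_diff_from_score(pct / 100.0)
    print(f"Houdini vs {opponent}: {pct}% -> {diff:+.0f} Elo")
```

A score below 50% gives a negative difference and a score above 50% a positive one. Each figure comes from a single 50-game match, so treat it as a rough indication only.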
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: What the CEGT Blitz rating list would look like with Hou

Post by Albert Silver »

Leto wrote:I'm a CEGT Blitz tester and I've been wondering what the list would look like with Houdini in it, so I carried out the tests in the same manner I do for all my CEGT Blitz testing and here are the results thus far:

Fifth batch of matches completed, so after 500 games this is what it'd look like (just top 20 for easier viewing) :

 #  Program                      Elo   +   -  Games   Score  Av.Op.   Draws
 1  Rybka 4.0 x64 4CPU          3264  14  14   1800  77.0 %    3054  33.1 %
 2  Houdini 1.02 x64 4CPU       3244  21  21    500  62.9 %    3153  51.4 %
 3  Rybka 3.0 Dynamic x64 4CPU  3235  19  19   1200  79.8 %    2997  25.9 %
 4  Rybka 3.0 x64 4CPU          3233   9   9   4350  77.6 %    3017  30.5 %
 5  Rybka 3.0 Human x64 4CPU    3231  21  20   1000  78.8 %    3003  26.8 %
 6  Rybka 4.0 x64 2CPU          3231  24  23    600  73.4 %    3054  35.2 %
 7  Stockfish 1.7.1 x64 4CPU    3200  12  12   2350  70.6 %    3049  35.0 %
Interesting. With Houdini being the strongest of the IPPOs, and its edge over Rybka 3 in ultra-fast time controls being quite considerable, its mere 10 Elo advantage here is revealing.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."
Taner Altinsoy
Posts: 147
Joined: Fri Dec 18, 2009 3:56 pm
Location: Istanbul

Re: What the CEGT Blitz rating list would look like with Hou

Post by Taner Altinsoy »

500 games is not enough to come up with any conclusion. Also you can argue Rybka gained less than 30 elo after years of R&D.
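
To put a number on how much 500 games can pin down, here is a rough sketch of the statistical margin. It assumes independent games, a normal approximation at 95% confidence, and the logistic Elo formula; the function names are hypothetical, and this is not the method the rating-list software uses.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by a score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(games: int, score: float, draw_ratio: float, z: float = 1.96):
    """Return (low, mid, high) Elo differences for an approximate 95% interval."""
    wins = score - draw_ratio / 2.0          # fraction of games won
    losses = 1.0 - score - draw_ratio / 2.0  # fraction of games lost
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins * (1.0 - score) ** 2
                + draw_ratio * (0.5 - score) ** 2
                + losses * (0.0 - score) ** 2)
    stderr = math.sqrt(variance / games)
    low, high = score - z * stderr, score + z * stderr
    return elo_from_score(low), elo_from_score(score), elo_from_score(high)

# Houdini's row in the table: 500 games, 62.9 % score, 51.4 % draws.
low, mid, high = elo_interval(games=500, score=0.629, draw_ratio=0.514)
print(f"Elo vs. average opponent: {mid:.0f} (+{high - mid:.0f} / -{mid - low:.0f})")
```

Under these assumptions the margin is on the order of 20 Elo either side, the same ballpark as the + and - columns in Houdini's row.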

Another note: Houdini 1.02 might not be the strongest version. 1.01 looks to be the best Houdini, about 10 Elo better than 1.02.

Taner
Albert Silver
Posts: 3026
Joined: Wed Mar 08, 2006 9:57 pm
Location: Rio de Janeiro, Brazil

Re: What the CEGT Blitz rating list would look like with Hou

Post by Albert Silver »

Taner Altinsoy wrote:500 games is not enough to come up with any conclusion. Also you can argue Rybka gained less than 30 elo after years of R&D.
Why would one argue that? How is it relevant to how strong Houdini is?
Another note: Houdini 1.02 might not be the strongest version. 1.01 looks to be the best Houdini, about 10 Elo better than 1.02.
Based on what data?
Grizzlytae
Posts: 43
Joined: Sun Apr 02, 2006 2:31 am
Location: Ohio USA

Re: What the CEGT Blitz rating list would look like with Hou

Post by Grizzlytae »

You can also say that each time Rybka gets stronger, so will the clones.
This message has been approved by GrizzlyTae Aka. The Pitbull