After 300 games including Deep Junior matches (note different number of threads):
1 Houdini 2.0c Pro x64___3300
2 Komodo64 SSE Version 4___3272
3 Critter 1.4a x64___3264
4 Rybka 4.1 SSE42 x64___3256
5 Stockfish 2.2.2 SSE42 ___3242
A minor surprise is Critter's 3rd place - I predicted it would not "scale" so well.
Rating list from Timo's tournaments
-
- Posts: 3283
- Joined: Wed Mar 08, 2006 8:15 pm
Re: Rating list from Timo's tournaments
And I think 300 long games mean more than 30 000 superfast ones.
Jouni
-
- Posts: 98
- Joined: Sun Jan 03, 2010 12:28 pm
- Location: Hamburg
Re: Rating list from Timo's tournaments
Hi Jouni,
thanks for the feedback and for the rating list!
Final results of the Titan match and game downloads on my webpage:
http://www.team-oh.de/Computerschach/Clash.htm
Best regards
Timo
-
- Posts: 1968
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Elo performance lists from 'Clash of the Titans' tourney.
Hello:
I downloaded the full PGN and split it into three new PGN files (1 thread, 2 threads and 6 threads). I did this task by hand, so it may contain errors, although I do not expect any. The lists were made with Ordo 0.4 by Ballicora, adjusting each overall average rating to 0.
Jouni wrote:
After 300 games including Deep Junior matches (note different number of threads):
1 Houdini 2.0c Pro x64___3300
2 Komodo64 SSE Version 4___3272
3 Critter 1.4a x64___3264
4 Rybka 4.1 SSE42 x64___3256
5 Stockfish 2.2.2 SSE42 ___3242
A minor surprise is Critter's 3rd place - I predicted it would not "scale" so well.

I have refined the list posted by Jouni a little, splitting it according to the number of threads used by each engine. These Elo performance lists cover only the Clash of the Titans tourney (not a large enough number of games to narrow the Elo uncertainties much, but given the long TC of the tourney, it is more than enough for me):

1 thread:
ENGINE                        RATING  POINTS  PLAYED     (%)
Houdini 2.0c Pro x64            27.7    32.0      60   53.3%
Critter 1.4a SSE42 x64          10.4    30.5      60   50.8%
Komodo64 SSE Version 4           4.6   122.0     240   50.8%
Stockfish 2.2.2 JA SSE42 x64   -12.7    28.5      60   47.5%
Rybka 4.1 SSE42 x64            -30.1    27.0      60   45.0%

2 threads:
ENGINE                        RATING  POINTS  PLAYED     (%)
Stockfish 2.2.2 JA SSE42 x64    13.5    31.5      60   52.5%
Rybka 4.1 SSE42 x64             -3.8    59.0     120   49.2%
Critter 1.4a SSE42 x64          -9.6    29.5      60   49.2%

6 threads:
ENGINE                        RATING  POINTS  PLAYED     (%)
Houdini 2.0c Pro x64            32.0   101.0     180   56.1%
Rybka 4.1 SSE42 x64              8.9    28.0      60   46.7%
Critter 1.4a SSE42 x64          -1.0    60.5     120   50.4%
Stockfish 2.2.2 JA SSE42 x64   -39.9    50.5     120   42.1%

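The score percentages in the lists above can be turned into a rough Elo difference against the average opponent with the standard logistic Elo formula. This is only a minimal sketch: it is not how Ordo computes its ratings (Ordo takes the opponents' ratings into account), so the numbers will not match the RATING column exactly. The function name `elo_diff_from_score` is made up for this illustration; the points and game counts are copied from the 1-thread list.

```python
import math


def elo_diff_from_score(score_fraction: float) -> float:
    """Elo difference implied by a score fraction under the logistic Elo model.

    Inverts expected_score = 1 / (1 + 10 ** (-diff / 400)).
    """
    if not 0.0 < score_fraction < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / score_fraction - 1.0)


# (points, games played) copied from the 1-thread list.
ONE_THREAD = {
    "Houdini 2.0c Pro x64": (32.0, 60),
    "Critter 1.4a SSE42 x64": (30.5, 60),
    "Komodo64 SSE Version 4": (122.0, 240),
    "Stockfish 2.2.2 JA SSE42 x64": (28.5, 60),
    "Rybka 4.1 SSE42 x64": (27.0, 60),
}

if __name__ == "__main__":
    for name, (points, games) in ONE_THREAD.items():
        diff = elo_diff_from_score(points / games)
        print(f"{name}: {diff:+.1f} Elo vs. average opponent")
```

A 50% score maps to a difference of 0, and scores above 50% map to a positive difference. The gap between this naive figure and the Ordo rating reflects the strength of the opposition each engine actually faced.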
Thanks to Ballicora for Ordo, and also thanks to Timo for running this great match! Congratulations.
Regards from Spain.
Ajedrecista.
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Rating list from Timo's tournaments
Jouni wrote:
After 300 games including Deep Junior matches (note different number of threads):
1 Houdini 2.0c Pro x64___3300
2 Komodo64 SSE Version 4___3272
3 Critter 1.4a x64___3264
4 Rybka 4.1 SSE42 x64___3256
5 Stockfish 2.2.2 SSE42 ___3242

Interesting stuff, even if it's slightly awkward to create a rating list from a combination of 6-thread matches and 1-thread matches.
Most striking is that there's no fundamental difference from the IPON list, even though these games were played at a time control about 100 times slower, other than some compression of the ratings.

Jouni wrote:
A minor surprise is Critter's 3rd place - I predicted it would not "scale" so well.

Stop poking fun at Don and Larry.
Thanks to Timo for organizing these matches, there were a lot of interesting games!
Robert
-
- Posts: 98
- Joined: Sun Jan 03, 2010 12:28 pm
- Location: Hamburg
Re: Rating list from Timo's tournaments
Hi Robert,
Houdini wrote:
Most striking is that there's no fundamental difference from the IPON list, even though these games were played at a time control about 100 times slower, other than some compression of the ratings.

That's true, here are both scores directly compared:
IPON Overall Scores:
Houdini: 327.0/600 (54.5%)
Critter: 304.5/600 (50.75%)
Komodo: 303.5/600 (50.58%)
Stockfish: 289.0/600 (48.17%)
Rybka: 276.0/600 (46.0%)
Titan Overall Scores:
Houdini: 133.0/240 (55.42%)
Komodo: 122.0/240 (50.83%)
Critter: 120.5/240 (50.21%)
Rybka: 114.0/240 (47.50%)
Stockfish: 110.5/240 (46.04%)
Very small differences (Rybka and Stockfish deviate most), all within the mathematical expectation. So this seems to be further evidence that it isn't necessary to play games at long TCs to produce a reliable rating list that is valid for all types of TCs. IPON conditions seem to be fully sufficient for that purpose.
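The "within the mathematical expectation" statement can be checked roughly as follows. This is a minimal sketch under stated assumptions: games are treated as independent, and since a single game result (0, 0.5 or 1) has a variance of at most 0.25, the error bound used here is conservative (draws make the real error smaller). It also ignores that the scores of engines within one tournament are correlated. The names `conservative_se` and `compare` are hypothetical helpers; the scores are copied from the two lists above.

```python
import math

# (points, games played) copied from the two score lists.
IPON = {
    "Houdini": (327.0, 600),
    "Critter": (304.5, 600),
    "Komodo": (303.5, 600),
    "Stockfish": (289.0, 600),
    "Rybka": (276.0, 600),
}
TITAN = {
    "Houdini": (133.0, 240),
    "Komodo": (122.0, 240),
    "Critter": (120.5, 240),
    "Rybka": (114.0, 240),
    "Stockfish": (110.5, 240),
}


def conservative_se(games_a: int, games_b: int) -> float:
    """Upper bound on the standard error of the difference of two score fractions."""
    return math.sqrt(0.25 / games_a + 0.25 / games_b)


def compare(list_a: dict, list_b: dict, z: float = 2.0) -> dict:
    """Return {engine: (score difference, z * SE bound, within bound?)}."""
    result = {}
    for name, (points_a, games_a) in list_a.items():
        points_b, games_b = list_b[name]
        diff = points_b / games_b - points_a / games_a
        bound = z * conservative_se(games_a, games_b)
        result[name] = (diff, bound, abs(diff) <= bound)
    return result


if __name__ == "__main__":
    for name, (diff, bound, ok) in compare(IPON, TITAN).items():
        print(f"{name:10s} diff={diff:+.4f} bound=+/-{bound:.4f} within={ok}")
```

With 600 and 240 games the two-sigma bound is several percentage points wide, so differences of one or two percentage points cannot be distinguished from noise by this test.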
Houdini wrote:
Thanks to Timo for organizing these matches, there were a lot of interesting games!

You are welcome. It was also a lot of fun for me, although it was a relatively high expenditure. These matches are time-consuming (setting up and maintaining the computers and matches), and the next electricity bill will be high (~5 EUR a day when using all computers). Maybe I should also add a donate button to my homepage...
Best regards
Timo
-
- Posts: 98
- Joined: Sun Jan 03, 2010 12:28 pm
- Location: Hamburg
Re: Elo performance lists from 'Clash of the Titans' tourney
Hello Jesús,
thanks for the feedback and for your interesting statistics. Of course more games would be necessary to draw any stable conclusions, but a tendency is already visible.
Best regards from Hamburg, Germany
Timo
-
- Posts: 1187
- Joined: Wed Jan 06, 2010 3:11 pm
Re: Rating list from Timo's tournaments
Fully agree with you, Timo. No need to test at very long time controls.
Besides the IPON list you mentioned, there is also little difference from the CEGT 40/20 list and the CCRL 40/40 list, although Komodo scores a little bit higher at CCRL.
grts Bram