CCRL update (8th February 2008)

Graham Banks
Posts: 41432
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

The February 8th update of the CCRL Rating Lists and Statistics is now available for viewing at:
http://www.computerchess.org.uk/ccrl/4040/

The list gets updated periodically during the week and these updates can be viewed here:
http://www.computerchess.org.uk/ccrl/4040.live/
Please be aware that no game downloads are available from this live link.

The links to the various rating lists can be found just beneath the default Best Versions list.
For example there is a 32-bit Single CPU list.

Our standard testing is at 40 moves in 40 minutes repeating while our current blitz testing is at both 40 moves in 4 minutes repeating and 40 moves in 12 minutes repeating, all adjusted to the AMD64 X2 4600+ (2.4GHz).

Currently active testers in our team are:
Graham Banks, Ray Banks, Shaun Brewer, Kirill Kryukov, Dom Leste, Tom Logan, Charles Smith, George Speight, Chris Taylor, Chuck Wilson, Gabor Szots and Martin Thoresen.


40/40 Notes

There are currently 103,108 games in our 40/40 database.

Many engines on our list have few games and in many cases their ratings are likely to fluctuate (markedly for some) until a lot more games are played. Therefore no conclusions should be drawn about their strength yet.
To illustrate this point, when an engine has 200 games played, the error margin is still approximately ±40 ELO, after 500 games ±25 ELO, after 1000 games ±17 ELO and even after 2000 games there is a ±13 ELO error margin!
This of course highlights the importance of looking at other rating lists that are also available in order to draw comparisons and get a more accurate overall picture.
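
For anyone curious why the margins shrink so slowly, they follow roughly a 1/sqrt(games) pattern. The Python below is only a rough sketch of that relationship and is not the calculation our rating software performs. The function name, the 35% draw ratio, the evenly matched opposition and the normal approximation are all assumptions made for illustration.

```python
import math

def approx_elo_margin(games, draw_ratio=0.35, z=1.96):
    """Rough 95% error margin in Elo for an engine scoring about 50%.

    Assumptions (not taken from CCRL): evenly matched opposition,
    a fixed draw ratio, and a normal approximation to the score.
    """
    # Per-game score variance when wins and losses are equally likely.
    per_game_variance = (1.0 - draw_ratio) / 4.0
    # Standard error of the average score over all games.
    score_std_error = math.sqrt(per_game_variance / games)
    # Slope of the logistic Elo curve at a 50% score, in Elo per unit score.
    elo_per_score = 1600.0 / math.log(10)
    return z * score_std_error * elo_per_score

for n in (200, 500, 1000, 2000):
    print(n, "games: about +-%.0f ELO" % approx_elo_margin(n))
```

Under those assumptions the estimates land within a couple of ELO of the figures quoted above. Note that ten times as many games only narrows the margin by a factor of about three.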


4CPU 64-bit Engines

Rybka 2.3.2a is 50+ ELO stronger than Zappa Mexico.
It will be interesting to see whether the just released Zappa Mexico II can close the gap further.

Deep Shredder 11 lies 40+ points further back in third spot.

Naum 2.2 comes in fourth, not too far behind Deep Shredder 11, but ahead of Deep Fritz 10.1, Toga II 1.4 beta5c and Hiarcs 11.1.
We are eagerly looking forward to testing Naum 3, which by all accounts should be a good improvement!

The remaining well tested engines in order of rating are Loop M1-T, Glaurung 2.0.1, Deep Junior 10, Bright 0.2c, Deep Sjeng 2.7 and Scorpio 2.0.
We are still in the early stages of testing Bright 0.3a.


2CPU Engines

With the emphasis of our multi-CPU testing on 4CPU as opposed to 2CPU, there are gaps in this category and some of the engines also require further games.
However, the order of strength is almost identical to the 4CPU list with regard to the engines that have been well tested.


Single CPU Engines

Rybka 2.3.2a has a massive 120+ ELO lead over Fritz 11, Shredder 11 and Zappa Mexico.
We have just started our testing of Zappa Mexico II.

30 ELO further back, Naum 2.2, Toga II 1.3.1 and Hiarcs 11.1 have a slight edge over Loop 13.6 and Fruit 2.3.1.
We are still in the early stages of testing both Toga II 1.4 beta5c and Toga II 3.1.2SE.

There is a gap of 20 ELO back to the next group of engines - Deep Sjeng 2.7, Spike 1.2 Turin, Glaurung 2.0.1 and Junior 10.

40 ELO lower still are Ktulu 8.0, Chess Tiger 2007.1, SmarThink 1.00 and Bright 0.2c.

Chessmaster 11, Movei 00.8.438 (10 10 10), Alaric 707, Booot 4.14.0 and Frenzee Dec07 comprise the next group of engines ahead of SlowChess Blitz WV2.1, E.T Chess 13.01.08, Delfi 5.2, Ruffian 2.1.0, WildCat 7, Pro Deo 1.6b and Gandalf 6.

We are still in the early stages of testing Sloppy 0.2.0, Alfil 8.1.1 and Learning Lemming 0.24.
Although Learning Lemming is still private, the author has signalled his intention of a future public release, so it will be interesting to see how this version fares.


Free Single CPU Engines

Rybka 1.0 64-bit is still top in this category ahead of Toga II 1.3.1.
It will be interesting to see if the recently released and highly touted Toga II 1.4 beta5c can finally dethrone Rybka.

Fruit 2.3.1 comes in third ahead of the evenly matched pair of Spike 1.2 Turin and Glaurung 2.0.1.

Naum 2.0 and Bright 0.2c are 40+ ELO further back.

Movei 00.8.438 (10 10 10), Alaric 707, Booot 4.14.0, Scorpio 2.0 and Frenzee Dec07 come in next, ahead of SlowChess Blitz WV2.1, E.T Chess 13.01.08, Delfi 5.2, Zappa 1.1, WildCat 7 and Pro Deo 1.6b.

Worthy of mention are:
Frenzee Dec07, Booot 4.14.0 and E.T Chess 13.01.08, which are all impressive improvements over previous versions.
Sloppy 0.2.0 and Alfil 8.1.1 - although we've only just started testing these latest versions, their progress will be worth keeping an eye on.
Hamsters 0.6 - Alessandro seems to have made a 400 ELO improvement as he has progressed from Hamsters 0.0.6 through to the latest version in roughly two years!
BugChess2 1.5.2 - Francois has also made astounding progress and this latest version is 200+ ELO ahead of BugChess2 1.4.1 which is just over one year old.

We test a very extensive range of amateur engines (currently ranging down to the 2000 ELO level) through a range of tournaments, all of which can be followed in our public forum.
Our aim is of course to ensure that all engines lower on our lists get 200+ games.


Blitz Notes

An update of the blitz lists is currently underway and should be available for viewing shortly after this post.

The 40/4 update is usually done separately to our 40/40 update.
The latest ratings can be found at one of the following links:
http://computerchess.org.uk/ccrl/404/
http://computerchess.org.uk/ccrl/404.live/

An enormous amount of work goes into the blitz list and it is well worth a visit.

Of special interest to some will be the best free 1CPU engines list which is being constructed through a systematic testing approach as mentioned here:
http://www.talkchess.com/forum/viewtopic.php?t=19206


FRC Notes

There are currently 25,200 games in the FRC 40/4 database.

Ray tests only those engines that can play FRC through the Shredder Classic GUI.
If engine authors have a new and stable version of their engine that will run under this GUI, they should contact Ray if they wish to see it tested.

There is nothing new to report this week.

Although Rybka 2.3.2 FRC is in top spot, it is a private engine.
Therefore Shredder 11 is the strongest available FRC engine, an impressive 80 ELO ahead of Hiarcs 11.1 and Naum 2.2.

For FRC the best list to look at is the pure list.
http://www.computerchess.org.uk/ccrl/404FRC/


Stats/Presentation Notes

The LOS (likelihood of superiority) stats to the right hand side of each rating list tell you the likelihood in percentage terms of each engine being superior to the engine directly below it.
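
As a rough illustration of the idea only, an LOS-style figure can be approximated from two ratings and their error margins. The sketch below is not how the figures on our lists are produced. It assumes each margin is a 95% interval and that the two rating errors are normally distributed and independent. The function name and the ratings of 2900 and 2880 are hypothetical.

```python
import math

def approx_los(rating_a, margin_a, rating_b, margin_b, z=1.96):
    """Approximate probability (0-100) that engine A is stronger than B.

    Assumptions (not taken from CCRL): each margin is a 95% interval,
    rating errors are normally distributed and independent of each other.
    """
    # Convert the 95% margins back to standard errors.
    se_a = margin_a / z
    se_b = margin_b / z
    # Standard error of the rating difference under independence.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    # Normal CDF of the standardised difference, via the error function.
    standardised = (rating_a - rating_b) / se_diff
    return 50.0 * (1.0 + math.erf(standardised / math.sqrt(2.0)))

# Hypothetical engines 20 ELO apart, each with a +-17 ELO margin.
print("%.1f%%" % approx_los(2900, 17, 2880, 17))
```

Two engines with identical ratings come out at exactly 50%, and the figure moves towards 100% as the gap grows relative to the margins.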

A list of games played this week per engine can be found in the update thread in the CCRL public forum, accessible through the link given at the top of this post.

All games are available for download through the link given at the top of this post. They can be downloaded by engine or by month.
ELO ratings are now saved in all game databases for those engines that have 200 games or more.

Clicking on an engine name will give details as to opponents played plus homepage links where applicable.

Custom lists of engines can be selected for comparison.

An openings report page (link at bottom of index page) lists the number of games played by ECO codes with draw percentage and White win percentage. Clicking on a column heading will sort the list by that column.
Games can now be downloaded by ECO code.
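
For those who want to build a similar table from their own downloaded games, the tallying can be sketched as follows. This is a minimal sketch and not the code behind our openings report. The function name, the input format of (ECO code, result) pairs and the sample games are assumptions for illustration.

```python
from collections import defaultdict

def openings_report(games):
    """Summarise (eco_code, result) pairs into per-ECO statistics.

    `result` uses the PGN conventions "1-0", "0-1" and "1/2-1/2".
    Returns rows of (eco, games, draw %, White win %).
    """
    counts = defaultdict(lambda: {"games": 0, "draws": 0, "white_wins": 0})
    for eco, result in games:
        row = counts[eco]
        row["games"] += 1
        if result == "1/2-1/2":
            row["draws"] += 1
        elif result == "1-0":
            row["white_wins"] += 1
    report = []
    for eco, row in counts.items():
        n = row["games"]
        report.append((eco, n, 100.0 * row["draws"] / n,
                       100.0 * row["white_wins"] / n))
    return report

# Made-up sample data, not real CCRL games.
sample = [("B90", "1-0"), ("B90", "1/2-1/2"), ("C42", "1/2-1/2"), ("B90", "0-1")]
# Sort by number of games, descending, much like sorting on a column heading.
for row in sorted(openings_report(sample), key=lambda r: r[1], reverse=True):
    print(row)
```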
gbanksnz at gmail.com
Kaj Soderberg

Re: CCRL update (8th February 2008)

Post by Kaj Soderberg »

Hi Graham, thanks again for the great work the CCRL team is doing.

As far as Zappa Mexico II testing is concerned, are you using Singular Extensions (default with the new version) or turning them off?

Best regards,

Kaj
Graham Banks
Posts: 41432
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL update (8th February 2008)

Post by Graham Banks »

Kaj Soderberg wrote: Hi Graham, thanks again for the great work the CCRL team is doing.

As far as Zappa Mexico II testing is concerned, are you using Singular Extensions (default with the new version) or turning them off?

Best regards,

Kaj
Hi Kaj,

we're using the default settings for Zappa Mexico II.

Regards, Graham.
gbanksnz at gmail.com
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: CCRL update (8th February 2008)

Post by Norm Pollock »

Hi Graham,

Regarding the name change confusion with Zappa Mexico in 40/40.

Your team originally listed the Zappa Mexico update as Zappa Mexico II. Now that the official "Zappa Mexico II" has been released as a new version, it appears that what CCRL previously called "Zappa Mexico II" is now called "Zappa Mexico".

If this is correct, may I ask why it wasn't changed to "Zappa Mexico I upd" or similar?

-Norm