CCRL update (28th September 2007)

Graham Banks
Posts: 41435
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

The September 28th update of the CCRL Rating Lists and Statistics is now available for viewing at:
http://www.computerchess.org.uk/ccrl/4040/

The links to the various rating lists can be found just beneath the default Best Versions list.
For example there is a 32-bit Single CPU list.

Our standard testing is at 40 moves in 40 minutes repeating while our current blitz testing is at both 40 moves in 4 minutes repeating and 40 moves in 12 minutes repeating, all adjusted to the AMD64 X2 4600+ (2.4GHz).

Currently active testers in our team are:
Graham Banks, Ray Banks, Shaun Brewer, Kirill Kryukov, Dom Leste, Tom Logan, Andreas Schwartmann, Charles Smith, George Speight, Chris Taylor, Chuck Wilson, Gabor Szots and Martin Thoresen.

A big thanks to all testers as usual for their efforts this week.


40/40 Notes

There are currently 77,370 games in our 40/40 database.

Many engines on our list have few games and in many cases their ratings are likely to fluctuate (markedly for some) until a lot more games are played. Therefore no conclusions should be drawn about their strength yet.
To illustrate this point, when an engine has played 200 games, the error margin is still approximately ±40 ELO; after 500 games it is ±25 ELO, after 1000 games ±17 ELO, and even after 2000 games there is a ±13 ELO error margin!
This of course highlights the importance of looking at other rating lists that are also available in order to draw comparisons and get a more accurate overall picture.
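
As a rough illustration of why the margins shrink so slowly, here is a minimal Python sketch of the usual normal approximation. It is not the method used to compute the CCRL lists; the function name, the 36% draw ratio and the assumption of evenly matched opponents are all illustrative choices made here.

```python
import math

def elo_error_margin(games, draw_ratio=0.36, z=1.96):
    """Approximate 95% error margin (in ELO) after `games` games.

    Assumptions (illustrative, not CCRL's actual method):
    - opponents are evenly matched, so the expected score is 0.5
    - wins and losses are equally likely, draws occur with `draw_ratio`
    - the average score is approximately normally distributed
    """
    # Variance of a single game's score (1, 0.5 or 0) around the mean of 0.5.
    variance = (1.0 - draw_ratio) / 4.0
    # Standard error of the average score over all games.
    score_se = math.sqrt(variance / games)
    # Slope of the ELO curve at a 50% score: 400 / (ln(10) * 0.5 * 0.5).
    elo_per_score = 1600.0 / math.log(10)
    return z * score_se * elo_per_score

if __name__ == "__main__":
    for n in (200, 500, 1000, 2000):
        print(n, "games: about +-%.0f ELO" % elo_error_margin(n))
```

With these assumptions the margin comes out at roughly 39, 24, 17 and 12 ELO for 200, 500, 1000 and 2000 games, in the same ballpark as the figures quoted above. Because the margin falls with the square root of the number of games, halving it takes about four times as many games.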


Multi CPU Engines

Testing of Zappa Mexico 64-bit 4CPU is still in its infancy, but after 187 games it lies about 60 ELO behind Rybka 2.3.2a 64-bit 4CPU.

Naum 2.2 64-bit 4CPU is in third spot a further 50 ELO back and Hiarcs 11.1 4CPU in fourth is not too far behind.
We hope to give Hiarcs 11.2 64-bit 4CPU more games although all indications are that it will not surpass its predecessor, and with Hiarcs 12 supposedly not far away, it is not a priority at present.

Loop M1-T 64-bit 4CPU is next in the pecking order, ahead of the oldies - Deep Shredder 10 64-bit 4CPU, Deep Fritz 10 4CPU and Deep Junior 10 4CPU.
On our growing list of things to do, we would like to give more games to Deep Fritz 10 2CPU to see whether it performs better than Deep Fritz 10 4CPU, as has been reported elsewhere.

Glaurung 2 epsilon/5 64-bit 2CPU is the strongest free engine on this list.


Single CPU Engines

Rybka 2.3.2a leads the ratings here as well, although by a slightly larger margin.
The 64-bit version has a good edge over the 32-bit version.

As more games are played, we would expect Zappa Mexico to cement the second spot that it currently holds.

Toga II 1.3.1, Loop M1-T, Fruit 051103, Hiarcs 11.1 and Naum 2.2 all appear to be close in strength, and are ahead of Fritz 10, Shredder 10 and Strelka 1.8.
Hiarcs 11.2 now has 300+ games and remains lower in rating than Hiarcs 11.1.

With 400+ games now under its belt, Deep Sjeng 2.7 has disappointingly slipped back to around the level of Spike 1.2 Turin and Junior 10.
According to our testing, Junior 10.1 is weaker than Junior 10, and somebody noted that it no longer seemed to be available.

Ktulu 8.0, SmarThink 1.00, Glaurung 2 epsilon/5 and Chess Tiger 2007.1 are 40-50 ELO further back.


Free Single CPU Engines

Rybka 1.0 narrowly retains its crown as the top free engine ahead of Toga II 1.3.1.

Fruit 051103 seems to be the strongest Fruit, ahead of 2.3.1, but both need many more games.

Strelka 1.8 and Spike 1.2 Turin come in next, well ahead of Glaurung 2 epsilon/5 and Naum 2.0, which are in turn well ahead of Scorpio 1.91.

Alaric 707, Movei 0.08.438, SlowChess Blitz WV2.1, Delfi 5.1, Zappa 1.1, WildCat 7, Pro Deo 1.2 and List 512 are within a 30 ELO point range of each other.
With 600+ games now under its belt, Movei 0.08.438 is a massive 100+ ELO improvement over previous versions. Well done Uri!

The latest version of BugChess2 is also worth keeping an eye on as it too seems to have made noticeable gains. It requires many more games though before this can be confirmed.

We test a very extensive range of amateur engines through our Amateur Championship divisions (32-bit 1CPU) plus other tournaments, all of which can be followed in our public forum.

Our aim is of course to ensure that all engines lower on our lists get 200+ games.


Blitz Notes

There are currently 172,333 games in our 40/4 database.

The 40/4 update is usually done separately to our 40/40 update.
The latest ratings can be found here:
http://computerchess.org.uk/ccrl/404/

An updated list will be available during the next few days.


FRC Notes

There are currently 20,600 games in the FRC 40/4 database.

Ray tests only those engines that can play FRC through the Shredder Classic GUI.
If engine authors have a new and stable version of their engine that will run under this GUI, they should contact Ray if they wish to see it tested.

Although Rybka 2.3.2 FRC tops the list, it is a private engine, so Hiarcs 11.1 is still the best publicly available FRC engine.

Ray has recently tested Rybka 2.3.2, Hiarcs 11.2, Naum 2.2, Fruit 2.3, Fruit 051103, Hamsters 0.4, Hermann 2.0, Movei 0.08.438 and Deep Sjeng 2.7.
All are now included in the ratings.

For FRC the best list to look at is the pure list.
http://www.computerchess.org.uk/ccrl/404FRC/


Stats/Presentation Notes

The LOS stats on the right-hand side of each rating list are "likelihood of superiority" stats. They give the likelihood, as a percentage, that each engine is superior to the engine directly below it.
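
For readers wondering how such a figure can arise, the sketch below approximates it from two ratings and their error margins. It is only an illustration under stated assumptions (independent, normally distributed rating errors, with the margins read as 95% intervals); the software behind the lists may well compute LOS differently, and the function name and sample numbers are made up.

```python
import math

def likelihood_of_superiority(rating_a, rating_b, margin_a, margin_b, z=1.96):
    """Approximate probability that engine A is really stronger than engine B.

    Assumptions (illustrative only): the two rating errors are independent
    and normally distributed, and each margin is a +-95% interval.
    """
    # Convert the 95% margins into standard errors.
    se_a = margin_a / z
    se_b = margin_b / z
    # Standard error of the rating difference.
    se_diff = math.hypot(se_a, se_b)
    # Normal CDF of the standardised rating difference.
    score = (rating_a - rating_b) / se_diff
    return 0.5 * (1.0 + math.erf(score / math.sqrt(2.0)))

if __name__ == "__main__":
    # Hypothetical ratings and margins, not taken from any list.
    los = likelihood_of_superiority(2900, 2880, 25, 25)
    print("LOS: %.1f%%" % (100.0 * los))
```

Under this approximation two engines with identical ratings get an LOS of 50%, and the figure approaches 100% only when the rating gap is large compared with the combined error margins.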

A list of games played this week per engine can be found in the update thread in the CCRL public forum, accessible through the link given at the top of this post.

All games are available for download through the link given at the top of this post. They can be downloaded by engine or by month.
ELO ratings are now saved in all game databases for those engines that have 200 games or more.

Clicking on an engine name will give details as to opponents played plus homepage links where applicable.
Clicking on an engine name now also gives you a ratings history graph for that engine over time (a bit further down the page). The green line is the actual rating. The red lines are the upper and lower error bars, and the blue line represents the number of games.

Custom lists of engines can be selected for comparison.

An openings report page (link at bottom of index page) lists the number of games played by ECO codes with draw percentage and White win percentage. Clicking on a column heading will sort the list by that column.
Games can now be downloaded by ECO code.
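
For anyone who wants to build a similar per-opening summary from a downloaded game file, here is a minimal Python sketch. It is not the code behind the openings report page; it assumes a PGN file in which every game begins with an Event tag and carries ECO and Result tags, and the function name and output layout are hypothetical.

```python
import re
from collections import defaultdict

# Matches a PGN tag pair such as: [ECO "B90"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]$')

def eco_report(pgn_text):
    """Return (eco, games, draw %, White win %) rows, most-played code first."""
    games = []
    tags = {}
    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if not match:
            continue  # move text or blank line
        name, value = match.groups()
        # Assumption: an Event tag marks the start of a new game.
        if name == "Event" and tags:
            games.append(tags)
            tags = {}
        tags[name] = value
    if tags:
        games.append(tags)

    counts = defaultdict(lambda: [0, 0, 0])  # games, White wins, draws
    for game in games:
        result = game.get("Result")
        if result not in ("1-0", "0-1", "1/2-1/2"):
            continue  # skip unfinished or unknown results
        row = counts[game.get("ECO", "?")]
        row[0] += 1
        row[1] += result == "1-0"
        row[2] += result == "1/2-1/2"

    report = [
        (eco, n, 100.0 * draws / n, 100.0 * wins / n)
        for eco, (n, wins, draws) in counts.items()
    ]
    # Sort by number of games, like clicking that column heading.
    report.sort(key=lambda row: row[1], reverse=True)
    return report
```

Calling eco_report(open("games.pgn").read()) on a local file (the file name is just an example) returns one row per ECO code, which can then be re-sorted by any column.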
gbanksnz at gmail.com
Werner
Posts: 2871
Joined: Wed Mar 08, 2006 10:09 pm
Location: Germany
Full name: Werner Schüle

Re: CCRL update (28th September 2007)

Post by Werner »

Hi Graham,
nice report.
I think we will have the same ranking :)

Rybka 2.3.2a 4CPU
Rybka 2.3.2a 2CPU
Zappa Mexico 4CPU
Zappa Mexico 2CPU

and the Zap!Chess versions a little bit under the Zappa versions.

I had the same result after 12 games on my quad tests as Andreas now has on his online match:

1   Rybka 2.3.2a mp  x64 4CPU   11½111½½½½11   9.5/12
2   Zappa Mexico X64 4CPU      00½000½½½½00    2.5/12

BTW: do you test Hiarcs 11.2 64-bit 4CPU, with 64bit engine?
Werner
Graham Banks
Posts: 41435
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL update (28th September 2007)

Post by Graham Banks »

Werner wrote:Hi Graham,
nice report.
I think we will have the same ranking :)

Rybka 2.3.2a 4CPU
Rybka 2.3.2a 2CPU
Zappa Mexico 4CPU
Zappa Mexico 2CPU

and the Zap!Chess versions a little bit under the Zappa versions.

I had the same result after 12 games on my quad tests as Andreas now has on his online match:

1   Rybka 2.3.2a mp  x64 4CPU   11½111½½½½11   9.5/12
2   Zappa Mexico X64 4CPU      00½000½½½½00    2.5/12

BTW: do you test Hiarcs 11.2 64-bit 4CPU, with 64bit engine?
Thanks Werner. :)

We've only got 109 games for Hiarcs 11.2 64-bit 4CPU and it's unlikely to get priority any time soon due to the reasons mentioned.

It's good to see that most CEGT and CCRL ratings are consistent with each other.
We follow your results with interest.

George and I have both started Zappa Mexico 32-bit 1CPU gauntlets, so it will be interesting to see how it fares with 1CPU.

Regards, Graham.
gbanksnz at gmail.com
Dariusz Orzechowski

Re: CCRL update (28th September 2007)

Post by Dariusz Orzechowski »

Hi Graham,

Thanks for your continuous effort with this rating list!

I have two suggestions. The multi-CPU list is unfortunately starting to get cluttered and is a little hard to read. Maybe it would be a good idea to create separate lists for 2 and 4 CPUs? Besides, it would be interesting to have a separate table comparing ratings for 1, 2 and 4 (and in future maybe also 8) CPUs. Maybe something in the following format:

            Name            |       Rating       |
                            | 1CPU | 2CPU | 4CPU |
-------------------------------------------------|
Rybka 2.3.1a 64-bit         | 3068 | 3098 | 3114 |
Zap!Chess Zanzibar 64-bit   | 2904 | 3026 | 3050 |
Regards,
Darek
Graham Banks
Posts: 41435
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL update (28th September 2007)

Post by Graham Banks »

Dariusz Orzechowski wrote:Hi Graham,

Thanks for your continuous effort with this rating list!

I have two suggestions. The multi-CPU list is unfortunately starting to get cluttered and is a little hard to read. Maybe it would be a good idea to create separate lists for 2 and 4 CPUs? Besides, it would be interesting to have a separate table comparing ratings for 1, 2 and 4 (and in future maybe also 8) CPUs. Maybe something in the following format:

            Name            |       Rating       |
                            | 1CPU | 2CPU | 4CPU |
-------------------------------------------------|
Rybka 2.3.1a 64-bit         | 3068 | 3098 | 3114 |
Zap!Chess Zanzibar 64-bit   | 2904 | 3026 | 3050 |
Regards,
Darek
I've suggested to Kirill that we should have 4CPU and 2CPU lists available separately and it's on his "to do" list.
In fact, it's been implemented in the FRC ratings, but we need to carry it over into the 40/40 and 40/4 lists also.

Regards, Graham.
gbanksnz at gmail.com
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: CCRL update (28th September 2007)

Post by Norm Pollock »

Hi Graham,

Why isn't Strelka on the list of "killed" engines? It seems to me that she is not an asset to your rating lists.

-Norm
Denis P. Mendoza
Posts: 415
Joined: Fri Dec 15, 2006 9:46 pm
Location: Philippines

Re: CCRL update (28th September 2007)

Post by Denis P. Mendoza »

Better read the Winboard Forum of Volker. Scorpio beta 2 is out as Daniel found a new homepage. He just updated it on his site recently. This is another one to test. FYI.

http://dshawul.googlepages.com/home
http://dshawul.googlepages.com/scorpio_beta.ZIP
Graham Banks
Posts: 41435
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL update (28th September 2007)

Post by Graham Banks »

Norm Pollock wrote:Hi Graham,

Why isn't Strelka on the list of "killed" engines? It seems to me that she is not an asset to your rating lists.

-Norm
We've had this discussion before, Norm.

Regards, Graham.
gbanksnz at gmail.com
Graham Banks
Posts: 41435
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: CCRL update (28th September 2007)

Post by Graham Banks »

Dariusz Orzechowski wrote: Maybe it would be a good idea to create separate lists for 2 and 4 CPUs?
Now available if you view the latest "live" update.
http://www.computerchess.org.uk/ccrl/4040.live/
gbanksnz at gmail.com
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: CCRL update (28th September 2007)

Post by geots »

Norm Pollock wrote:Hi Graham,

Why isn't Strelka on the list of "killed" engines? It seems to me that she is not an asset to your rating lists.

-Norm
I'm not Graham, but I believe whether or not Strelka is an asset to our group is for our group to decide.