CCRL update (28th July 2007)

Graham Banks
Posts: 35064
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Post by Graham Banks » Fri Jul 27, 2007 9:42 pm

The July 28th update of the CCRL Rating Lists and Statistics is now available for viewing at:
http://www.computerchess.org.uk/ccrl/4040/

The links to the various rating lists can be found just beneath the default Best Versions list.
For example there is a 32-bit Single CPU list.

Our standard testing is at 40 moves in 40 minutes repeating, while our current blitz testing is at both 40 moves in 4 minutes repeating and 40 moves in 12 minutes repeating. All time controls are adjusted to the AMD64 X2 4600+ (2.4GHz).

Currently active testers in our team are:
Graham Banks, Ray Banks, Shaun Brewer, Kirill Kryukov, Dom Leste, Tom Logan, Andreas Schwartmann, Charles Smith, George Speight, Chris Taylor, Chuck Wilson, Gabor Szots and Martin Thoresen.

A big thanks to all testers as usual for their efforts this week.


40/40 Notes

There are currently 67,506 games in our 40/40 database.

Many engines on our list have few games and in many cases their ratings are likely to fluctuate (markedly for some) until a lot more games are played. Therefore no conclusions should be drawn about their strength yet.
To illustrate this point, when an engine has 200 games played, the error margin is still approximately +/-40 ELO, after 500 games +/-25 ELO, after 1000 games +/-17 ELO and even after 2000 games there is a +/-13 ELO error margin!
This of course highlights the importance of looking at other rating lists that are also available in order to draw comparisons and get a more accurate overall picture.
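
For anyone who wants to see roughly where margins of this size come from, here is a minimal sketch. It assumes an engine scoring about 50%, one third of games drawn, and a margin of two standard deviations; none of these assumptions is confirmed as the method behind the figures above, and the function name is made up for illustration.

```python
import math

def elo_error_margin(games, draw_rate=1/3, z=2.0):
    """Approximate +/- Elo margin for an engine scoring about 50%.

    Assumptions (not taken from the CCRL lists): wins and losses are
    equally likely, `draw_rate` of the games are drawn, and the margin
    is `z` standard deviations wide.
    """
    # Per-game score variance around a mean score of 0.5.
    variance = (1.0 - draw_rate) / 4.0
    # Standard error of the average score over all games.
    score_se = math.sqrt(variance / games)
    # Slope of the Elo curve at a 50% score: 400 / (ln(10) * 0.25).
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return z * score_se * elo_per_score

for n in (200, 500, 1000, 2000):
    print(n, "games: +/-", round(elo_error_margin(n), 1), "Elo")
```

Under these assumptions the results land within about one point of the margins quoted above, and the margin shrinks with the square root of the number of games, so halving it takes four times as many games.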


Multi CPU Engines

Rybka 2.3.2 64-bit 4CPU continues to hold a tiny lead over Rybka 2.2 64-bit 4CPU.
Interestingly, the improvement is greater on 2CPU.

Zap!Chess Zanzibar 64-bit 4CPU is clearly number 2 ahead of Hiarcs 11.1 4CPU.
Naum 2.1 64-bit 4CPU and Loop M1-T 64-bit 4CPU are the next two in ranking order.

We will start testing Naum 2.2 this week and will be looking forward to seeing if it can challenge Zanzibar's number two spot.
It should also be noted that Hiarcs 11.2 has just been released, so there could well be a strength improvement there also.

Deep Shredder 10 64-bit 4CPU, Deep Fritz 10 4CPU and Deep Junior 10 4CPU are off the pace.


Single CPU Engines

Rybka 2.3.2 leads the ratings here as well, although by a slightly larger margin.

Toga II 1.3 Beta 1 has now taken over second spot ahead of Zap!Chess Zanzibar!
We have started testing Fruit 051103 and Fruit 2.3, so it will be interesting to see where they stand in comparison.

Hiarcs 11.1, Loop 13.6, Fritz 10 and Shredder 10 are the next four in the ranking order.
We will start testing Naum 2.2 this week and it could well be amongst this group.
Hiarcs 11.2 has also just been released.

The controversial Strelka 1.0b is slightly stronger than Spike 1.2 Turin.
Strelka 1.8 does not have enough games to make a statement regarding a strength improvement.

Spike 1.2 Turin, Junior 10, Naum 2.1, Deep Sjeng 2.5 and Fruit 2.2.1 are the next group of engines and are very even in strength.

Ktulu 8.0 and Chess Tiger 2007.1 are further adrift.


Amateur News

The final release of Toga II 1.3 will rival Rybka 1.0 as the strongest free engine.
Of course Fruit 051103 and Fruit 2.3 could well surpass both!

Strelka 1.0b is a little stronger than Spike 1.2 Turin.

Glaurung 2 epsilon/5 is the strongest version that Tord has released to date.

Although we've just started testing Alaric 707, it is expected to be stronger than the next group of engines - Scorpio 1.91, Delfi 5.1 and SlowChess Blitz WV2.1.

WildCat 7 and Pro Deo 1.2 are further back.

As we make our way down the list, it should be noted that the most recent versions of Ufim, DanaSah, Delphil, Hermann, Alfil, Popochin, Natwarlal and Feuerstein seem to have made good gains over previous versions.
The new NanoSzachy is also one to keep an eye on.

We test a very extensive range of amateur engines through our Amateur Championship divisions (32-bit 1CPU) plus other tournaments, all of which can be followed in our public forum.

Our aim is of course to ensure that all engines lower on our lists get at least 200 games.

It should be noted that while the latest version of Matacz will load in the Chessbase GUIs on Pentiums, it does not seem to do so on Athlons.
The latest version of Twisted Logic has not yet been tested due to reported problems with frequent losses on time.
Counter 0.2 loses on time far too frequently. We haven't tested the latest version yet to see if the problem has been addressed.


Blitz Notes

There are currently 153,787 games in our 40/4 database.

The 40/4 update is usually done separately to our 40/40 update. The most recent update can always be viewed here:
http://computerchess.org.uk/ccrl/404.live/


FRC Notes

Ray tests only those engines that can play FRC through the Shredder Classic GUI.
If engine authors have a new and stable version of their engine that will run under this GUI, they should contact Ray if they wish to see it tested.
Fruit 2.3 is currently being tested for inclusion.

For FRC the best list to look at is the pure list.
http://www.computerchess.org.uk/ccrl/404FRC/


Stats/Presentation Notes

The LOS stats on the right-hand side of each rating list are "likelihood of superiority" stats. They tell you the likelihood, in percentage terms, of each engine being superior to the engine directly below it.
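
To give a feel for what such a percentage means, here is a minimal sketch of a common head-to-head approximation based only on wins and losses between two engines. This is an illustrative assumption, not necessarily how the LOS figures on the rating lists are calculated, and the function name is hypothetical.

```python
import math

def likelihood_of_superiority(wins, losses):
    """Approximate chance (in %) that the engine with `wins` is the stronger one.

    Uses a normal approximation on the win/loss difference; draws carry
    no information about which engine is stronger, so they are ignored.
    This is an illustrative assumption, not the rating list's own method.
    """
    decisive = wins + losses
    if decisive == 0:
        return 50.0  # no decisive games, so no evidence either way
    x = (wins - losses) / math.sqrt(2.0 * decisive)
    return 50.0 * (1.0 + math.erf(x))

print(likelihood_of_superiority(60, 40))
print(likelihood_of_superiority(10, 10))
```

A value near 50% means the results cannot separate the two engines, while values approaching 100% indicate that the higher-placed engine is very likely the stronger one.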

A list of games played this week per engine can be found in the update thread in the CCRL public forum, accessible through the link given at the top of this post.

All games are available for download through the link given at the top of this post. They can be downloaded by engine or by month.
ELO ratings are now saved in all game databases for those engines that have 200 games or more.

Clicking on an engine name will give details as to opponents played plus homepage links where applicable.

Custom list selections now have the option of including or excluding betas, private engines, settings and others.

An openings report page (link at bottom of index page) lists the number of games played by ECO codes with draw percentage and White win percentage. Clicking on a column heading will sort the list by that column.
Games can now be downloaded by ECO code.

Norm Pollock
Posts: 1031
Joined: Thu Mar 09, 2006 3:15 pm
Location: Long Island, NY, USA

Re: CCRL update (28th July 2007)

Post by Norm Pollock » Sat Jul 28, 2007 11:17 pm

[Event "CCRL 40/40"]
[Site "CCRL"]
[Date "2007.07.11"]
[Round "?"]
[White "Glaurung 2 epsilon/5 64-bit"]
[Black "Glaurung 2 epsilon/5 64-bit"]
[Result "1/2-1/2"]
[ECO "A49"]
[Opening "King's Indian"]
[Variation "fianchetto without c4"]
[PlyCount "75"]

1. d4 Nf6 2. Nf3 g6 3. g3 Bg7 4. c4 O-O 5. Bg2 d6 6. Nc3 c6 7. O-O Nbd7 8. b3
e5 9. dxe5 Nxe5 10. Nxe5 dxe5 11. Ba3 Re8 12. Qxd8 Rxd8 13. Rad1 Bf5 14. Be7
Re8 15. Bc5 Bf8 16. Bxf8 Kxf8 17. Na4 Rad8 18. Nc5 Bc8 19. Rfe1 b6 20. Ne4 Nxe4
21. Bxe4 Bb7 22. f3 Ke7 23. Kf2 Ke6 24. c5 f5 25. Bb1 bxc5 26. Rc1 Rd5 27. e4
Rd2+ 28. Re2 Rxe2+ 29. Kxe2 c4 30. bxc4 c5 31. Rc3 fxe4 32. fxe4 Bc8 33. Rb3
Re7 34. Bd3 Kd6 35. Ke3 Be6 36. Be2 Rd7 37. h4 Rf7 38. Rd3+ 1/2-1/2


Graham Banks
Posts: 35064
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: CCRL update (28th July 2007)

Post by Graham Banks » Sat Jul 28, 2007 11:52 pm

Norm Pollock wrote:

[Event "CCRL 40/40"]
[Site "CCRL"]
[Date "2007.07.11"]
[Round "?"]
[White "Glaurung 2 epsilon/5 64-bit"]
[Black "Glaurung 2 epsilon/5 64-bit"]
[Result "1/2-1/2"]
[ECO "A49"]
[Opening "King's Indian"]
[Variation "fianchetto without c4"]
[PlyCount "75"]

1. d4 Nf6 2. Nf3 g6 3. g3 Bg7 4. c4 O-O 5. Bg2 d6 6. Nc3 c6 7. O-O Nbd7 8. b3
e5 9. dxe5 Nxe5 10. Nxe5 dxe5 11. Ba3 Re8 12. Qxd8 Rxd8 13. Rad1 Bf5 14. Be7
Re8 15. Bc5 Bf8 16. Bxf8 Kxf8 17. Na4 Rad8 18. Nc5 Bc8 19. Rfe1 b6 20. Ne4 Nxe4
21. Bxe4 Bb7 22. f3 Ke7 23. Kf2 Ke6 24. c5 f5 25. Bb1 bxc5 26. Rc1 Rd5 27. e4
Rd2+ 28. Re2 Rxe2+ 29. Kxe2 c4 30. bxc4 c5 31. Rc3 fxe4 32. fxe4 Bc8 33. Rb3
Re7 34. Bd3 Kd6 35. Ke3 Be6 36. Be2 Rd7 37. h4 Rf7 38. Rd3+ 1/2-1/2

Thanks Norm,

we'll look into it and get it corrected.
What has happened is that whoever ran the game obviously erred when editing the engine names in the pgn.
We don't run an engine against itself in this manner.
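
A slip like this can be caught automatically before an update goes out. The following is a minimal sketch that scans PGN text for games whose White and Black tags are identical; the function name and the simple tag regex are assumptions made for illustration, not part of any existing CCRL tooling.

```python
import re

# Matches a PGN tag pair such as: [White "Some Engine 1.0"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')

def find_self_play_games(pgn_text):
    """Return (game_number, engine_name) for games where White equals Black."""
    suspects = []
    headers = {}
    game_number = 0
    in_headers = False

    def check():
        # Record the game if both name tags exist and are identical.
        white = headers.get("White")
        if white and white == headers.get("Black"):
            suspects.append((game_number, white))

    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if match:
            if not in_headers:
                # First tag of a new game: start a fresh header block.
                headers = {}
                game_number += 1
                in_headers = True
            headers[match.group(1)] = match.group(2)
        elif in_headers:
            # Header block just ended (blank line or movetext follows).
            in_headers = False
            check()
    if in_headers:
        # The text ended directly after a header block.
        check()
    return suspects
```

Running it over a downloaded database, for example with `find_self_play_games(open("games.pgn").read())` where `games.pgn` is a placeholder file name, would list every game that needs its engine names checked by hand.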

Regards, Graham.

ernest
Posts: 1928
Joined: Wed Mar 08, 2006 7:30 pm

Re: CCRL update (28th July 2007)

Post by ernest » Sat Jul 28, 2007 11:56 pm

Graham Banks wrote: To illustrate this point, when an engine has 200 games played, the error margin is still approximately +/-40 ELO, after 500 games +-25 ELO, after 1000 games +-17 ELO and even after 2000 games there is a +-13 ELO error margin!
Hi Graham,
Am I right to assume this corresponds to games with 1/3 draws and error margin of 2 standard deviations (95% probability)?

Graham Banks
Posts: 35064
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: CCRL update (28th July 2007)

Post by Graham Banks » Sun Jul 29, 2007 12:09 am

ernest wrote:
Graham Banks wrote: To illustrate this point, when an engine has 200 games played, the error margin is still approximately +/-40 ELO, after 500 games +-25 ELO, after 1000 games +-17 ELO and even after 2000 games there is a +-13 ELO error margin!
Hi Graham,
Am I right to assume this corresponds to games with 1/3 draws and error margin of 2 standard deviations (95% probability)?
Hi Ernest,

I'll get somebody who is more of an expert with the stats to answer this.

Regards, Graham.
