CEGT - rating lists January 6th 2008

Werner
Posts: 2594
Joined: Wed Mar 08, 2006 9:09 pm

Post by Werner » Sun Jan 06, 2008 5:39 pm

Hi all :-),

our first updated rating lists in 2008 are online and can be found under the attached links.

40 / 120:
Our 40/120 quad-list now contains the new Glaurung 2.0.1 x64 engine. With 272 games the engine has a rating of 2858, 14 points behind DeepJunior 10.1. More matches in progress can be found in our forum: http://husvankempen.de/nunn/phpBB2/view ... cb94fc0fec.

40 / 20:
This week we added over 2000 games to our list. New in the list are Frenzee Dez07, Queen 4.03 and a first CM 11 setting. See more in our list "Games of the week". In total our 40/20 list is now based on 213,264 games!

New engines:
We tested TheKing 350_64 2CPU with the Sauron setting made by Cock de Gorter. It has had promising results in his 60+15 tests. The engine is able to beat Fritz 11 or Shredder 11, but unfortunately it loses more games than it wins. So the overall rating in our list is only 6 points better than default:

 98  The King 3.5 Sauron x64 2CPU  2775  +25 -25  485 games
102  The King 3.50 x64 2CPU        2769  +22 -22  561 games

Cock will continue testing as he hopes to find a setting with much more improvement. We will see if this is again possible with the new King engine.
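
To put the 6-point difference in perspective, here is a minimal Python sketch that compares the two list entries above. It assumes the +/- columns are error margins around each rating (the confidence level behind them is not stated here), and the helper names are hypothetical, not part of any CEGT tool. The expected-score function is the standard Elo formula.

```python
# Minimal sketch: compare two rating-list entries using their error margins.
# Ratings and margins are copied from the two list entries above.
# Assumption: the +/- columns are error margins around the rating.
# All function names are hypothetical helpers for illustration only.

def rating_interval(rating, plus, minus):
    """Return the (low, high) range implied by a rating and its +/- margins."""
    return rating - minus, rating + plus

def intervals_overlap(a, b):
    """True if two (low, high) ranges share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

def expected_score(elo_diff):
    """Standard Elo expected score for the side rated elo_diff points higher."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

sauron = rating_interval(2775, 25, 25)
default = rating_interval(2769, 22, 22)

print("Sauron range: ", sauron)
print("Default range:", default)
print("Ranges overlap:", intervals_overlap(sauron, default))
print("Expected score for +6 Elo: %.3f" % expected_score(2775 - 2769))
```

Under these assumptions the two ranges overlap almost completely, and a 6-point edge corresponds to an expected score of only about 51%, so the list cannot yet separate the Sauron setting from the default.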

Sune Fischer has pleased us with 2 new versions of Frenzee. We already tested the Dez07 single version. After 300 games the engine is 66 Elo better than the 3.0 version and now very close to 2700 Elo.

Leen Amerald, after a break of a few years, keeps releasing new versions of his engine Queen. The first new version, 4.01, had a good start and was clearly better than version 3.09. Version 4.02 was then not as strong. We hope that 4.03, which we are testing now, will again reach the level of his first new version, and we hope for no new releases in the near future so that we can play enough games with it.

Updated engines:

Glaurung 2.0.1 x64 4CPU lost some points, but being 40 points ahead of the 2CPU engine is very good. Bright 0.2c 4CPU also loses some points, but we do not have enough games with it yet.

40 / 4:
Our blitz-list was updated too, with around 7,500 new games. We now have a lot of new engines in our list, e.g.:

Engine                  ELO/games       remarks
Rybka 2.3.2a x64 4CPU   3103 / 200      +49 on 2CPU / +80 on single
Glaurung 2.01 x64 2CPU  2843 / 910      +64 on Gl. e/4 x64 2CPU
Bright 0.2c 2CPU        2768 / 700      +52 on single
Hamsters 0.6 w32        2621 / 400      same as 40/20 >2600
TwistedLogic 20071202x  2490 / 400      
Matacz 1.3              2451 / 324      +67 on 1.02
Queen 4.02a             2419 / 700      both Queen versions surprisingly
Queen 4.01              2418 / 300      same strength, unlike in 40/20
Alfil 7.6               2364 / 400      -33 on 6.4
Counter 0.7             2342 / 400      
Greko 5.3               2291 / 140      versions 5.4 + 5.5 waiting...

For the next weeks we plan to play more games with 4CPU engines and bring the top engines to around 1000 games.

A big „Thank you“ to all testers as usual! :)

40/20: http://www.husvankempen.de/nunn/rating.htm
Blitz: http://www.husvankempen.de/nunn/blitz.htm
40/120: http://www.husvankempen.de/nunn/rating120.htm
Tester: http://www.husvankempen.de/nunn/testers/testers.htm
Games of the week: http://www.husvankempen.de/nunn/40_40%2 ... on/gow.JPG
Elo-comparison: http://www.husvankempen.de/nunn/Replay/ ... arison.htm

Werner
CEGT Team

Alessandro Scotti

Re: CEGT - rating lists January 6th 2008

Post by Alessandro Scotti » Sun Jan 06, 2008 9:30 pm

Hi Werner,
great performance of Hamsters in blitz too, above the 2600 barrier even there! 8-)

ozziejoe
Posts: 811
Joined: Wed Mar 08, 2006 9:07 pm

Re: CEGT - rating lists January 6th 2008

Post by ozziejoe » Mon Jan 07, 2008 4:40 am

I hope you folks would test CM 11 with selectivity set to 21 and no other changes. As reported in another thread, I've found improvements on the order of 50 or 60 points. I have not had any such luck with other parameters, and am now not so convinced that changing the king safety parameter is of much use.

best
J

Graham Banks
Posts: 35064
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: CEGT - rating lists January 6th 2008

Post by Graham Banks » Mon Jan 07, 2008 5:29 am

ozziejoe wrote: I hope you folks would test CM 11 with selectivity set to 21 and no other changes. As reported in another thread, I've found improvements on the order of 50 or 60 points. I have not had any such luck with other parameters, and am now not so convinced that changing the king safety parameter is of much use.

best
J
Ray is carrying out extensive CM11th testing at blitz including the default settings with selectivity 21.
Each setting will get 600 games against a variety of non-CM engines.
We'll let you know how it goes.

Doing extensive testing of various settings at 40/20 takes a lot of cpu time that might better be spent on other engines.
It would be fantastic if CEGT were to do it, but don't be surprised if they don't.

Regards, Graham.
gbanksnz at gmail.com

ozziejoe
Posts: 811
Joined: Wed Mar 08, 2006 9:07 pm

Re: CEGT - rating lists January 6th 2008

Post by ozziejoe » Mon Jan 07, 2008 10:36 am

I'm glad to hear Ray is carrying out these tests. I have not found a selectivity level that is better than 21. I have gone as high as 40 and The King actually does fine at this level. King sel=40 plays differently than sel=21 but it looks like strength is about the same.

The big jump in rating due to sel=21 (hopefully Ray replicates it) suggests that CM is far from optimized. So let's hope we can get a version that is at least at Toga/Fruit level.

Werner
Posts: 2594
Joined: Wed Mar 08, 2006 9:09 pm

Re: CEGT - rating lists January 6th 2008

Post by Werner » Mon Jan 07, 2008 3:44 pm

...and the author of Queen is of course:

Leen Ammeraal

I am very sorry for that mistake!

Werner

jdart
Posts: 4101
Joined: Fri Mar 10, 2006 4:23 am
Location: http://www.arasanchess.org

Re: CEGT - rating lists January 6th 2008

Post by jdart » Mon Jan 07, 2008 6:53 pm

The 40/20 list looks more or less right for Arasan - with 10.0 the highest, and the 9.x versions behind it (but quite close to each other). But I notice Arasan 8.2 is the highest-rated Arasan version in Blitz, which doesn't make much sense to me - not enough games for an accurate ranking I guess.

--Jon

Wolfgang
Posts: 370
Joined: Fri May 12, 2006 11:08 pm

Re: CEGT - rating lists January 6th 2008

Post by Wolfgang » Tue Jan 08, 2008 11:43 am

Graham Banks wrote:.....
Doing extensive testing of various settings at 40/20 takes a lot of cpu time that might better be spent on other engines.
I totally agree with that, Graham. You need at least 500 games to have a halfway reliable rating. With 40/20 this takes more than 3 weeks, considering that you can play 20-30 games per day with this time control and that the machines run 24/7. OK, with faster machines than an A64-4200+ you can produce more games, but even two weeks for one setting is a waste of time in my eyes.
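
The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration using the figures quoted above (500 games, 20-30 games per day on one machine running 24/7); the function name is a hypothetical helper.

```python
# Minimal sketch: how long one machine needs to play a given number of games.
# Figures come from the post above: 500 games, 20-30 games per day at 40/20.
# days_needed is a hypothetical helper name, not part of any testing tool.

def days_needed(games, games_per_day):
    """Days of round-the-clock play needed to finish `games` games."""
    return games / games_per_day

for rate in (20, 30):
    days = days_needed(500, rate)
    print("%d games/day: %.1f days (%.1f weeks)" % (rate, days, days / 7))
```

At 20 games per day this gives 25 days, which matches the "more than 3 weeks" figure; at 30 games per day it is still about 17 days for a single setting.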

It would be fantastic if CEGT were to do it, but don't be surprised if they don't.
I do not see many CM-fans in our team, so I think "they don't"... :D
Best
Wolfgang
CEGT-Team

Wolfgang
Posts: 370
Joined: Fri May 12, 2006 11:08 pm

Re: CEGT - rating lists January 6th 2008

Post by Wolfgang » Tue Jan 08, 2008 11:57 am

jdart wrote:The 40/20 list looks more or less right for Arasan - with 10.0 the highest, and the 9.x versions behind it (but quite close to each other). But I notice Arasan 8.2 is the highest-rated Arasan version in Blitz, which doesn't make much sense to me - not enough games for an accurate ranking I guess.

--Jon
I agree, Jon, not enough games so far. But at the moment we are doing a lot of "rating-list cosmetics" in our blitz-list, with 500 games for each engine in mind (1000 games for the Top100). Unfortunately this takes a lot of time....
Best
Wolfgang
CEGT-Team
