CEGT - rating lists March 11th 2012

Werner
Posts: 2231
Joined: Wed Mar 08, 2006 9:09 pm

CEGT - rating lists March 11th 2012

Post by Werner » Sun Mar 11, 2012 11:32 am

Hi all, :D

Our current rating lists are online and can be found under the links below.

40 / 20:
New games: 1550 ; 56 different engines
Total: 577,472

NEW Engines

516 Gaviota 0.85.1 x64 1CPU: 2554 - 415 games (+37 to v. 0.84 - similar to our blitz-list)
632 Arasan 14.0 x64 1CPU: 2478 - 590 games (no measurable difference from versions 13.4 and 13.3 here; our blitz list shows +12 over 13.3)
656 Philou 3.71 x64 1CPU: 2464 - 50 games (start-rating; no problems now with 2 matches on the same pc)

UPDATES
76 Deep Junior 13 x64 4CPU: 2890 - 1342 games (+1)
38 Critter 1.4 w32 1CPU: 2967 - 600 games (-1)
553 Delfi 5.4 1CPU: 2518 - 845 games (+8)
287 Protector 1.4.0 w32 1CPU: 2707 - 750 games (+3)

40 / 4:
New games: 6050
All games now: 984,760

New Engines
458 Gaviota 0.85.1 x64 1CPU: 2562 - 1000 games (+41 to v. 0.84)
525 Gaviota 0.85.1 w32 1CPU: 2509 - 500 games (no prev. version here)
673 Arasan 14.0 x64 1CPU: 2438 - 1000 games (+12 to v. 13.3 - inside the error-bar)
205 Quazar 0.4 x64 1CPU: 2743 - 1200 games (here too +331 to version 0.3!)
999 Phalanx XXIII w32: 2205 - 1200 games (+9 to prev. version)
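
As a rough check of what "inside the error-bar" means for a result such as Arasan's +12 after 1000 games, the sketch below estimates a 95% margin from the number of games alone. It assumes the standard logistic Elo curve and a normal approximation; the function name `elo_margin` and the per-game score deviation of 0.4 are illustrative assumptions, not CEGT's actual method or tooling.

```python
import math

def elo_margin(games, score=0.5, sigma=0.4, z=1.96):
    """Approximate +/- Elo margin for a rating based on `games` games.

    Assumptions (not taken from the CEGT lists):
    - logistic Elo model: score = 1 / (1 + 10 ** (-diff / 400))
    - `sigma` is the standard deviation of a single game result
      (0.5 with no draws, lower when many games are drawn)
    - `z` = 1.96 gives roughly a 95% interval
    """
    if games <= 0 or not 0.0 < score < 1.0:
        raise ValueError("need games > 0 and 0 < score < 1")
    score_error = z * sigma / math.sqrt(games)
    # Slope of the Elo curve at `score`, in Elo per unit of score.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return score_error * slope

margin = elo_margin(1000)
print(f"1000 games: about +/- {margin:.0f} Elo")
print("+12 inside the margin:", 12 < margin)
```

Under these assumptions, 1000 games give a margin of roughly 17 Elo, so a +12 step between two versions cannot be told apart from noise. Quadrupling the number of games only halves the margin.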

Updates
3 Critter 1.4 x64 4CPU : 3063 - 2400 games (+1)
988 Rotor 0.3 : 2221 - 1010 games (+-0)

40/120
See here our new single-list (updated with 7350 games):
http://www.husvankempen.de/nunn//40120n ... liste.html
Now with Stockfish 2.2.2 x64 and Deep Junior 13 x64

40/20 pb=on
We have started a new list with permanent brain (ponder on), 1 CPU and no time adjustment, played on similar PCs. It currently has 1320 games and is updated continuously.

Rank  Engine                  Elo    +    -  Games
   1  Houdini 2.0c x64       2986   33   33    280
   2  Critter 1.4 x64        2955   29   29    320
   3  Stockfish 2.2.2 x64    2946   41   41    160
   4  Naum 4.2 x64           2817   29   29    320
   5  Deep Sjeng ct 2010     2814   30   30    280
   6  Deep Junior 13 x64     2807   29   29    320
   7  Deep Shredder 12 x64   2800   27   27    360
   8  Spike 1.4              2772   32   32    280
   9  Fritz 12               2746   81   81     40
  10  Zappa Mexico II x64    2716   33   33    280

http://www.husvankempen.de/nunn/rating4020PBON.htm
A big „Thank you“ to all testers as usual!!

Links

40/20: http://www.husvankempen.de/nunn/rating.htm
Blitz: http://www.husvankempen.de/nunn/blitz.htm
40/120: http://www.husvankempen.de/nunn/rating120.htm
Tester: http://www.husvankempen.de/nunn/testers/testers.htm
40/20 pb=on: http://www.husvankempen.de/nunn/rating4020PBON.htm
Games of the week: http://www.husvankempen.de/nunn/40_40%2 ... on/gow.jpg

Werner Schuele
CEGT-Team

IWB
Posts: 1538
Joined: Thu Mar 09, 2006 1:02 pm

Re: CEGT - rating lists March 11th 2012

Post by IWB » Sun Mar 11, 2012 4:56 pm

Hello Werner

That is interesting:
Werner wrote: 40/20 pb=on
We have started a new list with permanent brain and 1CPU and no time adjustment but similar pcs. We now have 1320 games. The list is updated permanently.

Rank  Engine                  Elo    +    -  Games
   1  Houdini 2.0c x64       2986   33   33    280
   2  Critter 1.4 x64        2955   29   29    320
   3  Stockfish 2.2.2 x64    2946   41   41    160
   4  Naum 4.2 x64           2817   29   29    320
   5  Deep Sjeng ct 2010     2814   30   30    280
   6  Deep Junior 13 x64     2807   29   29    320
   7  Deep Shredder 12 x64   2800   27   27    360
   8  Spike 1.4              2772   32   32    280
   9  Fritz 12               2746   81   81     40
  10  Zappa Mexico II x64    2716   33   33    280

http://www.husvankempen.de/nunn/rating4020PBON.htm
Different hardware (:-(), but besides that I see some similarities... More games are needed, but it is a start! Nice, who is doing this?
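
To make "more games are needed" concrete, here is a minimal sketch that checks which neighbouring engines in the quoted list still have overlapping rating intervals. The ratings and margins are copied from the list; reading the two equal columns after the rating as a +/- margin is an assumption, the helper names are made up for illustration, and overlapping margins are only a rough indication, not a formal significance test.

```python
# (engine, rating, margin) copied from the quoted 40/20 pb=on list.
# Treating the two equal columns after the rating as a +/- margin
# is an assumption about the list format.
ENGINES = [
    ("Houdini 2.0c x64", 2986, 33),
    ("Critter 1.4 x64", 2955, 29),
    ("Stockfish 2.2.2 x64", 2946, 41),
    ("Naum 4.2 x64", 2817, 29),
    ("Deep Sjeng ct 2010", 2814, 30),
    ("Deep Junior 13 x64", 2807, 29),
    ("Deep Shredder 12 x64", 2800, 27),
    ("Spike 1.4", 2772, 32),
    ("Fritz 12", 2746, 81),
    ("Zappa Mexico II x64", 2716, 33),
]

def intervals_overlap(rating_a, margin_a, rating_b, margin_b):
    """True if [rating - margin, rating + margin] of A and B intersect."""
    return abs(rating_a - rating_b) <= margin_a + margin_b

def unresolved_neighbours(engines):
    """Pairs of adjacent list entries whose intervals still overlap."""
    pairs = []
    for (name_a, r_a, m_a), (name_b, r_b, m_b) in zip(engines, engines[1:]):
        if intervals_overlap(r_a, m_a, r_b, m_b):
            pairs.append((name_a, name_b))
    return pairs

for upper, lower in unresolved_neighbours(ENGINES):
    print(f"{upper} vs {lower}: order not settled yet")
```

With these numbers, the only adjacent pair whose intervals do not overlap is Stockfish 2.2.2 and Naum 4.2; every other neighbouring pair could still swap places.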

Thanks for that
Ingo

Werner
Posts: 2231
Joined: Wed Mar 08, 2006 9:09 pm

Re: CEGT - rating lists March 11th 2012

Post by Werner » Sun Mar 11, 2012 4:59 pm

IWB wrote: Hello Werner
That is interesting:
Different hardware (:-(), but besides that I see some similarities... More games are needed, but it is a start! Nice, who is doing this?

Thanks for that
Ingo
Hi Ingo,
Gerhard and Wolfgang have started playing the games for this new list.

Werner

michiguel
Posts: 6246
Joined: Thu Mar 09, 2006 7:30 pm
Location: Chicago, Illinois, USA

Re: CEGT - rating lists March 11th 2012

Post by michiguel » Sun Mar 11, 2012 5:17 pm

Thanks a lot. Gaviota's improvement is within expectations, and the test of the 32-bit version and its comparison with the 64-bit one is particularly informative, as is the comparison of blitz with a longer time control.

Miguel

ThatsIt
Posts: 782
Joined: Thu Mar 09, 2006 1:11 pm

Re: CEGT - rating lists March 11th 2012

Post by ThatsIt » Mon Mar 12, 2012 8:22 am

IWB wrote: Different hardware (:-(), but besides that I see some similarities... More games are needed, but it is a start! Nice, who is doing this?
Thanks for that
Ingo
Hi Ingo !

The hardware is only slightly different, and the difference is insignificant for the measurements.

Best wishes,
G.S.

IWB
Posts: 1538
Joined: Thu Mar 09, 2006 1:02 pm

Re: CEGT - rating lists March 11th 2012

Post by IWB » Mon Mar 12, 2012 9:26 am

Hello Gerhard,
ThatsIt wrote:
IWB wrote: Different hardware (:-(), but besides that I see some similarities... More games are needed, but it is a start! Nice, who is doing this?
Thanks for that
Ingo
Hi Ingo !

The hardware is only slightly different, and the difference is insignificant for the measurements.

Best wishes,
G.S.
Intel i5-2400 @3.10GHz / 4GB RAM
Intel Q-6600 @2.60GHz / 4GB RAM
Intel Q-8200 @2.33GHz / 4GB RAM
AMD X-4 @3.00GHz / 6GB RAM

The i5 is the fastest; I am not sure whether the AMD or the 2.33GHz Q8200 is the slowest computer (I go for the Q8200). My guess is that there is a difference of about 35 to 40%. If that is relevant ... I don't know. I personally like it better that way (to stick within a certain range of hardware as long as all matches are on the same hardware) than to "adapt" different speeds with a questionable benchmark!

Then you state "sse4.2 if available". As you can't run SSE on all those machines, you are mixing that - unfortunately ...

Are you using books or starting positions for that?

Anyhow, thanks for the list
Ingo
Last edited by IWB on Mon Mar 12, 2012 9:31 am, edited 2 times in total.

geots
Posts: 4790
Joined: Fri Mar 10, 2006 11:42 pm

Re: CEGT - rating lists March 11th 2012

Post by geots » Mon Mar 12, 2012 9:27 am

ThatsIt wrote:
IWB wrote: Different hardware (:-(), but besides that I see some similarities... More games are needed, but it is a start! Nice, who is doing this?
Thanks for that
Ingo
Hi Ingo !

The hardware is only slightly different, and the difference is insignificant for the measurements.

Best wishes,
G.S.


I have been patiently waiting for your updated blitz list. Thanks, Gerhard.


george

ThatsIt
Posts: 782
Joined: Thu Mar 09, 2006 1:11 pm

Re: CEGT - rating lists March 11th 2012

Post by ThatsIt » Mon Mar 12, 2012 10:38 am

Hi Ingo !
IWB wrote: Then you state "sse4.2 if available". As you can't run SSE on all those machines, you are mixing that - unfortunately ...
Perhaps +- 5 points (?), so what?
IWB wrote: Are you using books or starting positions for that?
Both.
IWB wrote: Anyhow, thanks for the list.
No matter.

Best wishes,
G.S.

IWB
Posts: 1538
Joined: Thu Mar 09, 2006 1:02 pm

Re: CEGT - rating lists March 11th 2012

Post by IWB » Mon Mar 12, 2012 11:04 am

ThatsIt wrote:Hi Ingo !
IWB wrote: Then you state "sse4.2 if available". As you can't run SSE on all those machines, you are mixing that - unfortunately ...
Perhaps +- 5 points (?), so what?
In general you are right ... but I am sure that Komodo 2.03DC was an exception to that rule (it even produces different moves) ...
ThatsIt wrote:
IWB wrote: Are you using books or starting positions for that?
Both.
THAT is the biggest disadvantage, as a test is not repeatable then! Engine A vs B is played under different conditions than Engine A vs C! Whether the RATING is different (especially with enough games) is a completely different question.

Thx again
Ingo

Wolfgang
Posts: 313
Joined: Fri May 12, 2006 11:08 pm

Re: CEGT - rating lists March 11th 2012

Post by Wolfgang » Mon Mar 12, 2012 2:41 pm

Hi Ingo
IWB wrote: ...
Intel i5-2400 @3.10GHz / 4GB RAM
Intel Q-6600 @2.60GHz / 4GB RAM
Intel Q-8200 @2.33GHz / 4GB RAM
AMD X-4 @3.00GHz / 6GB RAM

IWB wrote: The i5 is the fastest, I am not sure if the AMD or the 2.3GHz 8200 are the slowest computers (I go for the Q8200).
Yes, the (my) Q8200 is the "slowest". Therefore it is not used so often for this particular list. I mainly play on the X4 as it supports SSE.
IWB wrote: My guess is that there is a difference of about 35 to 40%. If that is relevant ... I don't know.
Don't underestimate the X4. 35-40% seems a little too high to me; I guess it is 25-30%. But generally speaking you're right: equal hardware would be better, but it is not practicable if you have more than one tester...
IWB wrote: I personally like it better that way (to stick within a certain range of hardware as long as all matches are on the same hardware) than to "adapt" different speeds with a questionable benchmark!
Gerhard and I are not the biggest fans of "adapting", as you probably know...
IWB wrote: Then you state "sse4.2 if available". As you can't run SSE on all those machines, you are mixing that - unfortunately ...
I try to play all matches with SSE engines on the X4, as currently for Stockfish 2.2.2. On the other hand, I agree with Gerhard here that there is no measurable difference, maybe except for Komodo, which is not yet tested.
IWB wrote: Are you using books or starting positions for that?
As far as I know, Gerhard uses a testsuite (always the same?!) with 20 positions. I use the excellent SWCR3.5 database (~5100 positions) from Frank Quisinsky and extract testsuites (20 positions) with the fine tool "PGN-Selection 1.0" by Volker Annuss. If games are played under Arena, I use Arena's "PGN-Random" feature, where the GUI chooses the positions, which are then played with colours reversed.

For me chess (and computer chess too) does not consist of only 20, 50 or 75 openings, so I will NEVER EVER (!!) use the same testsuite for all my tests. But - for sure - this is a matter of taste! ;)
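
The random-suite approach described here can be sketched as follows: draw 20 start positions from a larger pool and play each one twice with colours reversed. This is only an illustration, not how PGN-Selection or Arena work internally; the function names, the placeholder position labels and the optional seed (which would let a suite be reproduced later) are assumptions.

```python
import random

def pick_suite(positions, size=20, seed=None):
    """Draw `size` distinct start positions from the pool.

    Passing a `seed` makes the draw reproducible; leaving it as None
    gives a fresh suite on every call.
    """
    rng = random.Random(seed)
    return rng.sample(positions, size)

def schedule(suite, engine_a, engine_b):
    """Each position is played twice, with colours reversed."""
    games = []
    for position in suite:
        games.append({"start": position, "white": engine_a, "black": engine_b})
        games.append({"start": position, "white": engine_b, "black": engine_a})
    return games

# Placeholder labels standing in for the ~5100 database positions.
pool = [f"position-{i}" for i in range(1, 5101)]
suite = pick_suite(pool, size=20, seed=2012)
games = schedule(suite, "Engine A", "Engine B")
print(len(games), "games from", len(suite), "positions")
```

Recording the seed (or the drawn suite itself) alongside the results would keep the variety of openings while still allowing a given match to be replayed from the same positions.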
IWB wrote: Anyhow, thanks for the list
Ingo
You're welcome ;)

Best
Wolfgang
