CCRL 40/4 lists updated (11th August 2012)

lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CCRL 40/4 lists updated (11th August 2012).

Post by lkaufman »

Modern Times wrote:Hi Sedat,

Yes it is a very complex issue that is certain.

The results are what they are and it is easy to get too hung up with over-analysing them. Only with tens of thousands of games can you have any real confidence to draw firm conclusions. With reference to the AMD games, they were run by me in a meticulous manner, with a wide variety of different books and opening sets, and all GUI adjudication turned off. So I stand by those games 100%. It is kind of a sense of deja-vu for me though, because I had similarly disappointing results for Komodo 4 on the same machine.
Do I understand from this comment that Komodo 4 also did noticeably better on Intel machines than on AMD machines, just like Komodo 5?
One other question: For the 40/4 games, both CCRL and CEGT testers run the games at 40/3, even though that is a longer time control than the adjustment rules would stipulate. How about for the longer games, do all the testers play at the adjusted time limits, or are these merely considered minimums, with the actual tests run at longer levels as is done in blitz?
Modern Times
Posts: 3748
Joined: Thu Jun 07, 2012 11:02 pm

Re: CCRL 40/4 lists updated (11th August 2012).

Post by Modern Times »

lkaufman wrote: Do I understand from this comment that Komodo 4 also did noticeably better on Intel machines than on AMD machines, just like Komodo 5?
I never did the numbers, but it certainly felt that way.
lkaufman wrote: How about for the longer games, do all the testers play at the adjusted time limits,
Yes.
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: CCRL 40/4 lists updated (11th August 2012).

Post by Sedat Canbaz »

Modern Times wrote:Hi Sedat,

Yes it is a very complex issue that is certain.

The results are what they are and it is easy to get too hung up with over-analysing them. Only with tens of thousands of games can you have any real confidence to draw firm conclusions. With reference to the AMD games, they were run by me in a meticulous manner, with a wide variety of different books and opening sets, and all GUI adjudication turned off. So I stand by those games 100%. It is kind of a sense of deja-vu for me though, because I had similarly disappointing results for Komodo 4 on the same machine.

Hello Ray,

Yes, this issue is something like deja-vu for me too :)

Actually, I have played thousands of games for hardware Elo measurements, but not tens of thousands of games per player.

And I guess you mean 'tens of thousands of games' per player, right?

If yes, it sounds good, but I am afraid that we will not find such a hero.

Forget tens of thousands of games per player; there is not even any rating list based on such a number of games per player.

Plus, when running tens of thousands of games per player with a small, decent, neutral book, there is a BIG possibility of many similar/duplicate games appearing.

Another very important factor is the openings issue: for fewer duplicate/similar games we need a lot of varied openings.
For example, to produce tens of thousands of games per player, we would have to work with hundreds or maybe thousands of openings...
Then it will be no surprise if we see many hundreds of games per player lost due to those critical/disadvantageous openings.

Of course, there is no doubt that, depending on hardware speed, a given engine performs better or worse.
However, the openings are another very important factor that produces different standings...

Btw, please check the table below:
http://www.sedatcanbaz.com/chess/tourna ... ournament/
Note also that the current Elo difference is approx. 160 Elo (between first and last place)

Rank  Participant          Elo    +    – games score oppo. draws
   1 Hitman H15a          3384   45   44   152   63%  3299   42%
  77 Jakal H15a           3225   44   45   152   38%  3301   43% 

In other words,
Even tens of thousands of games per player will not give us the accurate data needed to draw firm conclusions with confidence.
And in my opinion, the hardware Elo measurements should be done with well-optimized openings (up to 10-12 moves).
Plus, the data (between 1,000 and 2,000 games per player) will be quite a good indicator of which processor or engine is stronger in Elo for chess.
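
As a rough check on those game counts, here is a minimal sketch that scales the error margin from the table above (about +/-45 Elo at 152 games) to other sample sizes. It assumes the margin shrinks with the square root of the number of games and that draw rate and opposition stay similar; the function name is only illustrative.

```python
import math

def scaled_margin(margin_ref, games_ref, games):
    """Scale a known error margin to another game count, assuming margin ~ 1/sqrt(N)."""
    return margin_ref * math.sqrt(games_ref / games)

# Reference point taken from the table: roughly +/-45 Elo after 152 games.
for n in (1000, 2000, 20000):
    print(n, "games -> about +/-", round(scaled_margin(45, 152, n), 1), "Elo")
```

Under these assumptions, 1,000 to 2,000 games per player give a margin of very roughly 12 to 18 Elo, and about 20,000 games bring it down to around 4 Elo.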


Best,
Sedat
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: CCRL 40/4 lists updated (11th August 2012)

Post by geots »

Modern Times wrote:I agree George, as I said before if SSE4 gives even 5 Elo I would be surprised


5 elo is being kind. I doubt it is more than 2, possibly in an extreme case 3elo- but that would definitely have to be "extreme".


Best,

gts
Uri Blass
Posts: 10892
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: CCRL 40/4 lists updated (11th August 2012)

Post by Uri Blass »

geots wrote:
Modern Times wrote:I agree George, as I said before if SSE4 gives even 5 Elo I would be surprised


5 elo is being kind. I doubt it is more than 2, possibly in an extreme case 3elo- but that would definitely have to be "extreme".


Best,

gts
I am sure that you do not play enough games to measure ratings with an accuracy of 5 Elo, so
I clearly do not understand your confidence about the Elo difference between SSE4 and non-SSE4.

Basically, you can only measure the speed difference and get an estimate based on it.
A 2 or 3 Elo difference is equivalent to a 2% speed improvement, so in other words, if you are right, Komodo does not gain on average more than 2% speed from SSE4 relative to other programs.
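
To put a number on "enough games", here is a minimal sketch of the usual normal approximation for the error margin of a rating. The assumptions are mine, not measured: scores near 50%, a draw rate of about 42% (borrowed from the table earlier in the thread), and a 95% interval. The function names are hypothetical.

```python
import math

def elo_margin(games, draw_rate, z=1.96):
    """Approximate Elo error margin for an engine scoring close to 50%."""
    # Per-game score variance when wins and losses are equally likely.
    variance = (1.0 - draw_rate) / 4.0
    std_error = math.sqrt(variance / games)
    # Slope of the logistic Elo curve at a 50% score: 1600 / ln(10).
    elo_per_score = 1600.0 / math.log(10)
    return z * elo_per_score * std_error

def games_needed(target_margin, draw_rate, z=1.96):
    """Games required to reach a given Elo margin under the same assumptions."""
    elo_per_score = 1600.0 / math.log(10)
    return math.ceil((1.0 - draw_rate) / 4.0 * (z * elo_per_score / target_margin) ** 2)

print(round(elo_margin(152, 0.42), 1))   # margin after 152 games
print(games_needed(5, 0.42))             # games for a +/-5 Elo margin
print(games_needed(2, 0.42))             # games for a +/-2 Elo margin
```

With these inputs a +/-5 Elo margin needs on the order of ten thousand games, and resolving 2 Elo needs several times more, which is why an estimate from the measured speed difference is the practical route.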
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CCRL 40/4 lists updated (11th August 2012)

Post by lkaufman »

geots wrote:
Modern Times wrote:I agree George, as I said before if SSE4 gives even 5 Elo I would be surprised


5 elo is being kind. I doubt it is more than 2, possibly in an extreme case 3elo- but that would definitely have to be "extreme".


Best,

gts
I measured the speedup from sse4 on Komodo 5 as 10%; it was even higher on Komodo 4. My estimate of 1.3 elo for each percent is accurate with a margin of error of about 0.1. So this means that the benefit from sse4 for Komodo 5 against engines that don't use sse4 is 13 elo, plus or minus one elo.
But I assume you are talking about the net gain against other top engines that do use sse4. I think the typical speedup is 5%, and the typical elo per percent is around 1, so they should gain about 5 elo. So the net benefit for Komodo 5 should be very close to 8 elo against those engines, higher for Komodo 4. You would have to play something like 20,000 games to prove this wrong, but it would be a waste of time.
The only way this could be wrong is if some other top engine gets more than 5% from sse4. Has anyone measured this in various engines?
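
The arithmetic above can be written out as a short sketch. It assumes a simple linear model (Elo gain = speedup percent times Elo per percent) using the figures quoted in this post; the 5% and 1 Elo per percent values for other engines are estimates, not measurements, and the function name is only illustrative.

```python
def elo_from_speedup(speedup_percent, elo_per_percent):
    """Linear estimate of the Elo gained from a given speedup."""
    return speedup_percent * elo_per_percent

# Komodo 5: measured 10% speedup, about 1.3 Elo per percent.
komodo_gain = elo_from_speedup(10, 1.3)
# Typical top engine (estimated): 5% speedup, about 1 Elo per percent.
rival_gain = elo_from_speedup(5, 1.0)
# Net benefit when both sides use SSE4.
net_gain = komodo_gain - rival_gain

print(round(komodo_gain, 1), round(rival_gain, 1), round(net_gain, 1))
```

If another engine turned out to gain more than the assumed 5%, only the second input changes and the net figure shrinks accordingly.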
geots
Posts: 4790
Joined: Sat Mar 11, 2006 12:42 am

Re: CCRL 40/4 lists updated (11th August 2012)

Post by geots »

Uri Blass wrote:
geots wrote:
Modern Times wrote:I agree George, as I said before if SSE4 gives even 5 Elo I would be surprised


5 elo is being kind. I doubt it is more than 2, possibly in an extreme case 3elo- but that would definitely have to be "extreme".


Best,

gts
I am sure that you do not play enough games to measure ratings with an accuracy of 5 Elo, so
I clearly do not understand your confidence about the Elo difference between SSE4 and non-SSE4.

Basically, you can only measure the speed difference and get an estimate based on it.
A 2 or 3 Elo difference is equivalent to a 2% speed improvement, so in other words, if you are right, Komodo does not gain on average more than 2% speed from SSE4 relative to other programs.


Bullshit Uri- the speed increase doesn't auto show up as the same % of strength increase. There are plenty of engines running at extremely fast speeds that couldn't find their ass with a mirror on a stick.


gts