CEGT - rating lists August 12th 2012

lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CEGT - rating lists August 12th 2012

Post by lkaufman »

ThatsIt wrote:Hi Larry !
lkaufman wrote: [..snip...]
Can you estimate what percentage of the games use SSE4?
[...snip...]
All the Komodo 5 x64 games for the CEGT 40/4 were played by using SSE4.

Best wishes,
G.S.
(CEGT member)
Maybe I misunderstood, but I thought a previous post mentioned using the Q6600, which I believe is too old to have SSE4. What is my mistake?
ThatsIt
Posts: 992
Joined: Thu Mar 09, 2006 2:11 pm

Re: CEGT - rating lists August 12th 2012

Post by ThatsIt »

lkaufman wrote:
ThatsIt wrote:Hi Larry !
lkaufman wrote: [..snip...]
Can you estimate what percentage of the games use SSE4?
[...snip...]
All the Komodo 5 x64 games for the CEGT 40/4 were played by using SSE4.

Best wishes,
G.S.
(CEGT member)
Maybe I misunderstood, but I thought a previous post mentioned using the Q6600, which I believe is too old to have SSE4. What is my mistake?
If an engine is ready for SSE4, we try to use only
SSE4 hardware for the tests. Wolfgang used his AMD X-4's
and I used the Intel i5-2400 for the Komodo 5 x64 tests.
Best wishes,
G.S.
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CEGT - rating lists August 12th 2012

Post by lkaufman »

ThatsIt wrote:
lkaufman wrote:
ThatsIt wrote:Hi Larry !
lkaufman wrote: [..snip...]
Can you estimate what percentage of the games use SSE4?
[...snip...]
All the Komodo 5 x64 games for the CEGT 40/4 were played by using SSE4.

Best wishes,
G.S.
(CEGT member)
Maybe I misunderstood, but I thought a previous post mentioned using the Q6600, which I believe is too old to have SSE4. What is my mistake?
If an engine is ready for SSE4, we try to use only
SSE4 hardware for the tests. Wolfgang used his AMD X-4's
and I used the Intel i5-2400 for the Komodo 5 x64 tests.
Best wishes,
G.S.
I see, thanks. I want to mention that Komodo is not the only engine for which our data disagree with both CCRL and CEGT. We get lower ratings for Stockfish and higher ratings for Critter (relative to Houdini) than CCRL and CEGT do, and we have 10,000-game samples for most engines at three different levels. Most likely this is because we test with increment time controls rather than repeating time controls, but I don't think that is the whole story, as we observe the same discrepancy with IPON, which also uses increment testing. I guess it will take some time to learn the cause of these discrepancies, which are a bit too large to blame on sample error.
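As a rough check on the sample-error remark, here is a minimal sketch (not from the thread) of the 95% error margin on an Elo difference measured from a match of a given length. It assumes the usual logistic Elo model, independent games, and a hypothetical 40% draw ratio; the function names and the draw ratio are illustrative assumptions, not figures from any of the testing groups.

```python
import math


def elo_from_score(score):
    """Elo difference implied by a score fraction in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(games, score=0.5, draw_ratio=0.4, z=1.96):
    """Approximate half-width of the confidence interval, in Elo.

    games      -- number of games in the sample
    score      -- observed score fraction (0.5 = evenly matched)
    draw_ratio -- assumed fraction of drawn games (hypothetical default)
    z          -- normal quantile (1.96 for roughly 95%)
    """
    wins = score - draw_ratio / 2.0
    losses = 1.0 - wins - draw_ratio
    if wins < 0.0 or losses < 0.0:
        raise ValueError("score and draw_ratio are inconsistent")
    # Per-game variance of a result that is 1 (win), 0.5 (draw) or 0 (loss).
    variance = wins + draw_ratio / 4.0 - score ** 2
    std_err = math.sqrt(variance / games)
    low = elo_from_score(score - z * std_err)
    high = elo_from_score(score + z * std_err)
    return (high - low) / 2.0


if __name__ == "__main__":
    for n in (1000, 10000, 30000):
        print(n, "games: +/-", round(elo_margin(n), 1), "Elo")
```

Under these assumptions, a 10,000-game sample between evenly matched engines carries a margin of roughly 5 Elo either way. A gap between two rating lists combines the error of both lists, so the comparison is somewhat looser than either figure alone.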
Dan Honeycutt
Posts: 5258
Joined: Mon Feb 27, 2006 4:31 pm
Location: Atlanta, Georgia

Re: CEGT - rating lists August 12th 2012

Post by Dan Honeycutt »

lkaufman wrote:I guess it will take some time to learn what is the cause of these discrepancies, which are a bit too large to blame on sample error.
Do your book or starting positions differ?

Best
Dan H.
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CEGT - rating lists August 12th 2012

Post by lkaufman »

Dan Honeycutt wrote:
lkaufman wrote:I guess it will take some time to learn what is the cause of these discrepancies, which are a bit too large to blame on sample error.
Do your book or starting positions differ?

Best
Dan H.
Well, they would have to, since the different testing organizations, and even different testers within them, use different books. Our book is now of highly variable depth and includes positions frequently seen in human tournament play. I don't know whether that is a fair description of the most popular books in use by the testing groups. This could account for a discrepancy of a few Elo, I think.
ThatsIt
Posts: 992
Joined: Thu Mar 09, 2006 2:11 pm

Re: CEGT - rating lists August 12th 2012

Post by ThatsIt »

lkaufman wrote: Well, they would have to, since the different testing organizations, and even different testers within them, use different books. Our book is now of highly variable depth and includes positions frequently seen in human tournament play. I don't know whether that is a fair description of the most popular books in use by the testing groups. This could account for a discrepancy of a few Elo, I think.
Perhaps you are testing against too few different opponents?
Best wishes,
G.S.
lkaufman
Posts: 6258
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA
Full name: Larry Kaufman

Re: CEGT - rating lists August 12th 2012

Post by lkaufman »

ThatsIt wrote:
lkaufman wrote: Well, they would have to, since the different testing organizations, and even different testers within them, use different books. Our book is now of highly variable depth and includes positions frequently seen in human tournament play. I don't know whether that is a fair description of the most popular books in use by the testing groups. This could account for a discrepancy of a few Elo, I think.
Perhaps you are testing against too few different opponents?
Best wishes,
G.S.
That's true, we only use Houdini 1.5, Critter, and Stockfish on the distributed test, as we are limited to free engines for this and don't think it's worthwhile to test against engines that are more than 150 Elo below us. But would your results be much different if you limited opponents to Stockfish level and above? Of course, then your sample size would be too small.