CEGT - rating lists August 12th 2012

lkaufman · Post by **lkaufman** » Tue Aug 14, 2012 4:47 am

carldaman wrote:Larry, I'd also be curious as to the GUI used for testing. As I noted in another thread, Komodo5 using the default drawscore of -7 suffers when tested on the Fritz GUI. I had to set the drawscore to 0 to ensure that Komodo did not make obviously weak moves.

Regards,
Carl

Do you happen to know whether the odd behavior applies to the Fritz 11 GUI as well as the Fritz 12 GUI? Also, is it repeatable: can you get the suspicious moves to be played every time from the given position? It seems to me that using the negative drawscore should make the engine play more actively to avoid a repetition, unless somehow the interface reversed the sign, which seems unlikely. So it's quite puzzling.
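
To make the sign argument concrete, here is a minimal sketch of how a draw score (contempt) is commonly applied in a search. It is a generic model, not Komodo's actual code; the function names and every centipawn figure other than -7 are assumptions for illustration.

```python
def draw_value(drawscore_cp: int, engine_to_move: bool) -> int:
    """Score assigned to a drawn position, from the side to move's point of view.

    In a negamax search every score is relative to the side to move, so the
    engine's draw score is negated at nodes where the opponent is to move.
    """
    return drawscore_cp if engine_to_move else -drawscore_cp


def prefers_repetition(best_alternative_cp: int, drawscore_cp: int) -> bool:
    """True if the engine, to move, would rather repeat than play its best other move."""
    return draw_value(drawscore_cp, engine_to_move=True) >= best_alternative_cp


# Hypothetical position: the best non-repeating move is evaluated at -3 centipawns.
print(prefers_repetition(-3, drawscore_cp=-7))  # False: keeps playing
print(prefers_repetition(-3, drawscore_cp=0))   # True: takes the draw

# Hypothetical reversed sign: slightly better position (+3), draw valued at +7.
print(prefers_repetition(3, drawscore_cp=7))    # True: repeats despite being better
```

Under this model a drawscore of -7 makes the engine play on in a slightly worse position, while a reversed sign would make it accept repetitions even when it stands slightly better.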

carldaman · Post by **carldaman** » Tue Aug 14, 2012 5:22 am

lkaufman wrote:Do you happen to know whether the odd behavior applies to the Fritz 11 GUI as well as the Fritz 12 GUI? Also, is it repeatable: can you get the suspicious moves to be played every time from the given position?

Repeatable? Most certainly. I'm positive about that, as I've run multiple tests to rule out any freak occurrence.

I can't comment on Fritz 11, as I currently only have Fritz 12 installed. It would be nice if someone could try to replicate this behavior using Fritz 11-12-13, to see if there is any difference.

I've also observed that Komodo4 does not exhibit the strange behavior when tested with Fritz 12 GUI. Its default drawscore is -5.

What's also interesting is that K5's default drawscore is not an issue for the other GUIs I normally test with (Arena, Winboard, ChessGUI). The bad/strange moves only manifested themselves within the Fritz 12 GUI environment. Setting the drawscore to zero did get rid of the problem, however, which was a relief. Such a step was not necessary for the other GUIs mentioned.

CL
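
The workaround above amounts to changing one UCI option. A minimal sketch, assuming the engine exposes the setting under the name "Drawscore" (the option name is an assumption; only the `setoption name ... value ...` form comes from the UCI protocol):

```python
def uci_setoption(name: str, value: object) -> str:
    """Build a UCI 'setoption' command line for the given option name and value."""
    return f"setoption name {name} value {value}"


# "Drawscore" is an assumed option name; check the engine's own option list.
print(uci_setoption("Drawscore", 0))  # prints: setoption name Drawscore value 0
```

A GUI normally sends this line to the engine before the game starts, so setting it by hand is only needed when driving the engine directly.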

lkaufman · Post by **lkaufman** » Tue Aug 14, 2012 5:36 am

carldaman wrote:
lkaufman wrote:Do you happen to know whether the odd behavior applies to the Fritz 11 GUI as well as the Fritz 12 GUI? Also, is it repeatable: can you get the suspicious moves to be played every time from the given position?

Repeatable? Most certainly. I'm positive about that, as I've run multiple tests to rule out any freak occurrence.

I can't comment on Fritz 11, as I currently only have Fritz 12 installed. It would be nice if someone could try to replicate this behavior using Fritz 11-12-13, to see if there is any difference.

I've also observed that Komodo4 does not exhibit the strange behavior when tested with Fritz 12 GUI. Its default drawscore is -5.

What's also interesting is that K5's default drawscore is not an issue for the other GUIs I normally test with (Arena, Winboard, ChessGUI). The bad/strange moves only manifested themselves within the Fritz 12 GUI environment. Setting the drawscore to zero did get rid of the problem, however, which was a relief. Such a step was not necessary for the other GUIs mentioned.

CL

I'm running an overnight match using the Fritz 11 gui to see if the results are way out of line with what we get on our own tester.

Werner · Post by **Werner** » Tue Aug 14, 2012 9:05 am

lkaufman wrote:Question: What is the actual hardware and actual time limit used for most of the 40/4 games now? What time limit on some brand new Intel computer, say at 3 GHz, would be closest to what you actually use on average? We're trying to find out why our results for Komodo are consistently better at blitz than those reported by both CCRL and CEGT, even with our opening book modified to be more typical of others. Also, is it possible to see whether the ratings of the top few single-core engines would be much different if only pairings among them were rated?

We appreciate all your hard work.

Thanks,
Larry for Komodo

Hi Larry,
sorry - Wolfgang is on holiday, perhaps Gerhard too. I think Wolfgang posted his hardware a few months ago here.
Intel i5-2400 @3.10GHz / 4GB RAM
Intel Q-6600 @2.60GHz / 4GB RAM
Intel Q-8200 @2.33GHz / 4GB RAM
AMD X-4 @3.00GHz / 6GB RAM
Maybe these are his PCs, but I am not sure, and he runs games at 40/3, not at faster time controls. And I am sure he uses the CB GUI only for playing games with the Fritz engine. Both use the Shredder GUI and Arena.

lkaufman · Post by **lkaufman** » Tue Aug 14, 2012 4:28 pm

Thanks. My overnight results did not confirm any problem with Fritz 11 gui (didn't try 12), and anyway you don't usually use it. There is about a 20 elo gap between CEGT/CCRL blitz results and our own (in terms of relative rating of Houdini and Komodo), and I'm running out of theories to explain it. We use increment rather than repeating time controls, so this could be a contributing factor, but we haven't found that our results are noticeably worse when we do try repeating time controls. We don't use TBs in our tests, but it is widely reported that they don't help elo. Our current test level is probably roughly equivalent to yours. Our books were modified to include longer lines and should be more like yours. I wonder what could account for the 20 elo? There are too many games to attribute it to sample error.
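
The sample-error point can be checked with a quick calculation. A minimal sketch, assuming the standard logistic Elo formula and a normal approximation for the match score; the game counts below are hypothetical, not CEGT or CCRL data.

```python
import math


def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high): point estimate and an approximate 95% interval.

    Assumes independent games and a score far enough from 0 and 1 that
    score +/- z * standard_error stays inside (0, 1).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / n
    std_err = math.sqrt(variance / n)
    return (
        elo_from_score(score),
        elo_from_score(score - z * std_err),
        elo_from_score(score + z * std_err),
    )


# Hypothetical match: 10,000 games, evenly balanced, 40% draws.
elo, low, high = elo_with_margin(3000, 4000, 3000)
print(f"{elo:+.1f} Elo, 95% interval [{low:+.1f}, {high:+.1f}]")
```

With these assumed numbers the 95% margin comes out at roughly ±5 Elo, so a 20 Elo gap would sit well outside it.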

MM · Post by **MM** » Tue Aug 14, 2012 7:19 pm

lkaufman wrote:Thanks. My overnight results did not confirm any problem with Fritz 11 gui (didn't try 12), and anyway you don't usually use it. There is about a 20 elo gap between CEGT/CCRL blitz results and our own (in terms of relative rating of Houdini and Komodo), and I'm running out of theories to explain it. We use increment rather than repeating time controls, so this could be a contributing factor, but we haven't found that our results are noticeably worse when we do try repeating time controls. We don't use TBs in our tests, but it is widely reported that they don't help elo. Our current test level is probably roughly equivalent to yours. Our books were modified to include longer lines and should be more like yours. I wonder what could account for the 20 elo? There are too many games to attribute it to sample error.

Hi Larry, you said (about the book) "more like yours".
I think the difference between "more like yours" and "identical" could be worth several Elo.

As you know, some engines are sensitive to certain kinds of positions.

I can prepare a book with the same number of plies (on average) as CCRL and CEGT, but that doesn't guarantee anything.

IMO what really matters is the kind of position that arises once the book ends. I'm sure you understand what I mean.

Best Regards

lkaufman · Post by **lkaufman** » Tue Aug 14, 2012 8:29 pm

MM wrote:
lkaufman wrote:Thanks. My overnight results did not confirm any problem with Fritz 11 gui (didn't try 12), and anyway you don't usually use it. There is about a 20 elo gap between CEGT/CCRL blitz results and our own (in terms of relative rating of Houdini and Komodo), and I'm running out of theories to explain it. We use increment rather than repeating time controls, so this could be a contributing factor, but we haven't found that our results are noticeably worse when we do try repeating time controls. We don't use TBs in our tests, but it is widely reported that they don't help elo. Our current test level is probably roughly equivalent to yours. Our books were modified to include longer lines and should be more like yours. I wonder what could account for the 20 elo? There are too many games to attribute it to sample error.
Hi Larry, you said (about the book) "more like yours".
I think the difference between "more like yours" and "identical" could be worth several Elo.

As you know, some engines are sensitive to certain kinds of positions.

I can prepare a book with the same number of plies (on average) as CCRL and CEGT, but that doesn't guarantee anything.

IMO what really matters is the kind of position that arises once the book ends. I'm sure you understand what I mean.

Best Regards

Yes, but I assumed (perhaps wrongly) that most books used would choose a cross section of openings typically seen in human play.

lkaufman · Post by **lkaufman** » Tue Aug 14, 2012 8:33 pm

Werner wrote:
lkaufman wrote:Question: What is the actual hardware and actual time limit used for most of the 40/4 games now? What time limit on some brand new Intel computer, say at 3 GHz, would be closest to what you actually use on average? We're trying to find out why our results for Komodo are consistently better at blitz than those reported by both CCRL and CEGT, even with our opening book modified to be more typical of others. Also, is it possible to see whether the ratings of the top few single-core engines would be much different if only pairings among them were rated?

We appreciate all your hard work.

Thanks,
Larry for Komodo
Hi Larry,
sorry - Wolfgang is on holiday, perhaps Gerhard too. I think Wolfgang posted his hardware a few months ago here.
Intel i5-2400 @3.10GHz / 4GB RAM
Intel Q-6600 @2.60GHz / 4GB RAM
Intel Q-8200 @2.33GHz / 4GB RAM
AMD X-4 @3.00GHz / 6GB RAM
Maybe these are his PCs, but I am not sure, and he runs games at 40/3, not at faster time controls. And I am sure he uses the CB GUI only for playing games with the Fritz engine. Both use the Shredder GUI and Arena.

At least the Q-6600 does not use SSE4, I believe; I'm not sure about the others. Can you estimate what percentage of the games use SSE4? This is starting to look like the main culprit, in CCRL as well.
Also, are the longer time control (non-blitz) games played on the same machines, or do they perhaps all use SSE4?

ThatsIt · Post by **ThatsIt** » Wed Aug 15, 2012 9:14 am

Hi Larry !

lkaufman wrote: [...snip...]
Can you estimate what percentage of the games use SSE4?
[...snip...]

All the Komodo 5 x64 games for the CEGT 40/4 were played by using SSE4.

Best wishes,
G.S.
(CEGT member)

Modern Times · Post by **Modern Times** » Wed Aug 15, 2012 10:48 am

ThatsIt wrote:All the Komodo 5 x64 games for the CEGT 40/4 were played by using SSE4.

Best wishes,
G.S.
(CEGT member)

Well now, that blows Larry's theory out of the water.
