carldaman wrote: Larry, I'd also be curious as to the GUI used for testing. As I noted in another thread, Komodo5 using the default drawscore of -7 suffers when tested on the Fritz GUI. I had to set the drawscore to 0 to ensure that Komodo did not make obviously weak moves.
Regards,
Carl

Do you happen to know whether the odd behavior applies to Fritz 11 gui as well as Fritz 12 gui? Also, is it repeatable, can you get the suspicious moves to be played every time from the given position? It seems to me that using the negative drawscore should make the engine play more actively to avoid a repetition, unless somehow the interface reversed the sign, which seems unlikely. So it's quite puzzling.
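The reasoning about the sign of the drawscore can be sketched with a toy model. This is only an illustration, not Komodo's actual search: it assumes a repetition is scored as the drawscore (in centipawns, from the engine's point of view) instead of 0, and the function name `choose_move` and the sample evaluations are hypothetical.

```python
def choose_move(alternative_eval_cp: int, draw_score_cp: int) -> str:
    """Pick between repeating the position and the best non-repeating move.

    Both values are centipawns from the engine's point of view.
    Assumption: a repetition is scored as draw_score_cp rather than 0.
    """
    return "repeat" if draw_score_cp > alternative_eval_cp else "play on"


# -7 is the Komodo 5 default mentioned in the thread; +7 models the
# hypothetical case where a GUI passes the value with the sign reversed.
for eval_cp in (-20, -5, 0, 5, 20):
    normal = choose_move(eval_cp, -7)
    flipped = choose_move(eval_cp, +7)
    print(f"eval {eval_cp:+d}: drawscore -7 -> {normal}, drawscore +7 -> {flipped}")
```

In this toy model a drawscore of -7 only changes the decision in slightly worse positions (between -7 and 0), where the engine plays on instead of repeating. A reversed sign would make it take a repetition even when it is up to 7 centipawns better, which is the opposite of the intended effect.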
CEGT - rating lists August 12th 2012
Moderator: Ras
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: CEGT - rating lists August 12th 2012
-
- Posts: 2287
- Joined: Sat Jun 02, 2012 2:13 am
Re: CEGT - rating lists August 12th 2012
lkaufman wrote: Do you happen to know whether the odd behavior applies to Fritz 11 gui as well as Fritz 12 gui? Also, is it repeatable, can you get the suspicious moves to be played every time from the given position?
Repeatable? Most certainly. I'm positive about that, as I've run multiple tests to rule out any freak occurrence.
I can't comment on Fritz 11, as I currently only have Fritz 12 installed. It would be nice if someone could try to replicate this behavior using Fritz 11-12-13, to see if there is any difference.
I've also observed that Komodo4 does not exhibit the strange behavior when tested with Fritz 12 GUI. Its default drawscore is -5.
What's also interesting is that K5's default drawscore is not an issue for the other GUIs I normally test with (Arena, Winboard, ChessGUI). The bad/strange moves only manifested themselves within the Fritz 12 GUI environment. Setting the drawscore to zero did get rid of the problem, however, which was a relief. Such a step was not necessary for the other GUIs mentioned.
CL
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: CEGT - rating lists August 12th 2012
carldaman wrote:
lkaufman wrote: Do you happen to know whether the odd behavior applies to Fritz 11 gui as well as Fritz 12 gui? Also, is it repeatable, can you get the suspicious moves to be played every time from the given position?
Repeatable? Most certainly. I'm positive about that, as I've run multiple tests to rule out any freak occurrence.
I can't comment on Fritz 11, as I currently only have Fritz 12 installed. It would be nice if someone could try to replicate this behavior using Fritz 11-12-13, to see if there is any difference.
I've also observed that Komodo4 does not exhibit the strange behavior when tested with Fritz 12 GUI. Its default drawscore is -5.
What's also interesting is that K5's default drawscore is not an issue for the other GUIs I normally test with (Arena, Winboard, ChessGUI). The bad/strange moves only manifested themselves within the Fritz 12 GUI environment. Setting the drawscore to zero did get rid of the problem, however, which was a relief. Such a step was not necessary for the other GUIs mentioned.
CL

I'm running an overnight match using the Fritz 11 gui to see if the results are way out of line with what we get on our own tester.
-
- Posts: 2993
- Joined: Wed Mar 08, 2006 10:09 pm
- Location: Germany
- Full name: Werner Schüle
Re: CEGT - rating lists August 12th 2012
lkaufman wrote: Question: What is the actual hardware and actual time limit used for most of the 40/4 games now? What time limit on some brand new Intel computer, say at 3 GHz, would be closest to what you actually use on average? We're trying to find out why our results for Komodo are consistently better at blitz than those reported by both CCRL and CEGT, even with our opening book modified to be more typical of others. Also, is it possible to see whether the ratings of the top few single-core engines would be much different if only pairings among them were rated?
We appreciate all your hard work.
Thanks,
Larry for Komodo

Hi Larry,
sorry - Wolfgang is on holiday, perhaps Gerhard too. I think Wolfgang posted his hardware a few months ago here.
Intel i5-2400 @3.10GHz / 4GB RAM
Intel Q-6600 @2.60GHz / 4GB RAM
Intel Q-8200 @2.33GHz / 4GB RAM
AMD X-4 @3.00GHz / 6GB RAM
Maybe these are his PCs, but I am not sure - and he runs games at 40/3, not at faster time controls. And I am sure he uses the CB GUI only for playing games with the Fritz engine. Both use Shredder GUI and Arena.
Werner
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: CEGT - rating lists August 12th 2012
Thanks. My overnight results did not confirm any problem with Fritz 11 gui (didn't try 12), and anyway you don't usually use it. There is about a 20 elo gap between CEGT/CCRL blitz results and our own (in terms of relative rating of Houdini and Komodo), and I'm running out of theories to explain it. We use increment rather than repeating time controls, so this could be a contributing factor, but we haven't found that our results are noticeably worse when we do try repeating time controls. We don't use TBs in our tests, but it is widely reported that they don't help elo. Our current test level is probably roughly equivalent to yours. Our books were modified to include longer lines and should be more like yours. I wonder what could account for the 20 elo? There are too many games to attribute it to sample error.
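Whether a 20 elo gap can be put down to sample error depends on the number of games. A rough way to check is sketched below, assuming independent games and a normal approximation; the function names and the win/draw/loss counts are hypothetical and are not results from CEGT, CCRL or the Komodo testers.

```python
import math


def elo_from_score(score: float) -> float:
    """Convert an expected score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_with_margin(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (elo, low, high), an approximate 95% interval for the Elo difference.

    Assumes independent games and a normal approximation of the mean score;
    the interval is only meaningful when score +/- z*se stays inside (0, 1).
    """
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


# Hypothetical match results with the same win/draw/loss proportions.
for games in (1000, 10000):
    w, d, l = int(0.3 * games), int(0.4 * games), int(0.3 * games)
    elo, low, high = elo_with_margin(w, d, l)
    print(f"{games} games: {elo:+.1f} elo, interval [{low:+.1f}, {high:+.1f}]")
```

With these made-up proportions the margin shrinks roughly with the square root of the number of games, so a gap of 20 elo that persists over many thousands of games is hard to explain by chance alone.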
-
- Posts: 766
- Joined: Sun Oct 16, 2011 11:25 am
Re: CEGT - rating lists August 12th 2012
lkaufman wrote: Thanks. My overnight results did not confirm any problem with Fritz 11 gui (didn't try 12), and anyway you don't usually use it. There is about a 20 elo gap between CEGT/CCRL blitz results and our own (in terms of relative rating of Houdini and Komodo), and I'm running out of theories to explain it. We use increment rather than repeating time controls, so this could be a contributing factor, but we haven't found that our results are noticeably worse when we do try repeating time controls. We don't use TBs in our tests, but it is widely reported that they don't help elo. Our current test level is probably roughly equivalent to yours. Our books were modified to include longer lines and should be more like yours. I wonder what could account for the 20 elo? There are too many games to attribute it to sample error.

Hi Larry, you said (about the book) "more like yours".
I think that "more like yours" rather than "identical" could account for several elo.
As you know, some engines are sensitive to certain kinds of positions.
I can prepare a book with the same number of plies (on average) as CCRL and CEGT, but that doesn't guarantee anything.
IMO what really matters is the kind of position that arises after the end of the book. I'm sure you understand what I mean.
Best Regards
MM
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: CEGT - rating lists August 12th 2012
MM wrote:
lkaufman wrote: Thanks. My overnight results did not confirm any problem with Fritz 11 gui (didn't try 12), and anyway you don't usually use it. There is about a 20 elo gap between CEGT/CCRL blitz results and our own (in terms of relative rating of Houdini and Komodo), and I'm running out of theories to explain it. We use increment rather than repeating time controls, so this could be a contributing factor, but we haven't found that our results are noticeably worse when we do try repeating time controls. We don't use TBs in our tests, but it is widely reported that they don't help elo. Our current test level is probably roughly equivalent to yours. Our books were modified to include longer lines and should be more like yours. I wonder what could account for the 20 elo? There are too many games to attribute it to sample error.
Hi Larry, you said (about the book) "more like yours".
I think that "more like yours" rather than "identical" could account for several elo.
As you know, some engines are sensitive to certain kinds of positions.
I can prepare a book with the same number of plies (on average) as CCRL and CEGT, but that doesn't guarantee anything.
IMO what really matters is the kind of position that arises after the end of the book. I'm sure you understand what I mean.
Best Regards

Yes, but I assumed (perhaps wrongly) that most books used would choose a cross section of openings typically seen in human play.
-
- Posts: 6258
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: CEGT - rating lists August 12th 2012
Werner wrote:
lkaufman wrote: Question: What is the actual hardware and actual time limit used for most of the 40/4 games now? What time limit on some brand new Intel computer, say at 3 GHz, would be closest to what you actually use on average? We're trying to find out why our results for Komodo are consistently better at blitz than those reported by both CCRL and CEGT, even with our opening book modified to be more typical of others. Also, is it possible to see whether the ratings of the top few single-core engines would be much different if only pairings among them were rated?
We appreciate all your hard work.
Thanks,
Larry for Komodo
Hi Larry,
sorry - Wolfgang is on holiday, perhaps Gerhard too. I think Wolfgang posted his hardware a few months ago here.
Intel i5-2400 @3.10GHz / 4GB RAM
Intel Q-6600 @2.60GHz / 4GB RAM
Intel Q-8200 @2.33GHz / 4GB RAM
AMD X-4 @3.00GHz / 6GB RAM
Maybe these are his PCs, but I am not sure - and he runs games at 40/3, not at faster time controls. And I am sure he uses the CB GUI only for playing games with the Fritz engine. Both use Shredder GUI and Arena.

At least the Q-6600 does not use SSE4, I believe; I'm not sure about the others. Can you estimate what percentage of the games use SSE4? This is starting to look like the main culprit, in CCRL as well.
Also, are the longer time control (non-blitz) games played on the same machines, or do they perhaps all use SSE4?
-
- Posts: 992
- Joined: Thu Mar 09, 2006 2:11 pm
Re: CEGT - rating lists August 12th 2012
Hi Larry!
lkaufman wrote: [..snip...]
Can you estimate what percentage of the games use SSE4?
[...snip...]

All the Komodo 5 x64 games for the CEGT 40/4 were played using SSE4.
Best wishes,
G.S.
(CEGT member)
-
- Posts: 3748
- Joined: Thu Jun 07, 2012 11:02 pm
Re: CEGT - rating lists August 12th 2012
All the Komodo 5 x64 games for the CEGT 40/4 were played by using SSE4.
Best wishes,
G.S.
(CEGT member)

Well now, that blows Larry's theory out of the water.