60 games Komodo 5 against Top4 at 120m+3s

Discussion of computer chess matches and engine tournaments.

Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Uri Blass »

Laskos wrote:
lkaufman wrote:
Modern Times wrote:
TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
After 1000 games on CEGT at 40/20, Komodo 5 is 2 elo ahead of Houdini 2 and 24 elo ahead of Komodo 4. It just shows you need big samples to measure modest gains. Considering that we are still not close to Houdini 2 at 40/4, it shows that Komodo 5 does scale better than Houdini 2, at least up to 40/20.
Then, with 95% confidence using Timo games, scales back badly at very long 120/40 time control.

Kai
I think that Timo used 120+3 and not 120/40.

The CCRL rating suggests that it scales back badly, but I am not sure that it is not a statistical error, and we certainly need more games.
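To make the "statistical error" point concrete, here is a rough sketch of how wide the error margin on an Elo difference is for different sample sizes. It uses the usual logistic Elo formula and a normal approximation; the win/draw/loss counts are hypothetical and are not taken from Timo's games or from any rating list.

```python
import math

def elo_from_score(score):
    """Convert a score fraction (0 < score < 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Return (elo, low, high); z=1.96 gives an approximate 95% interval."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - z * stderr),
            elo_from_score(score + z * stderr))

# Hypothetical results with the same win/draw/loss proportions at each size.
for wins, draws, losses in [(18, 27, 15), (150, 225, 125), (300, 450, 250)]:
    elo, low, high = elo_interval(wins, draws, losses)
    games = wins + draws + losses
    print(f"{games:5d} games: {elo:+.0f} Elo, 95% interval {low:+.0f} to {high:+.0f}")
```

With these made-up proportions the 60-game interval is more than 100 Elo wide and includes zero, while the 1000-game interval is several times narrower. The approximation breaks down for very lopsided scores, where the interval bounds leave the range 0 to 1.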
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by lkaufman »

Laskos wrote:
lkaufman wrote:
Modern Times wrote:
TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
After 1000 games on CEGT at 40/20, Komodo 5 is 2 elo ahead of Houdini 2 and 24 elo ahead of Komodo 4. It just shows you need big samples to measure modest gains. Considering that we are still not close to Houdini 2 at 40/4, it shows that Komodo 5 does scale better than Houdini 2, at least up to 40/20.
Then, with 95% confidence using Timo games, scales back badly at very long 120/40 time control.

Kai
It would be almost impossible to design a chess program that scaled better than another from 40/4 up to 40/20 and then worse up to 40/120. The right way to look at this, due to the small samples, is just to pool all the games at 40/20, 40/40, and 40/120 and rate them, then compare the gap from Houdini 2 with the gap in the blitz lists. I expect it will show that we are much closer to Houdini 2 in the longer games.
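A minimal sketch of the pooling idea, assuming we only have win/draw/loss totals for one engine pair at each time control. Real rating tools rate all games against all opponents together, so this head-to-head version is only an illustration, and every number below is hypothetical.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a head-to-head score (logistic model)."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    return -400.0 * math.log10(1.0 / score - 1.0)

def pool(results):
    """Sum (wins, draws, losses) tuples from several samples into one."""
    return tuple(sum(column) for column in zip(*results))

# Hypothetical (wins, draws, losses) from one engine's point of view.
long_tc = {"40/20": (30, 50, 32), "40/40": (14, 25, 15), "40/120": (5, 9, 6)}
blitz = {"40/4": (250, 420, 330)}

pooled_long = pool(long_tc.values())
pooled_blitz = pool(blitz.values())
print("long TC gap:", round(elo_diff(*pooled_long), 1))
print("blitz gap:  ", round(elo_diff(*pooled_blitz), 1))
```

Comparing the two gaps is the test proposed above. The pooled long time control sample is still far smaller than the blitz sample, so its error margin stays much wider.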
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Uri Blass »

lkaufman wrote:
Laskos wrote:
lkaufman wrote:
Modern Times wrote:
TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
After 1000 games on CEGT at 40/20, Komodo 5 is 2 elo ahead of Houdini 2 and 24 elo ahead of Komodo 4. It just shows you need big samples to measure modest gains. Considering that we are still not close to Houdini 2 at 40/4, it shows that Komodo 5 does scale better than Houdini 2, at least up to 40/20.
Then, with 95% confidence using Timo games, scales back badly at very long 120/40 time control.

Kai
It would be almost impossible to design a chess program that scaled better than another from 40/4 up to 40/20 and then worse up to 40/120. The right way to look at this, due to the small samples, is just to pool all the games at 40/20, 40/40, and 40/120 and rate them, then compare the gap from Houdini 2 with the gap in the blitz lists. I expect it will show that we are much closer to Houdini 2 in the longer games.
I am not saying that this is the case, but I do not agree that it is almost impossible.

It is easy to have a bug that causes the program to play weaker only at longer time controls.

There can be a buffer overflow when the program tries to do something like A[n]=m while the array A is too small. It is possible that the problem never happens at blitz (so the program scales well from bullet to blitz) but does happen at long time controls.

Note that such an overflow does not always cause the program to crash. It may only make the program perform worse, because A[n]=m overwrites something that does not lead to a crash.

There may be other bugs that do not involve an overflow but still cause the program to play weaker at long time controls.
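The A[n]=m scenario can be simulated with a toy model. Python checks list bounds, so the sketch below fakes C-style memory with one flat list in which array A is followed directly by an unrelated value. The names `ARRAY_SIZE`, `make_memory` and `search` are hypothetical, and in real C the effect of such a write is undefined, so this shows only one possible outcome.

```python
ARRAY_SIZE = 8  # hypothetical capacity of array A (one slot per ply)

def make_memory():
    # Flat "memory": A occupies cells 0..ARRAY_SIZE-1, and an unrelated
    # evaluation weight sits in the very next cell.
    return [0] * ARRAY_SIZE + [100]

def search(memory, depth):
    """Write one entry per ply with no bounds check, like A[n] = m in C."""
    for ply in range(depth):
        memory[ply] = ply  # ply >= ARRAY_SIZE silently hits the neighbour
    return memory[ARRAY_SIZE]  # the evaluation weight after the search

blitz_weight = search(make_memory(), depth=6)    # stays inside A
long_tc_weight = search(make_memory(), depth=9)  # one write past the end of A
print(blitz_weight, long_tc_weight)
```

At the shallow depth the weight is untouched, and at the deeper one it is silently overwritten without any crash. A search that only reaches the larger depth at long time controls would play with a corrupted evaluation only there.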
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Uri Blass »

Thinking about it again, it may also happen without bugs, or with only chess-related bugs.

Imagine that we use very slow hardware and a program that uses selective search competes against a program that uses only brute force (both programs use the alpha-beta algorithm with no pruning except the pruning of the selective search).

The selective search program may win at blitz because it searches deeper, but at slower time controls it is going to lose because of the things that the selective search misses (the selective search program may even scale better when going from bullet to blitz).

Today's programs may also contain some selective search, caused by a bug, that is not a problem at fast time controls (even with the large depths that programs reach today) because it prunes very little, but that becomes a bigger problem at longer time controls.
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Houdini »

Uri Blass wrote:thinking about it again it may happen also without bugs or only with chess related bugs.
The only established "bug" is Larry's cherry-picking of rating lists, and the Komodo Team's tendency to make extrapolations without actually playing any games against Houdini 2.

The fact is that there is not a single rating list where Komodo is ahead of Houdini. Looking at the current evidence (IPON 5+3, CEGT 20/40, CCRL 40/40, Timo's 120+3) it seems that Komodo 5 is still about 20 Elo behind Houdini 2.
The results at 40/40 or 120+3 are not very supportive of the hypothesis that Komodo 5 would scale any better than Houdini 2.

Robert
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Uri Blass »

Houdini wrote:
Uri Blass wrote:thinking about it again it may happen also without bugs or only with chess related bugs.
The only established "bug" is Larry's cherry-picking of rating lists, and the Komodo Team's tendency to make extrapolations without actually playing any games against Houdini 2.

The fact is that there is not a single rating list where Komodo is ahead of Houdini. Looking at the current evidence (IPON 5+3, CEGT 20/40, CCRL 40/40, Timo's 120+3) it seems that Komodo 5 is still about 20 Elo behind Houdini 2.
The results at 40/40 or 120+3 are not very supportive of the hypothesis that Komodo 5 would scale any better than Houdini 2.

Robert
The main problem is not Houdini.
The main problem is that Komodo 5 seems to scale worse than previous versions of Komodo (based on comparing CCRL 40/40 with CCRL 40/4).

Maybe this will change with more games, and we certainly need more games to be sure.
Houdini
Posts: 1471
Joined: Tue Mar 16, 2010 12:00 am

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Houdini »

The main problem is people making claims about scaling based on scanty evidence.

Robert
TimoK
Posts: 98
Joined: Sun Jan 03, 2010 12:28 pm
Location: Hamburg

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by TimoK »

Hi all,

Thanks for your warm words concerning this little test! I'm happy that it was interesting for you, too. I also enjoyed the games and will continue my testing with future engines.

I just uploaded all games to my webpage, here is the link:
http://team-oh.de/Computerschach/files/Komodo5.zip

Best regards
Timo
beram
Posts: 1187
Joined: Wed Jan 06, 2010 3:11 pm

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by beram »

Uri Blass wrote:
Houdini wrote:
Uri Blass wrote:thinking about it again it may happen also without bugs or only with chess related bugs.
The only established "bug" is Larry's cherry-picking of rating lists, and the Komodo Team's tendency to make extrapolations without actually playing any games against Houdini 2.

The fact is that there is not a single rating list where Komodo is ahead of Houdini. Looking at the current evidence (IPON 5+3, CEGT 20/40, CCRL 40/40, Timo's 120+3) it seems that Komodo 5 is still about 20 Elo behind Houdini 2.
The results at 40/40 or 120+3 are not very supportive of the hypothesis that Komodo 5 would scale any better than Houdini 2.

Robert
Well said, Robert, I fully agree with that.
Alas, Larry seems very stubborn on this and probably won't give up banging his single-tone drum.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Laskos »

Uri Blass wrote:
Laskos wrote:
lkaufman wrote:
Modern Times wrote:
TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
After 1000 games on CEGT at 40/20, Komodo 5 is 2 elo ahead of Houdini 2 and 24 elo ahead of Komodo 4. It just shows you need big samples to measure modest gains. Considering that we are still not close to Houdini 2 at 40/4, it shows that Komodo 5 does scale better than Houdini 2, at least up to 40/20.
Then, with 95% confidence using Timo games, scales back badly at very long 120/40 time control.

Kai
I think that Timo used 120+3 and not 120/40.

The CCRL rating suggests that it scales back badly, but I am not sure that it is not a statistical error, and we certainly need more games.
That is what I said: 95% confidence (not 100%) from Timo's games, and a similar confidence from the CCRL games, but at a shorter time control. The CEGT 40/20 games are even faster and on weaker hardware than Timo's. It would be nice if Timo completed the round-robin of all engines, which means adding the Komodo 5 and Critter 1.6 matches to the previous round-robin.

Kai
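One common way to attach a confidence figure to a match result is the likelihood of superiority (LOS), which depends only on wins and losses. This is a sketch of that standard approximation and not necessarily how the 95% above was obtained; the match result in it is hypothetical.

```python
import math

def los(wins, losses):
    """Likelihood of superiority; draws do not enter this approximation."""
    if wins + losses == 0:
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Hypothetical match result, not taken from any of the lists discussed.
print(f"LOS = {los(20, 10):.3f}")
```

An LOS above 0.95 says only that one side is probably stronger under these conditions. It says nothing about how large the gap is, which is why the Elo error margin still matters for small samples.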