60 games Komodo 5 against Top4 at 120m+3s

Discussion of computer chess matches and engine tournaments.

Moderators: bob, hgm, Harvey Williamson

Modern Times
Posts: 2568
Joined: Thu Jun 07, 2012 9:02 pm

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Modern Times » Sat Jul 28, 2012 1:12 pm

TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.

Houdini
Posts: 1471
Joined: Mon Mar 15, 2010 11:00 pm

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Houdini » Sat Jul 28, 2012 5:42 pm

Timo, thanks a lot for running the matches.
Some very interesting games, it was fun to watch.

Robert

Uri Blass
Posts: 8771
Joined: Wed Mar 08, 2006 11:37 pm
Location: Tel-Aviv Israel

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Uri Blass » Sat Jul 28, 2012 6:43 pm

Sedat Canbaz wrote:
carldaman wrote:Hi Sedat,

If I may make a friendly suggestion -- at some point you may want to put together a book that has more variety, but at the expense of being less perfect, where some lines such as 1.e4 e6 2.d4 d5 3.Nc3 Bb4 and others that we know from general opening theory to be playable (for ex. the Philidor, Old Indian) can be included. Of course, you would still exclude lines that are outright busted/unplayable. You can call it a Variety book if you like.

That way testers can test with your Perfect book, and then test with the expanded Variety book as well, for better comparison.

Thank you for your very good work!
Carl
Dear Carl,

Not at all...

Thanks too for your useful comments

Yes... your idea does not sound bad; actually I was considering releasing such a varied, neutral short book.

But later I changed my mind, and do you know why? Please see below.

If we start testing the engines with such varied openings, then the game results will look more interesting, more exciting...
But then I am afraid that we will need to run a minimum of 3000 - 5000 games per player.

For example, 1000 games (per player) are enough data to show the real strength if the engines are using the Perfect 2012b version.

Plus, if such a varied book were released, many engines' Elo standings would be affected.

Probably such a varied book would be a good idea if we started a completely new test or rating list (with new conditions).

Of course, I agree with you that those engines which can play this type of disadvantageous position should be rewarded for it.

But unfortunately, there are some opening positions where most of the engines get into real trouble.

And I think it is a mistake and an injustice if we allow the engines to be tested in such critical positions.

Note also that almost all top human players and top book makers do not prefer varied openings; each of them has their own favorite lines.
I think it is not so hard to see why they prefer non-varied openings...


Best,
Sedat

I am not sure what you mean by the real strength of the engines.
You can only show an estimate of the engines' ratings under specific conditions (a rating under specific conditions is not real strength, and if you change the openings the order of the engines may be different).

Maybe the possible error is going to be higher if you include some openings, but I do not think that is a good reason not to include the relevant openings.

My opinion is that the right solution to reduce the error in the estimate is simply to play from fixed positions, where all engines play the same number of games against every opponent with the same openings (as White and as Black).

I see no reason not to include the Bb4 line of the French.
Maybe games from this line are not going to add much, because in most cases White wins, but I do not believe that you are going to need 3000-5000 games for the same statistical error that you get today with 1000 games.

Maybe you are going to need 1100 games for the same statistical error, because most openings are not of the same type as the Bb4 French, and even if you play the Bb4 French the damage is limited by the fact that every engine plays it the same percentage of times against the others, and every engine plays it with both colors.
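
The fixed-position idea above can be sketched as a simple schedule generator. This is only a minimal Python sketch: the engine names are placeholders, the second opening (a Philidor) is just an example, and the function name `paired_schedule` is made up for illustration.

```python
from itertools import combinations

def paired_schedule(engines, openings):
    """Yield (white, black, opening) so that every pair of engines plays
    every opening twice, once with each colour."""
    for a, b in combinations(engines, 2):
        for opening in openings:
            yield (a, b, opening)   # a has White
            yield (b, a, opening)   # same opening, colours reversed

# Hypothetical engine names and a two-line opening set
engines = ["EngineA", "EngineB", "EngineC"]
openings = ["1.e4 e6 2.d4 d5 3.Nc3 Bb4", "1.e4 e5 2.Nf3 d6"]

games = list(paired_schedule(engines, openings))
for white, black, opening in games:
    print(f"{white} - {black}: {opening}")
```

With a schedule like this, a one-sided opening costs or gains every engine the same number of games with each colour, so it mostly adds draws or decisive results evenly rather than favouring one engine.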

TimoK
Posts: 97
Joined: Sun Jan 03, 2010 11:28 am
Location: Hamburg

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by TimoK » Sat Jul 28, 2012 10:00 pm

Houdini wrote:Timo, thanks a lot for running the matches.
Some very interesting games, it was fun to watch.

Robert
You're welcome, Robert. It was also a pleasure for me watching these games. Komodo and Houdini played two of the most beautiful games I've ever seen in this match series.

Looking forward to testing Houdini 3 in Sept/Oct!

Best regards
Timo

TimoK
Posts: 97
Joined: Sun Jan 03, 2010 11:28 am
Location: Hamburg

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by TimoK » Sat Jul 28, 2012 10:07 pm

Modern Times wrote:Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
That's interesting indeed. Maybe that is already more than just an indication for the hypothesis "K5 doesn't scale as well as K4 when playing games with long TCs". Maybe Don and Larry should check if there's some kind of problem with K5, since K4 behaves just the other way around (it scales well with long TCs).

Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 10:58 am
Location: Antalya/Turkey

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Sedat Canbaz » Sat Jul 28, 2012 10:17 pm

Uri Blass wrote: I am not sure what you mean by the real strength of the engines.
By 'the real strength' of the engines I mean their strength under the current SCCT conditions, of course.
I am not sure what the results would be on my old Pentium 2.40GHz or on machines faster than mine.
And I would not be surprised to see completely different results under other conditions,
I mean, if we use other openings (non-Perfect books) or other time controls...
Uri Blass wrote: You can only show an estimate of the engines' ratings under specific conditions
(a rating under specific conditions is not real strength, and if you change the openings the order of the engines may be different).

I don't think that SCCT uses specific conditions...
I can even say that I use quite usual conditions... not too bad a method for Blitz.
For example, the openings SCCT uses are not so deep and not so weak, the hardware used is not too slow, etc.
Normally I'm not a big fan of testing engines with Blitz time controls.
However, the time control SCCT currently uses (3m+2s) is not too bad either.
Btw, the latest World Blitz Chess Championship's time control is the same as SCCT's.

For more details about SCCT's advantages and disadvantages, please see:
http://www.sedatcanbaz.com/chess/2838-2/
Uri Blass wrote: Maybe the possible error is going to be higher if you include some
openings, but I do not think that is a good reason not to include the relevant openings.
I don't think I can call those disadvantageous openings 'relevant openings'.
Your relevant openings look more like 'critical openings', where the engines' performance suffers a lot.
Sorry... I prefer to allow the stronger line rather than the weaker one.

One more thing:
How do you know, or how are you sure, that 3...Bb4 is a strong move and that this line should be allowed in engine vs engine matches?
Do you have any data, i.e. games played by top chess engines on the latest fast hardware?
Note also that I am not saying the French line in question (3...Bb4) is weak, nor that this line should not be allowed in any chess games.
And as far as I remember, other, deeper books have a winning percentage of more than 50% with Black, e.g. with this French line.
But those superior deep book moves do not appear within the first 8 moves; in most cases those strong lines appear after a depth of 20 moves.
And I strongly believe this:
- It is not a very good idea to start testing the engines with such deep book moves.
In my opinion, well-optimized openings (up to 8 or 10 moves) are quite a good idea...
I mean, if we really want to know the real strength of the engines.


Btw, it is now 2012.
It is time to change our minds... we are not testing humans, we are just testing chess engine strength.
The SCCT Rating List is a serious chess engine competition (not beta testing).
To be honest, I don't care much about the winning percentages of humans.
I just concentrate on the engines' winning percentage for each line... no more, no less.
It is a pity that there are still people who believe that the top engines should be tested in critical positions.



Best,
Sedat

beram
Posts: 1187
Joined: Wed Jan 06, 2010 2:11 pm

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by beram » Sun Jul 29, 2012 7:54 am

Thx Timo, for your interesting and good work.
Looking forward to seeing Houdini 3 in your future tournaments.

btw, is there a download link for the games or a website link?

grts Bram

Jeroen
Posts: 501
Joined: Wed Mar 08, 2006 8:49 pm

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Jeroen » Sun Jul 29, 2012 7:59 am

Hi Timo,

Thanks for these very interesting matches! I have replayed all of them and I saw a lot of superb, hard fought games, that were very exciting to watch.

Kind regards, Jeroen

lkaufman
Posts: 4281
Joined: Sun Jan 10, 2010 5:15 am
Location: Maryland USA

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by lkaufman » Sun Jul 29, 2012 12:40 pm

Modern Times wrote:
TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
After 1000 games on CEGT at 40/20, Komodo 5 is 2 elo ahead of Houdini 2 and 24 elo ahead of Komodo 4. It just shows you need big samples to measure modest gains. Considering that we are still not close to Houdini 2 at 40/4, it shows that Komodo 5 does scale better than Houdini 2, at least up to 40/20.
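
As a rough illustration of why big samples are needed to measure modest gains, here is a minimal Python sketch of the approximate 95% error margin of a head-to-head match as a function of the number of games. The 50% draw ratio, the assumption of independent games and the function names are assumptions for illustration only; rating lists such as CEGT or CCRL compute their margins differently.

```python
import math

def elo_from_score(score):
    """Convert an average score in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin_95(games, draw_ratio=0.5, score=0.5):
    """Approximate 95% half-width, in Elo, for a match of `games` games.

    Assumes independent games and a fixed draw ratio (both simplifications).
    """
    wins = score - draw_ratio / 2.0        # win probability implied by score and draws
    losses = 1.0 - draw_ratio - wins
    # Per-game variance of the result, scored 1 / 0.5 / 0
    var = (wins * (1.0 - score) ** 2
           + draw_ratio * (0.5 - score) ** 2
           + losses * score ** 2)
    se = math.sqrt(var / games)            # standard error of the mean score
    low = elo_from_score(score - 1.96 * se)
    high = elo_from_score(score + 1.96 * se)
    return (high - low) / 2.0

for n in (60, 500, 1000, 5000):
    print(n, "games: about +/-", round(elo_margin_95(n), 1), "Elo")
```

Under these assumptions the margin comes out at roughly 15 Elo for 1000 games and a bit over 20 Elo for 500 games, so a 3 Elo difference after 500 games is well within the noise.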

Laskos
Posts: 10312
Joined: Wed Jul 26, 2006 8:21 pm
Full name: Kai Laskos

Re: 60 games Komodo 5 against Top4 at 120m+3s

Post by Laskos » Sun Jul 29, 2012 12:51 pm

lkaufman wrote:
Modern Times wrote:
TimoK wrote:b) Komodo 5 shows no improvement over Komodo 4 for the used test conditions, i.e. long TCs with low increment, AMD CPU, strong opponents, Noomen Opening Suite 2012.

Best regards
Timo
Well, after 500 games on CCRL 40/40, Komodo 5 shows just a 3 Elo improvement over Komodo 4. 400 of the 500 games are AMD SSE4. So that is another set of reasonably long time control results (on AMD) that do not favour Komodo 5. I'm inclined to disregard the AMD factor, which just leaves longer time controls where K5 does not shine over K4.
After 1000 games on CEGT at 40/20, Komodo 5 is 2 elo ahead of Houdini 2 and 24 elo ahead of Komodo 4. It just shows you need big samples to measure modest gains. Considering that we are still not close to Houdini 2 at 40/4, it shows that Komodo 5 does scale better than Houdini 2, at least up to 40/20.
Then, with 95% confidence using Timo's games, it scales back badly at the very long 120/40 time control.

Kai
