Number 1 engine on long time controls

Don · Post by **Don** » Tue Feb 28, 2012 1:30 am

That is very interesting stuff Jose!

Based on this idea and what Uri also suggested (which I think are the same basic concept) I started a new study - one that will probably take weeks to complete because it's being run on a slow quad laptop - a spare machine that I do not use much.

The new study combines both programs at levels 00 to 08 and rates them all together in a massive round robin. The results will be plotted on a graph, and I can handicap Houdini as Uri suggests by adjusting the x-axis, which is equivalent to handicapping one of the programs by time. So if I add 0.5 to Houdini's x-axis values, for example, it's like handicapping it by half a doubling, which is the same as giving Komodo 1.414 times more time.

The idea is to make an adjustment that causes the lines to intersect at some arbitrary point near the center. The intersection point will define the ELO rating at which they are equivalent given the specified handicap.
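
The arithmetic behind the x-axis handicap can be sketched in a few lines of Python. This is only an illustration: the function names are made up for the example, and the base time control of 6 + 0.1 seconds is taken from the level table quoted further down in the thread.

```python
def time_multiplier(doublings: float) -> float:
    """Time odds equivalent to shifting a curve by `doublings` levels on the x-axis."""
    return 2.0 ** doublings


def level_time_control(level: float, base: float = 6.0, increment: float = 0.1):
    """Base time and increment (in seconds) for a level, where level 00 is 6 + 0.1
    and each successive level doubles both numbers."""
    factor = 2.0 ** level
    return base * factor, increment * factor


# Half a doubling corresponds to a factor of sqrt(2), i.e. about 1.414.
print(round(time_multiplier(0.5), 3))

# Time control for the highest level of the new study (level 08).
print(level_time_control(8))
```

A fractional shift such as 0.5 therefore never needs to be played as an exact irrational time ratio; a rounded factor like 1.414 is what the handicap means in practice.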

It looks like I will only get 20 or 30 games per day per player (18 players) so it will be a few days before there is even enough data to produce relatively stable lines and a few weeks for the lines to really be very precise.

Ajedrecista wrote: Hello:
Milos wrote:
Uri Blass wrote: I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."

One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time controls.

I think that you need to start not with equal time controls but with time controls that give a result close to 50%.

If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time controls, and even in this case it is not something that I feel sure about.
Finally someone who understands and doesn't write the usual nonsense.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to extend the conclusion to a TC that is more than 3 orders of magnitude longer. This is at least 10 doublings and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10)
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.3
 Level 01 -   +97.0   +46.3
 Level 02 -   +74.6   +22.4
 Level 03 -   +52.8   +21.8
 Level 04 -   +39.5   +13.3
 Level 05 -   +27.0   +12.5
 Level 06 -   +14.5
 Level 07 -    +2.0
 Level 08 -   -10.5
 Level 09 -   -23.0
 Level 10 -   -35.5
At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini. (Even with shape-preserving cubic extrapolation, at the end Komodo should still be 22 Elo stronger!)
In reality Komodo is not stronger at any time control, eventually there might be a point where it becomes equal (asymptotic point with extremely large time per move).

So Don, please stop with nonsense, not all the ppl here are idiots to believe your marketing...
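
The linear extrapolation in the table above can be reproduced with a minimal Python sketch. It assumes, as the table suggests, that the last measured step (-12.5 Elo per doubling, from level 04 to level 05) is simply held constant; the function name is invented for the example.

```python
# Houdini's Elo lead over Komodo at levels 00..05, copied from Don's table.
measured = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]


def extend_linearly(values, last_level):
    """Continue the series up to `last_level`, holding the most recent step constant."""
    step = values[-1] - values[-2]  # 27.0 - 39.5 = -12.5 Elo per doubling
    extended = list(values)
    while len(extended) <= last_level:
        extended.append(extended[-1] + step)
    return extended


for level, lead in enumerate(extend_linearly(measured, 10)):
    print(f"Level {level:02d} - {lead:+7.1f}")
```

The loop ends at level 10 with -35.5, the figure quoted for 40 moves in 120 minutes. Holding the step constant ignores the shrinking gains visible in the measured rows, which is why other extrapolations give smaller numbers.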
I have seen that Milos has done an extrapolation and I also want to post my own one. With the data provided by Don:
Level where 00 is 6 + 0.1 and each successive level is double.

                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.3
 Level 01 -   +97.0   +46.3
 Level 02 -   +74.6   +22.4
 Level 03 -   +52.8   +21.8
 Level 04 -   +39.5   +13.3
 Level 05 -   +27.0   +12.5
I have done a least-squares fit by hand, with only the help of a Casio calculator (I hope there are no errors in my calculations). I explain what I have done in a little more detail:
x axis ---> the level (time control).
y axis ---> Houdini Elo advantage over Komodo.
I rescaled the x axis in the following way: the x axis is on a logarithmic scale (base e), and I add 1 to each level to avoid ln(0). So ln(1) corresponds to Don's level 0, and so on; level 10, pointed out by Milos, is ln(11) on my x axis. I calculated the least squares by hand with a system of equations, in this way:
Y(x) = m·ln(x) + n, where x = level + 1
The Y values are taken from the 'HOUDINI' column.
In the sums below, x_i = ln(i) for i = 1, ..., 6.

AX = B

Symmetric matrix of size 2x2: A = [N, SUM(x_i); SUM(x_i), SUM((x_i)²)]
Vector of size 2x1: X = [n, m]
Vector of size 2x1: B = [SUM(y_i), SUM((x_i)·(y_i))]

N is the number of (x, y) known data; here: N = 6.
SUM(x_i) = ln(1) + ln(2) + ... + ln(6) ~ 6.5793
SUM((x_i)²) =[ln(1)]² + [ln(2)]² + ... + [ln(6)]² ~ 9.4099
SUM(y_i) = 143.3 + 97 + ... + 27 = 434.2
SUM(x_i·y_i) = ln(1)·143.3 + ln(2)·97 + ... + ln(6)·27 ~ 334.3384

Det.(A) = N·SUM((x_i)²) - [SUM(x_i)]² ~ 13.1729
Solving by Cramer's rule, I got (more or less, rounding included):
n ~ 143.1793
m ~ -64.5781
So, restoring the levels to their original names (from 0 to 10 instead of from 1 to 11) and adding a couple of levels, this is what I get:
OWN ADJUST:
===========

Level where 00 is 6 + 0.1 and each successive level is double.

                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.2
 Level 01 -   +98.4   +44.8
 Level 02 -   +72.2   +26.2
 Level 03 -   +53.7   +18.5
 Level 04 -   +39.2   +14.5
 Level 05 -   +27.5   +11.7
 Level 06 -   +17.5   +10.0
 Level 07 -    +8.9    +8.6
 Level 08 -    +1.3    +7.6
 Level 09 -    -5.5    +6.8
 Level 10 -   -11.7    +6.2
 Level 11 -   -17.3    +5.6
 Level 12 -   -22.5    +5.2
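
For anyone who wants to check the hand calculation, the same least-squares fit can be sketched in Python. It follows the normal equations and Cramer's rule described above; the function names are chosen only for this illustration, and the six data points are the measured values from Don's table.

```python
import math

# Houdini's Elo lead over Komodo at levels 00..05, copied from Don's table.
leads = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]


def fit_log_model(ys):
    """Fit Y = m*ln(level + 1) + n by least squares, solving the 2x2 system by Cramer's rule."""
    xs = [math.log(level + 1) for level in range(len(ys))]
    count = len(ys)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = count * sum_xx - sum_x ** 2
    n = (sum_y * sum_xx - sum_x * sum_xy) / det
    m = (count * sum_xy - sum_x * sum_y) / det
    return n, m


n, m = fit_log_model(leads)


def predict(level):
    """Extrapolated Houdini lead (in Elo) at the given level."""
    return n + m * math.log(level + 1)


print(f"n ~ {n:.4f}, m ~ {m:.4f}")
for level in range(13):
    print(f"Level {level:02d} - {predict(level):+7.1f}")
```

The printed coefficients should agree with n ~ 143.1793 and m ~ -64.5781 up to rounding, and the loop regenerates the 'HOUDINI' column of the table above.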
So, I get smaller differences than Milos with my model. Given that 2000 games are played at each level, I expect (at not very short time controls, where the draw ratio should rise with longer TC) uncertainties of less than ±13 or ±14 Elo (with 95% confidence), which should not be forgotten. In the model I use, the variance of the score is proportional to score(Houdini)*score(Komodo) - (draw_ratio)/4, so more draws mean smaller uncertainties. This model works fine for scores ranging from 10% to 90%, which is the case here.
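
A rough sketch of that error estimate in Python follows. It assumes the usual logistic Elo formula and a normal approximation for the 95% interval; the draw ratio of 0.30 is a hypothetical value chosen only for illustration, since no draw ratios are given in this thread, and the function names are invented for the example.

```python
import math


def expected_score(elo_diff):
    """Expected score for a given Elo advantage (logistic Elo model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


def elo_from_score(score):
    """Inverse of expected_score; valid for 0 < score < 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(elo_diff, draw_ratio, games, z=1.96):
    """Approximate half-width (in Elo) of the confidence interval after `games` games.

    Per-game variance of the score is s*(1 - s) - draw_ratio/4,
    so a higher draw ratio gives a smaller margin.
    """
    s = expected_score(elo_diff)
    variance = s * (1.0 - s) - draw_ratio / 4.0
    half_width = z * math.sqrt(variance / games)
    low = elo_from_score(s - half_width)
    high = elo_from_score(s + half_width)
    return (high - low) / 2.0


# Level 05 lead of +27.0 Elo, 2000 games, hypothetical draw ratio of 30%.
print(round(elo_margin(27.0, 0.30, 2000), 1))
```

With draw ratios somewhere between 20% and 30%, this sketch gives margins in the region of ±13 to ±14 Elo for 2000 games, in line with the figures quoted above.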

I wonder which of us (Milos or me) is more accurate in the extrapolation with the data given (I know that there are tons of ways to extrapolate). I see that in Milos' extrapolation, the missing numbers in the 'Komodo gains' column are {+12.5, +12.5, +12.5, +12.5, +12.5} (a very simple extrapolation, as Milos stated), which does not seem accurate to me because everyone can expect smaller gains at higher levels (in this sense, my extrapolation is a bit better). With cubic extrapolation, Milos gets -22 in the 'HOUDINI' column (I assume at level 10), which is closer to my extrapolation, but still a little far from it.

As each level doubles the previous level, maybe it would make more sense to rescale my x axis with base-2 logarithms (instead of base e), but it was much easier for me to do the math with natural logarithms. Comments, corrections, etc. are welcome, so please leave your impressions. Good luck to all the programmers with their engines!

Regards from Spain.

Ajedrecista.

Ajedrecista · Post by **Ajedrecista** » Tue Feb 28, 2012 12:31 pm

Hi Don!

That is very interesting stuff Jose!

Thanks for your interest, but my real name is Jesús and not José (they are not equivalent in Spanish-speaking countries, like Spain). I think that in Anglo-Saxon countries, naming a boy Jesús could be seen as disrespectful, but not in Spanish-speaking countries. I understand your error as a simple slip (without bad intention), so I am not annoyed. Nobody is perfect, especially me...

Now, the more interesting part of my post: I redid the least squares using base-2 logarithms on the x axis and the results are EXACTLY the same: n is in fact the same, while the slope (m) of the fitted line (Y) differs, but so does the distance between points on the x axis, so the table is exactly the same. I guess that the results will be the same with any possible base for the logarithmic scale (e.g. 10) on the x axis.
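
This can be checked numerically with a short Python sketch (function names are made up for the example; the data are the six measured values from Don's table). Since log_b(x) = ln(x)/ln(b), changing the base only rescales the slope by ln(b), so the fitted values cannot change.

```python
import math

# Houdini's Elo lead over Komodo at levels 00..05, copied from Don's table.
leads = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + n; returns (n, m)."""
    count = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = count * sum_xx - sum_x ** 2
    return (sum_y * sum_xx - sum_x * sum_xy) / det, (count * sum_xy - sum_x * sum_y) / det


def fitted_table(base, levels=13):
    """Fit with log_base(level + 1) on the x axis and return (n, m, fitted values)."""
    xs = [math.log(level + 1, base) for level in range(len(leads))]
    n, m = fit_line(xs, leads)
    return n, m, [n + m * math.log(level + 1, base) for level in range(levels)]


n_e, m_e, table_e = fitted_table(math.e)
n_2, m_2, table_2 = fitted_table(2)

print(f"base e: n = {n_e:.4f}, m = {m_e:.4f}")
print(f"base 2: n = {n_2:.4f}, m = {m_2:.4f}")
print(max(abs(a - b) for a, b in zip(table_e, table_2)))  # difference is only rounding noise
```

The intercept is identical in both fits and the base-2 slope equals the natural-log slope multiplied by ln(2), so the same holds for base 10 or any other base.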

The new study combines both programs at levels 00 to 08 and rates them all together in a massive round robin. The results will be plotted on a graph, and I can handicap Houdini as Uri suggests by adjusting the x-axis, which is equivalent to handicapping one of the programs by time. So if I add 0.5 to Houdini's x-axis values, for example, it's like handicapping it by half a doubling, which is the same as giving Komodo 1.414 times more time.

It seems interesting to give more time to Komodo not only by integer factors (2, 3, ...) or rational ones (3/2, 5/2, ...), but also by irrational ones (such as sqrt(2)). The TC will be a little strange, even with approximations (I mean, not exactly sqrt(2) but for example 1.414 times, as you stated). I can only wish you good luck... and lots of patience!

Everyone (me included) is eager to see the future results of Komodo MP (4 or 4.1?) in CCRL, CEGT, Clemens' Base, IPON and other rating lists (no need to hurry). Congratulations on the triumph in the CCT 14 blitz event!

Regards from Spain.

Ajedrecista.

Don · Post by **Don** » Tue Feb 28, 2012 12:57 pm

Ajedrecista wrote:Hi Don!

That is very interesting stuff Jose!
Thanks for your interest, but my real name is Jesús and not José (they are not equivalent in Spanish-speaking countries, like Spain). I think that in Anglo-Saxon countries, naming a boy Jesús could be seen as disrespectful, but not in Spanish-speaking countries. I understand your error as a simple slip (without bad intention), so I am not annoyed. Nobody is perfect, especially me...

I apologize for the mistake.

Now, the more interesting part of my post: I redid the least squares using base-2 logarithms on the x axis and the results are EXACTLY the same: n is in fact the same, while the slope (m) of the fitted line (Y) differs, but so does the distance between points on the x axis, so the table is exactly the same. I guess that the results will be the same with any possible base for the logarithmic scale (e.g. 10) on the x axis.

The new study combines both programs at levels 00 to 08 and rates them all together in a massive round robin. The results will be plotted on a graph, and I can handicap Houdini as Uri suggests by adjusting the x-axis, which is equivalent to handicapping one of the programs by time. So if I add 0.5 to Houdini's x-axis values, for example, it's like handicapping it by half a doubling, which is the same as giving Komodo 1.414 times more time.
It seems interesting to give more time to Komodo not only by integer factors (2, 3, ...) or rational ones (3/2, 5/2, ...), but also by irrational ones (such as sqrt(2)). The TC will be a little strange, even with approximations (I mean, not exactly sqrt(2) but for example 1.414 times, as you stated). I can only wish you good luck... and lots of patience!

Everyone (me included) is eager to see the future results of Komodo MP (4 or 4.1?) in CCRL, CEGT, Clemens' Base, IPON and other rating lists (no need to hurry). Congratulations on the triumph in the CCT 14 blitz event!

Regards from Spain.

Ajedrecista.

Don · Post by **Don** » Tue Feb 28, 2012 1:06 pm

Ajedrecista wrote:Hi Don!

That is very interesting stuff Jose!
Thanks for your interest, but my real name is Jesús and not José (they are not equivalent in Spanish-speaking countries, like Spain). I think that in Anglo-Saxon countries, naming a boy Jesús could be seen as disrespectful, but not in Spanish-speaking countries. I understand your error as a simple slip (without bad intention), so I am not annoyed. Nobody is perfect, especially me...

Now, the more interesting part of my post: I redid the least squares using base-2 logarithms on the x axis and the results are EXACTLY the same: n is in fact the same, while the slope (m) of the fitted line (Y) differs, but so does the distance between points on the x axis, so the table is exactly the same. I guess that the results will be the same with any possible base for the logarithmic scale (e.g. 10) on the x axis.

The new study combines both programs at levels 00 to 08 and rates them all together in a massive round robin. The results will be plotted on a graph, and I can handicap Houdini as Uri suggests by adjusting the x-axis, which is equivalent to handicapping one of the programs by time. So if I add 0.5 to Houdini's x-axis values, for example, it's like handicapping it by half a doubling, which is the same as giving Komodo 1.414 times more time.
It seems interesting to give more time to Komodo not only by integer factors (2, 3, ...) or rational ones (3/2, 5/2, ...), but also by irrational ones (such as sqrt(2)). The TC will be a little strange, even with approximations (I mean, not exactly sqrt(2) but for example 1.414 times, as you stated). I can only wish you good luck... and lots of patience!

Jesús,

We have found that no two programs are the same with respect to scaling behavior. Stockfish is very good; all of the Ippo-based programs are poor in this respect. What I am interested in finding out is whether the Ippos are just very good at fast chess but eventually reach a steady state. But when I still see the behavior (although much less pronounced) at higher levels, it's difficult to imagine that it is not intrinsic to the program.

This is probably an artifact of how programs are tested. The good programs are tuned with thousands of games played at very fast levels, and each author has different ways of running the tests. It's impossible to get tests done in a reasonable time unless you test very fast. As a result, the programs become especially well tuned to playing at these levels.

Everyone (me included) is eager to see the future results of Komodo MP (4 or 4.1?) in CCRL, CEGT, Clemens' Base, IPON and other rating lists (no need to hurry). Congratulations on the triumph in the CCT 14 blitz event!

A version of Komodo MP played in the CCT 14 blitz, but only 3 games were played with this version in the main tournament. We are working on getting it tuned to play well on quad hardware.

Regards from Spain.

Ajedrecista.

Jouni · Post by **Jouni** » Tue Feb 28, 2012 3:10 pm

But isn't "scaling" the wrong term here? I think it means how a program benefits from more CPUs, and the Ippo family is not worse in this respect. But I have no idea what a better word would be.

IGarcia · Post by **IGarcia** » Tue Feb 28, 2012 3:40 pm

Don wrote:
IGarcia wrote:
Don wrote: I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.
I asked you several posts ago: at which level do you think Komodo will pass Houdini according to your data? You never replied.

It's fair to ask because you are suggesting that at some point Komodo will pass. It will be interesting to hear your predictions.

respectfully, Ignacio.

I'm still trying to understand the data I am seeing so it would be premature to make predictions. I don't remember you asking, but if I did see it I probably was not prepared to give you an answer.

In the 40/120 CEGT rating list Komodo is showing as 1 ELO stronger than Houdini 1.5 based on 400 Komodo games and 600 Houdini games. These are time adjusted for older hardware so it's not really 40/120, it's something faster. The error margin there is something like 20 ELO which is somewhat similar to what Milos reports.

If you want to pin me down to a figure I cannot give you one, but I would say that at 40/120 on modern hardware we are within 10 ELO in either direction. The latest development version of Komodo will be about 20 more than that.

http://www.husvankempen.de/nunn//40120n ... liste.html

I asked several posts ago. The question was:
Still, maybe you are right. Can you tell us at which time control Komodo will overtake?

And I don't want to pin you down. I'm just curious to see the answer after so many predictions. So much effort to estimate a hypothetical overtake ... ok, we are adults, everyone can decide what to do with his time.

Jesus, nice work, but it is just suppositions. You imagine Komodo will keep performing in some way (not linear as proposed before) and then this and that... then your conclusion (no offense please) is imaginary.

regards

Ajedrecista · Post by **Ajedrecista** » Tue Feb 28, 2012 4:06 pm

Hello Ignacio:

Jesus, nice work, but it is just suppositions. You imagine Komodo will keep performing in some way (not linear as proposed before) and then this and that... then your conclusion (no offense please) is imaginary.

I do not imagine anything...

I only extrapolated the known data in the way I explained. In my original post (where I write the results of my extrapolation) I wanted to say that all was a speculation, but I forgot it. My results are only extrapolations and not the absolute truth, so they should be understood as 'numerical guesses' and nothing more. Real data (unknown for higher levels) is all that matters.

I also did not want to assert any conclusion: I only wanted to post the results of my extrapolation and let people compare our extrapolations (Milos' and mine). Sorry if my post seems to assert conclusions, because that was not my intention. And, of course, no offense taken. Thanks for your comment, which has helped me explain the reason for my first post: only to share my extrapolated results.

Regards from Spain.

Ajedrecista.

Lion · Post by **Lion** » Tue Feb 28, 2012 5:52 pm

Will it also run on a 6-core machine?

regards

lkaufman · Post by **lkaufman** » Tue Feb 28, 2012 5:58 pm

The way we are doing MP now in Komodo, it will run on any number of cores, but six is about the maximum from which the benefit will be noticeable. In other words, it will always play stronger with more cores, but the benefit of using say 12 cores compared to 6 will be very small. We'll look for ways to improve this situation in the future.

Lion · Post by **Lion** » Tue Feb 28, 2012 6:03 pm

Makes sense to me.
6 cores is nearly a standard today; more cores means that you are no longer in the "normal" computer world....

regards
