Number 1 engine on long time controls

Milos · Post by **Milos** » Mon Feb 27, 2012 6:22 pm

Milos wrote:Decline in Elo is because EBF decreases with longer TC's

A typo, should be "increases" instead.

Don · Post by **Don** » Mon Feb 27, 2012 6:35 pm

Milos wrote:
Uri Blass wrote:I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."

One example may be enough to prove that the conclusion is wrong:
I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time control.

I think that you need to start not with an equal time control but with a time control that gives a result close to 50%.

If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time control, and even then it is not something that I feel sure about.
Finally someone who understands and doesn't write the usual nonsense.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to extend the conclusion to a TC that is more than 3 orders of magnitude longer. This is at least 10 doublings, and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10)
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.3
 Level 01 -   +97.0   +46.3
 Level 02 -   +74.6   +22.4
 Level 03 -   +52.8   +21.8
 Level 04 -   +39.5   +13.3
 Level 05 -   +27.0   +12.5
 Level 06 -   +14.5
 Level 07 -    +2.0
 Level 08 -   -10.5
 Level 09 -   -23.0
 Level 10 -   -35.5
At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini. (Even with shape-preserving cubic extrapolation, Komodo should still end up 22 Elo stronger!)

I believe that Komodo will be stronger at 40/120, but using my data, with the huge error margins from samples of only 2000 games, can throw you off by a significant amount. It seems like you would be smart enough to figure that out.

Also you should not put words in my mouth to make your criticism seem so dramatic. I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.

Nevertheless, I think your extrapolation is probably correct within 25 ELO or so, but don't write that I claimed that; I just think it's a possibility.

Milos wrote:In reality Komodo is not stronger at any time control, eventually there might be a point where it becomes equal (asymptotic point with extremely large time per move).

I showed actual data in the interest of transparency and fairness. What data do you have to show me to support your assertion? Even if it turns out that I am wrong I DID actually supply data so that people would be free to draw their own conclusions, all you did was criticize. The interpretation of the data is something I believe but it's just an interpretation and I'm not claiming anything else.

Milos wrote:So Don, please stop with nonsense, not all the ppl here are idiots to believe your marketing...

IGarcia · Post by **IGarcia** » Mon Feb 27, 2012 6:58 pm

Don wrote: I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.

I asked you several posts ago: at which level do you think Komodo will pass Houdini according to your data? You never replied.

It's fair to ask because you are suggesting that at some point Komodo will pass. It will be interesting to hear your predictions.

respectfully, Ignacio.

Houdini · Post by **Houdini** » Mon Feb 27, 2012 7:26 pm

Jouni wrote:In Deep Junior challenge (125m/50moves 6 CPU ponder on ) current score:

Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%

We will only know Komodo's score when the MP version is out.

Amazing results.
The ICGA World Champion getting trounced...

Robert

lkaufman · Post by **lkaufman** » Mon Feb 27, 2012 7:31 pm

Milos wrote:
Don wrote:I did not mention the draw factor either. As programs get stronger there are more and more draws which could be the biggest reason we see a decline in all programs with more time. I see this as a natural consequence of the fact that as programs think longer they get a little closer to perfect play.
If you really think so you are even more ignorant than I thought.
More draws is not a reason for decline in Elo gain at all, that's just ridiculous.
Decline in Elo is because EBF decreases with longer TC's and you don't gain the same Elo per doubling anymore (you don't reach the next ply as fast as before).
Gain per additional ply is constant and it doesn't depend on absolute Elo value (absolute value of the searched depth), i.e. there is no diminishing return there, or at least the diminishing return is not measurable. This is proven in a bunch of research papers.

Can you point me to any reasonably modern paper that comes to that conclusion? All of my own experience indicates the opposite, that each additional ply gains less Elo than the previous one. I recall quite clearly reaching that conclusion with respect to Rybka.

Don · Post by **Don** » Mon Feb 27, 2012 7:38 pm

IGarcia wrote:
Don wrote: I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.
I asked you several posts ago: at which level do you think Komodo will pass Houdini according to your data? You never replied.

It's fair to ask because you are suggesting that at some point Komodo will pass. It will be interesting to hear your predictions.

respectfully, Ignacio.

I'm still trying to understand the data I am seeing, so it would be premature to make predictions. I don't remember you asking, but if I did see it I probably was not prepared to give you an answer.

In the 40/120 CEGT rating list Komodo is showing as 1 ELO stronger than Houdini 1.5, based on 400 Komodo games and 600 Houdini games. These are time-adjusted for older hardware, so it's not really 40/120, it's something faster. The error margin there is something like 20 ELO, which is somewhat similar to what Milos reports.

If you want to pin me down to a figure I cannot give you one, but I would say that at 40/120 on modern hardware we are within 10 ELO in either direction. The latest development version of Komodo will be about 20 more than that.

http://www.husvankempen.de/nunn//40120n ... liste.html

Don · Post by **Don** » Mon Feb 27, 2012 7:48 pm

lkaufman wrote:
Milos wrote:
Don wrote:I did not mention the draw factor either. As programs get stronger there are more and more draws which could be the biggest reason we see a decline in all programs with more time. I see this as a natural consequence of the fact that as programs think longer they get a little closer to perfect play.
If you really think so you are even more ignorant than I thought.
More draws is not a reason for decline in Elo gain at all, that's just ridiculous.
Decline in Elo is because EBF decreases with longer TC's and you don't gain the same Elo per doubling anymore (you don't reach the next ply as fast as before).
Gain per additional ply is constant and it doesn't depend on absolute Elo value (absolute value of the searched depth), i.e. there is no diminishing return there, or at least the diminishing return is not measurable. This is proven in a bunch of research papers.
Can you point me to any reasonably modern paper that comes to that conclusion? All of my own experience indicates the opposite, that each additional ply gains less Elo than the previous one. I recall quite clearly reaching that conclusion with respect to Rybka.

I think Milos is talking about very old studies which did seem to indicate that programs scaled in linear fashion with depth. I remember Thompson showing 250 Elo for each ply, but not with reasonable sample sizes. That was based on depths that were pathetic by today's standards, and also on full-width programs.

But I think Heinz did a very thorough study a few years later that showed convincing evidence that this really does fall off with depth, and this was still many plies short of what we can practically do today.

I think chessprogramming has a reference to this, and if it's the same paper I remember, it was called "New Self-Play Results in Computer Chess."

It's possible that full-width programs have different scaling characteristics per ply than modern programs that make heavy use of LMR and other aggressive pruning techniques.

Don · Post by **Don** » Mon Feb 27, 2012 7:50 pm

Houdini wrote:
Jouni wrote:In Deep Junior challenge (125m/50moves 6 CPU ponder on ) current score:

Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%

We will only know Komodo's score when the MP version is out.
Amazing results.
The ICGA World Champion getting trounced...

Robert

A theory going around is that Junior is very scalable, but this data would indicate that this is not true.

Ajedrecista · Post by **Ajedrecista** » Mon Feb 27, 2012 7:50 pm

Hello:

Milos wrote:
Uri Blass wrote:I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."

One example may be enough to prove that the conclusion is wrong:
I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time control.

I think that you need to start not with an equal time control but with a time control that gives a result close to 50%.

If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time control, and even then it is not something that I feel sure about.
Finally someone who understands and doesn't write the usual nonsense.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to extend the conclusion to a TC that is more than 3 orders of magnitude longer. This is at least 10 doublings, and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10)
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.3
 Level 01 -   +97.0   +46.3
 Level 02 -   +74.6   +22.4
 Level 03 -   +52.8   +21.8
 Level 04 -   +39.5   +13.3
 Level 05 -   +27.0   +12.5
 Level 06 -   +14.5
 Level 07 -    +2.0
 Level 08 -   -10.5
 Level 09 -   -23.0
 Level 10 -   -35.5
At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini. (Even with shape-preserving cubic extrapolation, Komodo should still end up 22 Elo stronger!)
In reality Komodo is not stronger at any time control, eventually there might be a point where it becomes equal (asymptotic point with extremely large time per move).

So Don, please stop with nonsense, not all the ppl here are idiots to believe your marketing...

I have seen that Milos has done an extrapolation, and I also want to post my own. With the data provided by Don:

Level where 00 is 6 + 0.1 and each successive level is double.

                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.3
 Level 01 -   +97.0   +46.3
 Level 02 -   +74.6   +22.4
 Level 03 -   +52.8   +21.8
 Level 04 -   +39.5   +13.3
 Level 05 -   +27.0   +12.5

I have done a least-squares fit by hand, with only the help of a Casio calculator (I hope there are no errors in my calculations). Let me explain what I have done in a little more detail:

x axis ---> the level (time control).
y axis ---> Houdini Elo advantage over Komodo.

I rescaled the x axis in the following way: the x axis is in logarithmic scale (base e), and I add 1 to each level to avoid ln(0). So ln(1) corresponds to Don's level 0, and so on; level 10 pointed out by Milos is ln(11) on my x axis. I solved the least-squares problem by hand as a system of equations, in this way:

Y(x) = m·ln(x) + n
Y numbers go to 'HOUDINI' column.

AX = B

Symmetric matrix of size 2x2: A = [N, SUM(x_i); SUM(x_i), SUM((x_i)²)]
Vector of size 2x1: X = [n, m]
Vector of size 2x1: B = [SUM(y_i), SUM((x_i)·(y_i))]

N is the number of (x, y) known data; here: N = 6.
SUM(x_i) = ln(1) + ln(2) + ... + ln(6) ~ 6.5793
SUM((x_i)²) =[ln(1)]² + [ln(2)]² + ... + [ln(6)]² ~ 9.4099
SUM(y_i) = 143.3 + 97 + ... + 27 = 434.2
SUM(x_i·y_i) = ln(1)·143.3 + ln(2)·97 + ... + ln(6)·27 ~ 334.3384

Det.(A) = N·SUM((x_i)²) - [SUM(x_i)]² ~ 13.1729

Solving by Cramer's rule, I got (more or less, rounding included):

n ~ 143.1793
m ~ -64.5781
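
For readers who want to check the hand calculation, here is a minimal Python sketch of the same 2x2 normal equations solved by Cramer's rule. It uses only Don's six measured values; the function and variable names are illustrative and not part of the original post.

```python
import math

# Don's measured Houdini advantage over Komodo (Elo) at levels 0..5.
houdini_adv = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]

def fit_log_model(ys):
    """Least-squares fit of y = m*ln(level + 1) + n via the 2x2 normal equations."""
    xs = [math.log(level + 1) for level in range(len(ys))]
    count = len(ys)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = count * sxx - sx * sx
    n = (sy * sxx - sx * sxy) / det    # Cramer's rule, intercept
    m = (count * sxy - sx * sy) / det  # Cramer's rule, slope
    return m, n

m, n = fit_log_model(houdini_adv)
print(f"n ~ {n:.4f}, m ~ {m:.4f}")
```

Run as is, this should land very close to the n and m quoted above, up to rounding.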

So, renaming the levels back to their original names (from 0 to 10 instead of from 1 to 11) and adding a couple of levels, this is what I get:

OWN ADJUST:
===========

Level where 00 is 6 + 0.1 and each successive level is double.

                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.2
 Level 01 -   +98.4   +44.8
 Level 02 -   +72.2   +26.2
 Level 03 -   +53.7   +18.5
 Level 04 -   +39.2   +14.5
 Level 05 -   +27.5   +11.7
 Level 06 -   +17.5   +10.0
 Level 07 -    +8.9    +8.6
 Level 08 -    +1.3    +7.6
 Level 09 -    -5.5    +6.8
 Level 10 -   -11.7    +6.2
 Level 11 -   -17.3    +5.6
 Level 12 -   -22.5    +5.2
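
The fitted table can be regenerated from the two parameters, and the same model also gives the level at which the modelled advantage reaches zero. The sketch below takes the rounded n and m from the post as given; the constant and function names are hypothetical, and the crossover is only a property of this particular fit, not a prediction anyone in the thread signed up to.

```python
import math

# Fit parameters quoted in the post (rounded); treated here as given.
N_INTERCEPT = 143.1793
M_SLOPE = -64.5781

def predicted_advantage(level):
    """Houdini's modelled Elo advantage at a level (level 0 = 6s + 0.1s)."""
    return M_SLOPE * math.log(level + 1) + N_INTERCEPT

def crossover_level():
    """Level at which the modelled advantage reaches zero."""
    return math.exp(-N_INTERCEPT / M_SLOPE) - 1

for level in range(13):
    print(f"Level {level:02d}  {predicted_advantage(level):+7.1f}")
print(f"Modelled crossover near level {crossover_level():.2f}")
```

The zero crossing falls between levels 8 and 9, which matches the sign change in the table.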

So, I get smaller differences than Milos with my model. Given that 2000 games are played at each level, I expect (at not very short time controls, where the draw ratio should rise with longer TC) uncertainties of less than ±13 or ±14 Elo (with 95% confidence), which should not be forgotten. In the model I work with, uncertainties grow with score(Houdini)*score(Komodo) - (draw_ratio)/4, so more draws mean smaller uncertainties. This model works fine for scores ranging from 10% to 90%, which is the case here.
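
As a rough check on the ±13 to ±14 Elo figure, here is a minimal sketch of one common way to turn that expression into an Elo margin. It assumes the expression is the per-game variance of the score, converts the score error to Elo with the slope of the logistic Elo curve, and uses a normal approximation (1.96 standard errors). The draw ratios are hypothetical, since the thread does not report them, and `elo_margin_95` is an illustrative name.

```python
import math

def elo_margin_95(score, draw_ratio, games):
    """Approximate 95% Elo error margin for a match result.

    score is the fraction of points scored (wins + draws/2);
    draw_ratio is the fraction of games drawn.
    """
    # Per-game variance of the score: s*(1-s) - D/4.
    variance = score * (1.0 - score) - draw_ratio / 4.0
    std_err = math.sqrt(variance / games)
    # Slope of the logistic Elo curve, d(Elo)/d(score).
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return 1.96 * std_err * slope

for draws in (0.2, 0.3, 0.4):  # hypothetical draw ratios
    margin = elo_margin_95(0.5, draws, 2000)
    print(f"draw ratio {draws:.0%}: +/- {margin:.1f} Elo")
```

Under these assumptions a higher draw ratio shrinks the margin, in line with the remark above.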

I wonder which of us (Milos or me) is more accurate in extrapolating the given data (I know that there are tons of ways to extrapolate). I see that in Milos' extrapolation, the missing numbers in the 'Komodo gains' column are {+12.5, +12.5, +12.5, +12.5, +12.5} (a very simple extrapolation, as Milos stated), which does not seem accurate to me because one would expect smaller gains at higher levels (in this sense, my extrapolation is a bit better). With cubic extrapolation, Milos gets -22 in the 'HOUDINI' column (I assume at level 10), which is closer to my extrapolation but still somewhat far from it.
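
The two extrapolations can be put side by side with a short sketch. The linear continuation below reuses the last measured gain (12.5 Elo) for every further level, which reproduces the numbers in Milos' table; the logarithmic model uses the rounded parameters from this post. Function names are illustrative only.

```python
import math

measured = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]  # levels 0..5, from Don's table
LAST_STEP = measured[-2] - measured[-1]           # 12.5 Elo, the final measured gain

def linear_extrapolation(level):
    """Continue past level 5 with a constant 12.5 Elo step per level."""
    return measured[-1] - LAST_STEP * (level - 5)

def log_model(level, m=-64.5781, n=143.1793):
    """Logarithmic fit from the post, with its rounded parameters."""
    return m * math.log(level + 1) + n

for level in range(6, 11):
    print(f"Level {level:02d}  linear {linear_extrapolation(level):+6.1f}"
          f"  log fit {log_model(level):+6.1f}")
```

The gap between the two columns widens with each level because the linear version never lets the per-level gain shrink.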

As each level doubles the previous one, maybe it would make more sense to rescale my x axis with base-2 logarithms (instead of base e), but it was much easier for me to do the math with natural logarithms. Comments, corrections, etc. are welcome, so please leave your impressions. Good luck to all the programmers with their engines!
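
On the choice of base: since log2(x) = ln(x)/ln(2), switching bases only rescales the fitted slope by ln(2) and leaves the intercept and every fitted value unchanged. A minimal sketch that checks this on Don's data (the helper `fit` is illustrative, a plain ordinary least-squares line fit):

```python
import math

measured = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]  # levels 0..5, from Don's table

def fit(xs, ys):
    """Ordinary least squares for y = m*x + n; returns (m, n)."""
    count = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = count * sxx - sx * sx
    return (count * sxy - sx * sy) / det, (sy * sxx - sx * sxy) / det

levels = range(len(measured))
m_e, n_e = fit([math.log(k + 1) for k in levels], measured)
m_2, n_2 = fit([math.log2(k + 1) for k in levels], measured)

# Predictions at level 10 under each base.
pred_e = m_e * math.log(11) + n_e
pred_2 = m_2 * math.log2(11) + n_2
print(pred_e, pred_2)
```

So the extrapolated table would be the same with either base; only the reported value of m changes.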

Regards from Spain.

Ajedrecista.

lkaufman · Post by **lkaufman** » Mon Feb 27, 2012 7:56 pm

Houdini wrote:
Jouni wrote:In Deep Junior challenge (125m/50moves 6 CPU ponder on ) current score:

Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%

We will only know Komodo's score when the MP version is out.
Amazing results.
The ICGA World Champion getting trounced...

Robert

Why is it amazing? It is in line with the predictions of the rating lists. The top programs didn't compete in the ICGA championship for various reasons, so I see nothing surprising here.
