Number 1 engine on long time controls

Don · Post by **Don** » Sun Feb 26, 2012 5:51 pm

IGarcia wrote:
Don wrote: 1 ELO doesn't mean anything; the error margin is over 20 ELO for both programs. My point is that this is pretty powerful evidence that Komodo scales better than Houdini 1.5. I don't know the precise level where it becomes superior, but one would have to bury his head in the sand and ignore all data to believe that this is not the logical conclusion.
Why does it have to become superior?

It doesn't HAVE to become superior, but if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it. A less logical conclusion is that it will suddenly stop improving faster than Houdini, but you should at least present some rationale for thinking that is likely to happen.

I understand the trend, which I see as a fact, to be because more time = fewer blunders. With more time it is possible to make fewer mistakes and to find some hard-to-find answer that escapes into a draw in a very complex position. The trend you see is probably just because Komodo is losing fewer games (not winning more).

It can be said that fewer errors is the same as better moves, but I mean something else: with better moves you can gain the initiative and win games; with fewer errors you can equalize or save some games.

Don wrote: It seems oddly illogical to think that the trend will suddenly reverse or stop - do you have some sound basis for thinking that is likely or are you just saying that anything is possible, even when there is no supporting evidence we can identify? I am willing to entertain any good theories you have on this and maybe they can even be tested.
World English Dictionary
asymptote  (ˈæsɪmˌtəʊt)
 
— n
	a straight line that is closely approached by a plane curve so that the perpendicular distance between them decreases to zero as the distance from the origin increases to infinity
 
[C17: from Greek asumptōtos  not falling together, from a- 1  + syn-  + ptōtos  inclined to fall, from piptein  to fall]

Collins English Dictionary - Complete & Unabridged 10th Edition
2009 © William Collins Sons & Co. Ltd. 1979, 1986 © HarperCollins
Publishers 1998, 2000, 2003, 2005, 2006, 2007, 2009

IGarcia · Post by **IGarcia** » Sun Feb 26, 2012 6:34 pm

Don wrote:
It doesn't HAVE to become superior, but if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it. A less logical conclusion is that it will suddenly stop improving faster than Houdini, but you should at least present some rationale for thinking that is likely to happen.

You are assuming the trend will not change. Where is the proof that your measurements are linear, or follow some other function?

I'm not speaking about a trend "suddenly stopping", but I don't buy the continuous improvement you speak of.

It is hard for me, as a non-native English speaker, to explain, but you can't prove you are right just because I can't prove you are incorrect.

Your way of thinking is why many people jump into the stock market and get burned when bubbles burst.

It's much more logical to think that at long time controls the differences get smaller than to think, as you do, that double the time is 2x better.

Still, maybe you are right. Can you tell us at which time control Komodo will overtake?

rbarreira · Post by **rbarreira** » Sun Feb 26, 2012 8:54 pm

IGarcia wrote:
Don wrote:
It doesn't HAVE to become superior, but if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it. A less logical conclusion is that it will suddenly stop improving faster than Houdini, but you should at least present some rationale for thinking that is likely to happen.
You are assuming the trend will not change. Where is the proof that your measurements are linear, or follow some other function?

I'm not speaking about a trend "suddenly stopping", but I don't buy the continuous improvement you speak of.

It is hard for me, as a non-native English speaker, to explain, but you can't prove you are right just because I can't prove you are incorrect.

Your way of thinking is why many people jump into the stock market and get burned when bubbles burst.

It's much more logical to think that at long time controls the differences get smaller than to think, as you do, that double the time is 2x better.

Still, maybe you are right. Can you tell us at which time control Komodo will overtake?

The trend could indeed suddenly stop, if it was simply due to more draws at longer time controls and therefore the weaker engine catching up as the time control gets longer. But I guess Don has considered this...

If that's not the explanation, what could explain Komodo scaling better to longer time controls? Would a lower branching factor explain it, or better eval perhaps (does better eval even improve scaling?).

Uri Blass · Post by **Uri Blass** » Sun Feb 26, 2012 11:53 pm

Don wrote:
It doesn't HAVE to become superior, but if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it. A less logical conclusion is that it will suddenly stop improving faster than Houdini, but you should at least present some rationale for thinking that is likely to happen.

I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."

One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time controls.

I think that you need to start not with equal time controls but with a time control that gives a result close to 50%.

If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time controls, and even then it is not something I feel sure about.
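
To make the counterexample concrete, here is a minimal Python sketch of the arithmetic. The numbers are hypothetical, not measurements, and `remaining_gap` is just an illustrative helper: it subtracts a per-doubling gain from an initial Elo gap, where each gain is `ratio` times the previous one.

```python
def remaining_gap(initial_gap, first_gain, ratio, doublings):
    """Elo gap left after `doublings` doublings.

    The weaker engine gains `first_gain` Elo on the first doubling and
    `ratio` times the previous gain on every later doubling.
    """
    gap = initial_gap
    gain = first_gain
    for _ in range(doublings):
        gap -= gain
        gain *= ratio
    return gap


# Hypothetical numbers, chosen only to illustrate the argument.
constant = remaining_gap(100.0, 20.0, 1.0, 10)   # gains never shrink
shrinking = remaining_gap(100.0, 20.0, 0.5, 10)  # gains halve every doubling

print("constant gains: ", constant)
print("shrinking gains:", shrinking)
```

With constant gains the gap changes sign after a few doublings. With gains that halve each time, the total gain can never exceed twice the first gain, so a gap larger than that is never closed no matter how many doublings are added.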

Don · Post by **Don** » Mon Feb 27, 2012 2:35 pm

Uri Blass wrote:
Don wrote:
It doesn't HAVE to become superior, but if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it. A less logical conclusion is that it will suddenly stop improving faster than Houdini, but you should at least present some rationale for thinking that is likely to happen.

I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."

One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time controls.

That's true, but I am assuming that you are talking about 2 programs that are already in the same general strength category, which makes the effect you are speaking about a minor thing. You would not expect Houdini 32 to gain 20 ELO every time you double the level against Houdini 64; you probably would see something like 5 ELO or less.

But you do have a valid point. I could have rated all the players in a big pool like I did in this other study here:

http://komodochess.com/pub/scale.png

This is a graph where the X-axis is time (on a logarithmic scale) and the Y-axis is the ELO rating. We can make the adjustment you suggest by simply moving any given line forward or backward along the X (time) axis to "normalize" the ELO with time adjustments.

I think that you need to start not with equal time controls but with a time control that gives a result close to 50%.

I'm going to restart the study and rate the programs together instead in order to address this concern.

If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time controls, and even then it is not something I feel sure about.

I did not mention the draw factor either. As programs get stronger there are more and more draws which could be the biggest reason we see a decline in all programs with more time. I see this as a natural consequence of the fact that as programs think longer they get a little closer to perfect play. One could imagine that if the only thing happening is that Komodo was just drawing more games we would see it asymptotically approach Houdini's strength but never quite reach it.

So I rated all the games that were NOT draws to see if Houdini's advantage was more or less constant in decisive games - and the results were very similar, Komodo continues to win more decisive games with each doubling. I'm not sure how to interpret that result or if it has any particular significance but I thought it was interesting.

Here are the 2 tables - you can see the ELO change in the oppo column:

All data including draws:

Rank Name      Elo      +      -    games   score   oppo.   draws
   1 hou-00  3000.0   16.4   16.4    2000   68.1%  2865.7   21.7%
   1 hou-01  3000.0   15.9   15.9    2000   63.5%  2903.0   24.5%
   1 hou-02  3000.0   15.5   15.5    2000   60.8%  2925.4   28.7%
   1 hou-03  3000.0   15.4   15.4    2000   57.7%  2947.2   29.7%
   1 hou-04  3000.0   15.2   15.2    2000   55.8%  2960.5   32.1%
   1 hou-05  3000.0   15.5   15.5    1867   54.1%  2972.9   35.4%


With draws removed:

Rank Name      Elo      +      -    games   score   oppo.   draws
   1 hou-00  3000.0   21.7   21.7    1567   73.1%  2787.9    0.0%
   1 hou-01  3000.0   21.1   21.1    1510   67.9%  2839.7    0.0%
   1 hou-02  3000.0   21.4   21.4    1426   65.1%  2865.7    0.0%
   1 hou-03  3000.0   21.1   21.1    1407   61.0%  2902.4    0.0%
   1 hou-04  3000.0   21.3   21.3    1359   58.6%  2924.6    0.0%
   1 hou-05  3000.0   22.5   22.5    1202   56.2%  2945.6    0.0%
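
For readers who want to read the "ELO change in the oppo column" directly, here is a small Python sketch. It assumes only what the tables show: Houdini is pinned at 3000, so Houdini's lead at each level is 3000 minus the oppo. value, and Komodo's gain per doubling is the drop in that lead from one level to the next. The list and function names are made up for the illustration.

```python
# "oppo." values copied from the two tables above (levels 00 to 05).
HOUDINI_ELO = 3000.0
oppo_all = [2865.7, 2903.0, 2925.4, 2947.2, 2960.5, 2972.9]
oppo_decisive = [2787.9, 2839.7, 2865.7, 2902.4, 2924.6, 2945.6]


def gaps_and_gains(oppo):
    """Return Houdini's lead per level and Komodo's gain per doubling."""
    gaps = [round(HOUDINI_ELO - o, 1) for o in oppo]
    gains = [round(a - b, 1) for a, b in zip(gaps, gaps[1:])]
    return gaps, gains


for label, oppo in (("all games", oppo_all), ("decisive only", oppo_decisive)):
    gaps, gains = gaps_and_gains(oppo)
    print(label)
    print("  Houdini lead per level:  ", gaps)
    print("  Komodo gain per doubling:", gains)
```

In both tables every gain is positive, meaning Houdini's lead shrinks at each doubling. Keep the listed error bars (roughly 15 to 22 Elo) in mind when comparing individual gains.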

Jouni · Post by **Jouni** » Mon Feb 27, 2012 5:18 pm

In the Deep Junior challenge (125m/50 moves, 6 CPU, ponder on), the current scores are:

Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%

We will only know Komodo's score once the MP version is out.

Milos · Post by **Milos** » Mon Feb 27, 2012 5:52 pm

Uri Blass wrote: I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."

One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time controls.

I think that you need to start not with equal time controls but with a time control that gives a result close to 50%.

If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time controls, and even then it is not something I feel sure about.

Finally someone who understands and doesn't write the usual nonsense.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to extend the conclusion to a TC that is more than 3 orders of magnitude longer. This is at least 10 doublings, and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10)

Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
            HOUDINI   gains
            -------  ------
 Level 00 -  +143.3
 Level 01 -   +97.0   +46.3
 Level 02 -   +74.6   +22.4
 Level 03 -   +52.8   +21.8
 Level 04 -   +39.5   +13.3
 Level 05 -   +27.0   +12.5
 Level 06 -   +14.5
 Level 07 -    +2.0
 Level 08 -   -10.5
 Level 09 -   -23.0
 Level 10 -   -35.5

At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini. (even with shape-preserving cubic extrapolation, at the end Komodo should still be 22Elo stronger!)
In reality Komodo is not stronger at any time control, eventually there might be a point where it becomes equal (asymptotic point with extremely large time per move).
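
The extrapolated rows in the table above can be reproduced with a few lines of Python. This is only a sketch of "simplest linear extrapolation" as it appears in the table, that is, repeating the last measured per-level change (12.5 Elo) for every further level. The function names are invented for the example.

```python
# Houdini's lead at levels 00 to 05, copied from the table above.
measured = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]


def extrapolate_linear(leads, upto_level):
    """Extend the series by repeating the last measured per-level change."""
    step = leads[-1] - leads[-2]
    out = list(leads)
    while len(out) <= upto_level:
        out.append(round(out[-1] + step, 1))
    return out


def time_control(level, base=6.0, inc=0.1):
    """Base time and increment in seconds when level 00 is 6 + 0.1."""
    factor = 2 ** level
    return base * factor, inc * factor


for level, lead in enumerate(extrapolate_linear(measured, 10)):
    base, inc = time_control(level)
    print(f"Level {level:02d}  {base:7.1f}s + {inc:5.1f}s  Houdini lead {lead:+.1f}")
```

The result depends entirely on the assumption that the last measured gain repeats forever. A different assumption about how the gains shrink gives a very different level 10 figure.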

So Don, please stop with nonsense, not all the ppl here are idiots to believe your marketing...

Milos · Post by **Milos** » Mon Feb 27, 2012 6:04 pm

Don wrote:I did not mention the draw factor either. As programs get stronger there are more and more draws which could be the biggest reason we see a decline in all programs with more time. I see this as a natural consequence of the fact that as programs think longer they get a little closer to perfect play.

If you really think so you are even more ignorant than what I've thought.
More draws is not a reason for decline in Elo gain at all, that's just ridiculous.
Decline in Elo is because EBF decreases with longer TC's and you don't gain the same Elo per doubling anymore (you don't reach the next ply as fast as before).
Gain per additional ply is constant and it doesn't depend on absolute Elo value (absolute value of the searched depth), i.e. there is no diminishing return there, or at least the diminishing return is not measurable. This is proven in a bunch of research papers.
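
The link between EBF and Elo per doubling can be written down under a simple model. The sketch below assumes that the time needed to reach depth d grows as ebf**d and that each extra ply is worth a fixed number of Elo points. Both the model and the example values (including the 50 Elo per ply) are assumptions for illustration, not figures from this thread.

```python
import math


def plies_per_doubling(ebf):
    """Extra depth bought by doubling the time, if time ~ ebf ** depth."""
    return math.log(2) / math.log(ebf)


def elo_per_doubling(ebf, elo_per_ply):
    """Elo gained per doubling when every extra ply is worth elo_per_ply."""
    return elo_per_ply * plies_per_doubling(ebf)


# Hypothetical values only.
for ebf in (1.6, 2.0, 2.5):
    print(f"EBF {ebf}: {plies_per_doubling(ebf):.2f} plies, "
          f"{elo_per_doubling(ebf, 50.0):.1f} Elo per doubling")
```

In this model the Elo per doubling depends only on how the EBF changes with depth: a larger EBF means fewer plies per doubling, a smaller one means more.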

lkaufman · Post by **lkaufman** » Mon Feb 27, 2012 6:11 pm

"In reality Komodo is not stronger at any time control". What is your basis for this statement? I admit the one elo lead for Komodo 4 on the 40/2hr list is not significant, but what reason is there to doubt that things would only get better for Komodo relative to Houdini with even more time?

Milos · Post by **Milos** » Mon Feb 27, 2012 6:19 pm

lkaufman wrote: "In reality Komodo is not stronger at any time control". What is your basis for this statement? I admit the one elo lead for Komodo 4 on the 40/2hr list is not significant, but what reason is there to doubt that things would only get better for Komodo relative to Houdini with even more time?

If you understand the principles of interpolation, then with the points that Don published (let's say they are truthful) and the point at 40/120 from CEGT it is clear that even at infinite TC Komodo cannot surpass Houdini (they will remain equal in strength, i.e. their difference will stay indistinguishable), assuming monotonicity of the first derivative of the Elo gain per doubling curve.
Assuming anything else is just wishful thinking and is the exact thing Don is presenting here.
