A typo, should be "increases" instead.
Milos wrote:Decline in Elo is because EBF decreases with longer TC's
Number 1 engine on long time controls
-
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: Number 1 engine on long time controls
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Number 1 engine on long time controls
I believe that Komodo will be stronger at 40/120, but using my data, with the huge error margins from samples of only 2000 games, can throw you off by a significant amount. It seems like you would be smart enough to figure that out.
Milos wrote:Finally someone who understands and doesn't write the usual nonsense.
Uri Blass wrote:I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."
One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time control.
I think that you need to start not with equal time controls but with time controls that give a result close to 50%.
If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time control, and even in this case it is not something that I feel sure about.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to imply the conclusion on something that is more than 3 orders of magnitude longer TC. This is at least 10 doublings and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10)
At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini. (even with shape-preserving cubic extrapolation, at the end Komodo should still be 22 Elo stronger!)
Code: Select all
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
           HOUDINI   gains
           -------   ------
Level 00 - +143.3
Level 01 -  +97.0    +46.3
Level 02 -  +74.6    +22.4
Level 03 -  +52.8    +21.8
Level 04 -  +39.5    +13.3
Level 05 -  +27.0    +12.5
Level 06 -  +14.5
Level 07 -   +2.0
Level 08 -  -10.5
Level 09 -  -23.0
Level 10 -  -35.5
Also you should not put words in my mouth to make your criticism seem so dramatic. I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.
Nevertheless, I think your extrapolation is probably correct within 25 ELO or so, but don't write that I claimed that; I just think it's a possibility.
I showed actual data in the interest of transparency and fairness. What data do you have to show me to support your assertion? Even if it turns out that I am wrong I DID actually supply data so that people would be free to draw their own conclusions, all you did was criticize. The interpretation of the data is something I believe but it's just an interpretation and I'm not claiming anything else.
In reality Komodo is not stronger at any time control, eventually there might be a point where it becomes equal (asymptotic point with extremely large time per move).
So Don, please stop with nonsense, not all the ppl here are idiots to believe your marketing...
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 543
- Joined: Mon Jul 05, 2010 10:27 pm
Re: Number 1 engine on long time controls
I asked you several posts ago: at which level do you think Komodo will pass Houdini according to your data? You never replied.
Don wrote: I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.
It's fair to ask because you are suggesting that at some point Komodo will pass. It will be interesting to hear your predictions.
Respectfully, Ignacio.
-
- Posts: 1471
- Joined: Tue Mar 16, 2010 12:00 am
Re: Number 1 engine on long time controls
Amazing results.
Jouni wrote:In Deep Junior challenge (125m/50moves 6 CPU ponder on) current score:
Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%
We only know Komodo score when MP version is out.
The ICGA World Champion getting trounced...
Robert
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Number 1 engine on long time controls
Can you point me to any reasonably modern paper that comes to that conclusion? All of my own experience indicates the opposite, that each additional ply gains less elo than the previous one. I recall quite clearly reaching that conclusion with respect to Rybka.
Milos wrote:If you really think so you are even more ignorant than what I've thought.
Don wrote:I did not mention the draw factor either. As programs get stronger there are more and more draws which could be the biggest reason we see a decline in all programs with more time. I see this as a natural consequence of the fact that as programs think longer they get a little closer to perfect play.
More draws is not a reason for decline in Elo gain at all, that's just ridiculous.
Decline in Elo is because EBF decreases with longer TC's and you don't gain the same Elo per doubling anymore (you don't reach the next ply as fast as before).
Gain per additional ply is constant and it doesn't depend on absolute Elo value (absolute value of the searched depth), i.e. there is no diminishing return there or at least diminishing return is not measurable. This is proven in a bunch of research papers.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Number 1 engine on long time controls
IGarcia wrote:I asked you several posts ago: at which level do you think Komodo will pass Houdini according to your data? You never replied.
Don wrote: I never tried to predict that it would be 35 ELO stronger nor did I even say at which level it would pass Houdini.
It's fair to ask because you are suggesting that at some point Komodo will pass. It will be interesting to hear your predictions.
Respectfully, Ignacio.
I'm still trying to understand the data I am seeing, so it would be premature to make predictions. I don't remember you asking, but if I did see it I probably was not prepared to give you an answer.
In the 40/120 CEGT rating list Komodo is showing as 1 ELO stronger than Houdini 1.5, based on 400 Komodo games and 600 Houdini games. These are time-adjusted for older hardware, so it's not really 40/120, it's something faster. The error margin there is something like 20 ELO, which is somewhat similar to what Milos reports.
If you want to pin me down to a figure I cannot give you one, but I would say that at 40/120 on modern hardware we are within 10 ELO in either direction. The latest development version of Komodo will be about 20 more than that.
http://www.husvankempen.de/nunn//40120n ... liste.html
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Number 1 engine on long time controls
I think Milos is talking about very old studies which did seem to indicate that programs scaled in linear fashion with depth. I remember Thompson showing 250 for each ply but not with reasonable sample sizes. That was based on depths that were pathetic by today's standards and also full width programs.
lkaufman wrote:Can you point me to any reasonably modern paper that comes to that conclusion? All of my own experience indicates the opposite, that each additional ply gains less elo than the previous one. I recall quite clearly reaching that conclusion with respect to Rybka.
Milos wrote:If you really think so you are even more ignorant than what I've thought.
Don wrote:I did not mention the draw factor either. As programs get stronger there are more and more draws which could be the biggest reason we see a decline in all programs with more time. I see this as a natural consequence of the fact that as programs think longer they get a little closer to perfect play.
More draws is not a reason for decline in Elo gain at all, that's just ridiculous.
Decline in Elo is because EBF decreases with longer TC's and you don't gain the same Elo per doubling anymore (you don't reach the next ply as fast as before).
Gain per additional ply is constant and it doesn't depend on absolute Elo value (absolute value of the searched depth), i.e. there is no diminishing return there or at least diminishing return is not measurable. This is proven in a bunch of research papers.
But I think Heinz did a very thorough study a few years later that showed convincing evidence that this really does fall off with depth, and this was still many plies less than we can practically do today.
I think chessprogramming has a reference to this, and if it's the same paper I remember, it was called "New Self-Play Results in Computer Chess."
It's possible that full width programs have different scaling characteristics per ply than modern programs that make heavy use of LMR and other aggressive pruning techniques.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Number 1 engine on long time controls
A theory going around is that Junior is very scalable, but this data would indicate that this is not true.
Houdini wrote:Amazing results.
Jouni wrote:In Deep Junior challenge (125m/50moves 6 CPU ponder on) current score:
Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%
We only know Komodo score when MP version is out.
The ICGA World Champion getting trounced...
Robert
-
- Posts: 2132
- Joined: Wed Jul 13, 2011 9:04 pm
- Location: Madrid, Spain.
Re: Number 1 engine on long time controls.
Hello:
Milos wrote:Finally someone who understands and doesn't write the usual nonsense.
Uri Blass wrote:I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."
One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more ELO per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time control.
I think that you need to start not with equal time controls but with time controls that give a result close to 50%.
If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time control, and even in this case it is not something that I feel sure about.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to imply the conclusion on something that is more than 3 orders of magnitude longer TC. This is at least 10 doublings and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10)
At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini. (even with shape-preserving cubic extrapolation, at the end Komodo should still be 22 Elo stronger!)
Code: Select all
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
           HOUDINI   gains
           -------   ------
Level 00 - +143.3
Level 01 -  +97.0    +46.3
Level 02 -  +74.6    +22.4
Level 03 -  +52.8    +21.8
Level 04 -  +39.5    +13.3
Level 05 -  +27.0    +12.5
Level 06 -  +14.5
Level 07 -   +2.0
Level 08 -  -10.5
Level 09 -  -23.0
Level 10 -  -35.5
In reality Komodo is not stronger at any time control, eventually there might be a point where it becomes equal (asymptotic point with extremely large time per move).
So Don, please stop with nonsense, not all the ppl here are idiots to believe your marketing...
I have seen that Milos has done an extrapolation and I also want to post my own one. With the data provided by Don:
Code: Select all
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
           HOUDINI   gains
           -------   ------
Level 00 - +143.3
Level 01 -  +97.0    +46.3
Level 02 -  +74.6    +22.4
Level 03 -  +52.8    +21.8
Level 04 -  +39.5    +13.3
Level 05 -  +27.0    +12.5
I have done a least-squares fit by hand, with only the help of a Casio calculator (I hope there are no errors in my calculations). I explain what I have done in a little more detail:
Code: Select all
x axis ---> the level (time control).
y axis ---> Houdini Elo advantage over Komodo.
I rescaled the x axis in the following way: the x axis is in logarithmic scale (base e), and I add 1 to each level to avoid ln(0). So, when I put ln(1), it means Don's level 0, and so on; level 10 pointed out by Milos is ln(11) on my x axis. I calculated the least squares by hand with a system of equations, in this way:
Code: Select all
Y(x) = m·ln(x) + n
Y numbers go to 'HOUDINI' column.
AX = B
Symmetric matrix of size 2x2: A = [N, SUM(ln(x_i)); SUM(ln(x_i)), SUM([ln(x_i)]²)]
Vector of size 2x1: X = [n, m]
Vector of size 2x1: B = [SUM(y_i), SUM(ln(x_i)·y_i)]
N is the number of known (x, y) data points; here: N = 6.
SUM(ln(x_i)) = ln(1) + ln(2) + ... + ln(6) ~ 6.5793
SUM([ln(x_i)]²) = [ln(1)]² + [ln(2)]² + ... + [ln(6)]² ~ 9.4099
SUM(y_i) = 143.3 + 97 + ... + 27 = 434.2
SUM(ln(x_i)·y_i) = ln(1)·143.3 + ln(2)·97 + ... + ln(6)·27 ~ 334.3384
Det.(A) = N·SUM([ln(x_i)]²) - [SUM(ln(x_i))]² ~ 13.1729
Solving by Cramer's rule, I got (more or less, rounding included):
Code: Select all
n ~ 143.1793
m ~ -64.5781
So, renaming the levels back to their original names (from 0 to 10 instead of from 1 to 11) and adding a couple of levels, this is what I get:
Code: Select all
OWN ADJUST:
===========
Level where 00 is 6 + 0.1 and each successive level is double.
                     Komodo
           HOUDINI   gains
           -------   ------
Level 00 - +143.2
Level 01 -  +98.4    +44.8
Level 02 -  +72.2    +26.2
Level 03 -  +53.7    +18.5
Level 04 -  +39.2    +14.5
Level 05 -  +27.5    +11.7
Level 06 -  +17.5    +10.0
Level 07 -   +8.9     +8.6
Level 08 -   +1.3     +7.6
Level 09 -   -5.5     +6.8
Level 10 -  -11.7     +6.2
Level 11 -  -17.3     +5.6
Level 12 -  -22.5     +5.2
So, I get smaller differences than Milos with my model. Given that 2000 games are played at each level, I expect (at not very short time controls, where the draw ratio should rise with longer TC) uncertainties of less than ±13 or ±14 Elo (with 95% confidence), which should not be forgotten. In the model I work with, uncertainties are proportional to score(Houdini)*score(Komodo) - (draw_ratio)/4, so more draws mean smaller uncertainties. This model works fine for scores in the range from 10% to 90%, which is the case here.
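The ±13 or ±14 Elo uncertainty quoted in this post for 2000 games can be checked with a short sketch. It assumes the per-game variance score*(1 - score) - draw_ratio/4 named in the post, a normal approximation with z = 1.96 for 95% confidence, and the usual logistic Elo curve. The function names and the 25% draw ratio in the last line are illustrative assumptions, not values taken from Don's data.

```python
import math

def elo_from_score(score: float) -> float:
    """Elo difference implied by an expected score under the logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin(score: float, draw_ratio: float, games: int, z: float = 1.96) -> float:
    """Approximate half-width of the confidence interval, in Elo."""
    # Variance of a single game result (1, 0.5 or 0): s*(1 - s) - d/4.
    variance = score * (1.0 - score) - draw_ratio / 4.0
    std_error = math.sqrt(variance / games)
    # Slope of the Elo curve at this score (delta method).
    slope = 400.0 / (math.log(10.0) * score * (1.0 - score))
    return z * std_error * slope

# Assumed inputs: even score, 25% draws, 2000 games per level.
print(round(elo_margin(0.5, 0.25, 2000), 1))
```

With these assumed inputs the margin comes out at roughly 13 Elo, and raising the draw ratio shrinks it, which matches the remark that more draws mean smaller uncertainties.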
I wonder which of us (Milos or me) is more accurate in the extrapolation with the given data (I know that there are tons of ways to extrapolate). I see that in Milos' extrapolation, the missing numbers in the 'Komodo gains' column are: {+12.5, +12.5, +12.5, +12.5, +12.5} (a very simple extrapolation, as Milos stated), which does not seem accurate to me because everyone can expect smaller gains at higher levels (in this sense, my extrapolation is a bit better). With cubic extrapolation, Milos gets -22 in the 'HOUDINI' column (I assume at level 10), which is closer to my extrapolation, but is still a little far.
As each level doubles the previous level, maybe it would make more sense to rescale my x axis with base-2 logarithms (instead of base e), but it was much easier for me to do the math with natural logarithms. Comments, corrections, etc. are welcome, so please leave your impressions. Good luck to all the programmers with their engines!
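On the question of base e versus base 2, here is a minimal sketch that repeats the least-squares fit on Don's six data points with both bases. The helper names (`fit_log_model`, `predict`) are made up for this illustration, and the fit uses the same 2x2 normal equations solved by Cramer's rule as in the hand calculation. Since log2(x) = ln(x)/ln(2), changing the base should only rescale m and leave the predicted Elo values unchanged.

```python
import math

# Houdini's Elo advantage over Komodo at levels 0..5 (Don's data in this thread).
HOUDINI_ADVANTAGE = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]

def fit_log_model(ys, log=math.log):
    """Least-squares fit of y = m*log(level + 1) + n; returns (m, n)."""
    xs = [log(level + 1) for level in range(len(ys))]
    count = len(ys)
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = count * sum_xx - sum_x ** 2            # determinant of the 2x2 system
    m = (count * sum_xy - sum_x * sum_y) / det   # Cramer's rule for the slope
    n = (sum_y * sum_xx - sum_x * sum_xy) / det  # Cramer's rule for the intercept
    return m, n

def predict(level, m, n, log=math.log):
    """Fitted Houdini advantage at a given level."""
    return m * log(level + 1) + n

m_e, n_e = fit_log_model(HOUDINI_ADVANTAGE)                 # natural logarithm
m_2, n_2 = fit_log_model(HOUDINI_ADVANTAGE, log=math.log2)  # base-2 logarithm

for level in (6, 8, 10, 12):
    print(level,
          round(predict(level, m_e, n_e), 1),
          round(predict(level, m_2, n_2, log=math.log2), 1))
```

Both columns printed by the loop should agree, so the choice of base affects only the size of m and not the extrapolated table.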
Regards from Spain.
Ajedrecista.
Last edited by Ajedrecista on Mon Feb 27, 2012 8:03 pm, edited 1 time in total.
-
- Posts: 6259
- Joined: Sun Jan 10, 2010 6:15 am
- Location: Maryland USA
- Full name: Larry Kaufman
Re: Number 1 engine on long time controls
Why is it amazing? It is in line with the predictions of the rating lists. The top programs didn't compete in the ICGA championship for various reasons, so I see nothing surprising here.
Houdini wrote:Amazing results.
Jouni wrote:In Deep Junior challenge (125m/50moves 6 CPU ponder on) current score:
Houdini 78,4%
Rybka 69,0%
Critter 67,1%
Stockfish 65,5%
We only know Komodo score when MP version is out.
The ICGA World Champion getting trounced...
Robert