Based on this idea and what Uri also suggested (which I think are the same basic concept), I started a new study - one that will probably take weeks to complete because it's being run on a slow quad-core laptop, a spare machine that I do not use much.
The new study combines both programs at levels 00 to 08 and rates them all together in a massive round robin. The result will be plotted on a graph, and I can handicap Houdini as Uri suggests by adjusting the x-axis, which is the equivalent of handicapping one of the programs by time. So if I add 0.5 to Houdini's x-axis, for example, it's like handicapping it by half a doubling, which is the same as giving Komodo 1.414 times as much time.
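The handicap arithmetic can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the only inputs taken from the thread are that level 00 is 6 + 0.1 and that each level doubles the previous one.

```python
def time_factor(handicap_doublings: float) -> float:
    """Time multiplier equivalent to shifting a curve by this many doublings."""
    return 2.0 ** handicap_doublings


def level_time(level: float, base_seconds: float = 6.0, increment: float = 0.1):
    """Base time and increment (seconds) at a level; level 00 is 6 + 0.1."""
    scale = 2.0 ** level
    return base_seconds * scale, increment * scale


# Half a doubling corresponds to a factor of sqrt(2) in time.
print(round(time_factor(0.5), 3))  # 1.414

# Level 08, the top level of the round robin.
print(level_time(8))  # (1536.0, 25.6)
```

A fractional level works the same way, so a handicap of 0.5 at level 03 is the same as running at "level 2.5".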
The idea is to make an adjustment that causes the lines to intersect at some arbitrary point near the center. The intersection point will define the Elo rating at which they are equivalent given the specified handicap.
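As a rough sketch of how such an intersection could be located, suppose (this is an assumption, the real curves will bend because of diminishing returns) that each program's rating is a straight line in the level. Shifting Houdini's curve right by the handicap means that at plotted position x it is really playing at level x - handicap. The function name and all coefficients below are hypothetical.

```python
def crossover(a_k, b_k, a_h, b_h, handicap):
    """Intersection of Komodo's line a_k + b_k*x with Houdini's shifted line.

    Houdini's line after the shift is a_h + b_h*(x - handicap).
    Returns (x, rating), or None if the two lines are parallel.
    """
    if b_k == b_h:
        return None
    x = (a_h - b_h * handicap - a_k) / (b_k - b_h)
    return x, a_k + b_k * x


# Arbitrary illustrative coefficients, NOT measured data from the study.
x, rating = crossover(a_k=0.0, b_k=60.0, a_h=100.0, b_h=50.0, handicap=0.5)
print(x, rating)  # 7.5 450.0
```

Changing the handicap moves the crossing point along the x-axis, which is the adjustment described above.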
It looks like I will only get 20 or 30 games per day per player (18 players) so it will be a few days before there is even enough data to produce relatively stable lines and a few weeks for the lines to really be very precise.
Ajedrecista wrote:
Hello:
I have seen that Milos has done an extrapolation and I also want to post my own. With the data provided by Don:
Milos wrote:
Finally someone who understands and doesn't write the usual nonsense.
Uri Blass wrote:
I disagree about the following:
"if it is gaining more ELO per doubling than Houdini, then the logical conclusion is that it WILL overtake it."
One example may be enough to prove that the conclusion is wrong: I guess that Houdini 32-bit gains more Elo per doubling than Houdini 64-bit because of diminishing returns, but Houdini 32-bit is not going to overtake Houdini 64-bit at long time controls.
I think that you need to start not with equal time controls but with time controls that give a result close to 50%.
If you find that the program that uses more time earns more from doubling in this case, then it is more logical to think that it can beat the stronger program at long time controls, and even in this case I do not feel sure about it.
I agree with everything you write.
Don is as usual writing a load of crap and pretending he doesn't understand (for marketing reasons).
He uses extremely short time controls where the initial difference between 2 engines is exaggerated and then tries to apply the conclusion to something that is more than 3 orders of magnitude longer in TC. This is at least 10 doublings, and just for fun I extrapolated his table to long TCs (40moves/120mins) using the simplest linear extrapolation.
So let's see (40moves/120mins is 3min/move and is approximately equivalent to 6000sec+100sec - 10 doublings - Level 10):
Code:
Level where 00 is 6 + 0.1 and each successive level is double.

                      Komodo
           HOUDINI     gains
           -------    ------
Level 00 -  +143.3
Level 01 -   +97.0     +46.3
Level 02 -   +74.6     +22.4
Level 03 -   +52.8     +21.8
Level 04 -   +39.5     +13.3
Level 05 -   +27.0     +12.5
Level 06 -   +14.5
Level 07 -    +2.0
Level 08 -   -10.5
Level 09 -   -23.0
Level 10 -   -35.5
At 40moves/120mins Komodo should be 35.5 Elo stronger than Houdini (even with shape-preserving cubic extrapolation, at the end Komodo should still be 22 Elo stronger!).
In reality Komodo is not stronger at any time control; eventually there might be a point where it becomes equal (an asymptotic point with extremely large time per move).
So Don, please stop with the nonsense, not all the people here are idiots who believe your marketing...
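For reference, the linear extrapolation Milos describes can be reproduced in a few lines. This sketch assumes, as his table suggests, that the last measured gain (12.5 Elo from level 04 to level 05) is simply repeated for every further doubling; the function name is made up for illustration.

```python
# Houdini's Elo lead over Komodo at levels 00..05 (Don's data from the table).
houdini_lead = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]


def extend_linearly(values, last_level):
    """Repeat the final step until the list covers levels 0..last_level."""
    step = values[-1] - values[-2]  # 27.0 - 39.5 = -12.5
    out = list(values)
    while len(out) <= last_level:
        out.append(out[-1] + step)
    return out


extended = extend_linearly(houdini_lead, last_level=10)
print(extended[6:])  # [14.5, 2.0, -10.5, -23.0, -35.5]

# Time control at level 10: 6 s + 0.1 s doubled ten times.
print(6 * 2 ** 10, round(0.1 * 2 ** 10, 1))  # 6144 102.4
```

The level 10 time control of about 6144 + 102.4 seconds is what Milos rounds to 6000sec+100sec.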
I have done a least-squares fit by hand, with only the help of a Casio calculator (I hope there are no errors in my calculations), using Don's data:
Code:
Level where 00 is 6 + 0.1 and each successive level is double.

                      Komodo
           HOUDINI     gains
           -------    ------
Level 00 -  +143.3
Level 01 -   +97.0     +46.3
Level 02 -   +74.6     +22.4
Level 03 -   +52.8     +21.8
Level 04 -   +39.5     +13.3
Level 05 -   +27.0     +12.5
I explain what I have done in a little more detail:
Code:
x axis ---> the level (time control).
y axis ---> Houdini Elo advantage over Komodo.
I rescaled the x axis in the following way: the x axis is in logarithmic scale (base e), and I add 1 to each level to avoid ln(0). So, when I write ln(1), it means Don's level 0, and so on; the level 10 pointed out by Milos is ln(11) on my x axis. I calculated the least squares by hand with a system of equations, in this way:
Code:
Y(x) = m·ln(x) + n

Y numbers go to 'HOUDINI' column.

AX = B

Symmetric matrix of size 2x2: A = [N, SUM(x_i); SUM(x_i), SUM((x_i)²)]
Vector of size 2x1: X = [n, m]
Vector of size 2x1: B = [SUM(y_i), SUM((x_i)·(y_i))]

N is the number of (x, y) known data; here: N = 6.

SUM(x_i) = ln(1) + ln(2) + ... + ln(6) ~ 6.5793
SUM((x_i)²) = [ln(1)]² + [ln(2)]² + ... + [ln(6)]² ~ 9.4099
SUM(y_i) = 143.3 + 97 + ... + 27 = 434.2
SUM(x_i·y_i) = ln(1)·143.3 + ln(2)·97 + ... + ln(6)·27 ~ 334.3384

Det.(A) = N·SUM((x_i)²) - [SUM(x_i)]² ~ 13.1729
Solving by Cramer, I got (more or less, roundings included):
Code:
n ~ 143.1793
m ~ -64.5781
So, renaming the levels to their original names (from 0 to 10 instead of from 1 to 11) and adding a couple of levels, this is what I get:
Code:
OWN ADJUST:
===========

Level where 00 is 6 + 0.1 and each successive level is double.

                      Komodo
           HOUDINI     gains
           -------    ------
Level 00 -  +143.2
Level 01 -   +98.4     +44.8
Level 02 -   +72.2     +26.2
Level 03 -   +53.7     +18.5
Level 04 -   +39.2     +14.5
Level 05 -   +27.5     +11.7
Level 06 -   +17.5     +10.0
Level 07 -    +8.9      +8.6
Level 08 -    +1.3      +7.6
Level 09 -    -5.5      +6.8
Level 10 -   -11.7      +6.2
Level 11 -   -17.3      +5.6
Level 12 -   -22.5      +5.2
So, I get smaller differences than Milos with my model. Given that 2000 games are played at each level, I expect (at not very short time controls, where the draw ratio should rise with longer TC) uncertainties of less than ±13 or ±14 Elo (with 95% confidence), which should not be forgotten. In the model I work with, the squared uncertainties (variances) are proportional to score(Houdini)*score(Komodo) - (draw_ratio)/4, so more draws mean smaller uncertainties. This model works fine for scores in the range from 10% to 90%, which is the case here.
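The uncertainty model mentioned above can be sketched in Python. Several things here are assumptions rather than anything stated in the thread: the usual logistic Elo formula is used to convert scores to Elo, 1.96 standard deviations stand for 95% confidence, and the draw ratios are hypothetical since the real ones were not posted. The expression score*(1 - score) - draw_ratio/4 is the per-game variance of a result that counts 1, 0.5 or 0.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_margin(score, draw_ratio, games, z=1.96):
    """Approximate +/- Elo margin for a match of `games` games."""
    variance = score * (1.0 - score) - draw_ratio / 4.0  # per-game variance
    sd = math.sqrt(variance / games)                     # sd of the mean score
    upper = elo_from_score(score + z * sd)
    lower = elo_from_score(score - z * sd)
    return (upper - lower) / 2.0


# Hypothetical draw ratios for a 50% score over 2000 games.
for draws in (0.2, 0.3, 0.4):
    print(draws, round(elo_margin(0.5, draws, 2000), 1))
```

With an assumed draw ratio of 20% and a score near 50%, the margin for 2000 games lands between 13 and 14 Elo, in line with the figure quoted above, and it shrinks as the draw ratio grows.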
I wonder which of us (Milos or me) is more accurate in the extrapolation with the given data (I know that there are tons of ways to extrapolate). I see that in Milos' extrapolation the missing numbers in the 'Komodo gains' column are {+12.5, +12.5, +12.5, +12.5, +12.5} (a very simple extrapolation, as Milos stated), which does not seem accurate to me because everyone can expect smaller gains at higher levels (in this sense, my extrapolation is a bit better). With cubic extrapolation, Milos gets -22 in the 'HOUDINI' column (I assume at level 10), which is closer to my extrapolation but still a little far from it.
As each level doubles the previous one, maybe it would make more sense to rescale my x axis with base-2 logarithms (instead of base e), but it was much easier for me to do the math with natural logarithms. Comments, corrections, etc. are welcome, so please leave your impressions. Good luck to all the programmers with their engines!
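The hand calculation can be cross-checked with a short script that builds the same sums and solves the 2x2 system by Cramer's rule. The helper names are made up for illustration. The script also runs the fit with base-2 logarithms: since log2(x) = ln(x)/ln(2), only the slope m is rescaled and the predicted values stay the same.

```python
import math

# Houdini's Elo lead over Komodo at levels 00..05 (Don's data).
houdini_lead = [143.3, 97.0, 74.6, 52.8, 39.5, 27.0]


def fit_log(ys, log=math.log):
    """Least-squares fit of Y = m*log(level + 1) + n; returns (n, m)."""
    xs = [log(level + 1) for level in range(len(ys))]
    count = len(ys)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = count * sxx - sx * sx        # Det.(A)
    n = (sy * sxx - sx * sxy) / det    # Cramer's rule, first unknown
    m = (count * sxy - sx * sy) / det  # Cramer's rule, second unknown
    return n, m


def predict(level, n, m, log=math.log):
    """Fitted Houdini lead at a given level (0-based, as in Don's table)."""
    return n + m * log(level + 1)


n_e, m_e = fit_log(houdini_lead)
print(round(n_e, 4), round(m_e, 4))

n_2, m_2 = fit_log(houdini_lead, log=math.log2)
for level in (8, 10, 12):
    print(level,
          round(predict(level, n_e, m_e), 1),
          round(predict(level, n_2, m_2, log=math.log2), 1))
```

Both fits give the same predictions at every level, so the choice of natural logarithms does not affect the extrapolated table.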
Regards from Spain.
Ajedrecista.