Hi George:
geots wrote: Houdini 2.0c x64 v Engine 40x(2) -
UPDATE 6
This update takes us through game 606, meaning 83 games have been added since yesterday's update. Houdini increased his lead by 5 games, to +50 games.
Since I could not access the databases, and it would have been 3 crosstables anyway, I went ahead and computed the Elo difference here myself.
Code:
Houdini 2.0c x64   +29   +188/-138/=280   54.13%   328.0/606
Engine 40x(2)      -29   +138/-188/=280   45.87%   278.0/606
Interesting that before I started this match, 606 games and counting ago, I told a couple of people what the Elo difference of these two engines would be, with an error factor of plus or minus 5 Elo. With a couple of Elo of change in a particular direction over the next 394 games, I could easily be dead on the mark. But that is certainly nothing to brag about; it isn't like this is brain surgery.
Until tomorrow-
george
I guess that you thought Houdini was ahead of 40x(2) by (+25 ± 5) Elo, that is, between +20 Elo and +30 Elo. If I am right, those Elo advantages would be reached with final scores between 529 - 471 and 543 - 457 (in Houdini's favour, of course). In other words, Houdini should score between 201/394 (~ 51.02%) and 215/394 (~ 54.57%) in the remaining games to end up between +20 Elo and +30 Elo ahead after 1000 games... who knows?
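By the way, the conversion between score and Elo difference is the usual logistic one: Elo difference = 400*log10[µ/(1 - µ)], where µ is the score ratio. Here is a quick Fortran sketch with the numbers of this update (only an illustration, not the literal code of my programmes):
Code:
program score_to_elo
implicit none
integer, parameter :: dp = kind(1d0)
real(dp) :: mu

! Current score of Houdini 2.0c: 328 points out of 606 games.
mu = 328d0/606d0
write(*,'(A,F6.2,A)') 'Current Elo difference: ', 4d2*log10(mu/(1d0 - mu)), ' Elo'   ! ~ +28.73

! Hypothetical final scores after 1000 games: 529/1000 and 543/1000.
write(*,'(A,F6.2,A)') '529/1000 would give:    ', 4d2*log10(529d0/471d0), ' Elo'     ! ~ +20.18
write(*,'(A,F6.2,A)') '543/1000 would give:    ', 4d2*log10(543d0/457d0), ' Elo'     ! ~ +29.95
end program score_to_elo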
Here are my error bars for this match update:
Code:
Elo_uncertainties_calculator, ® 2012.
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------
(The input and output data is referred to the first engine).
Please write down non-negative integers.
Write down the number of wins:
188
Write down the number of loses:
138
Write down the number of draws:
280
***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************
---------------------------------------
Elo interval for 1-sigma confidence:
Elo rating difference: 28.73 Elo
Lower rating difference: 18.40 Elo
Upper rating difference: 39.12 Elo
Lower bound uncertainty: -10.33 Elo
Upper bound uncertainty: 10.39 Elo
Average error: +/- 10.36 Elo
K = (average error)*[sqrt(n)] = 255.02
Elo interval: ] 18.40, 39.12[
---------------------------------------
Elo interval for 2-sigma confidence:
Elo rating difference: 28.73 Elo
Lower rating difference: 8.10 Elo
Upper rating difference: 49.57 Elo
Lower bound uncertainty: -20.64 Elo
Upper bound uncertainty: 20.84 Elo
Average error: +/- 20.74 Elo
K = (average error)*[sqrt(n)] = 510.51
Elo interval: ] 8.10, 49.57[
---------------------------------------
Elo interval for 3-sigma confidence:
Elo rating difference: 28.73 Elo
Lower rating difference: -2.19 Elo
Upper rating difference: 60.12 Elo
Lower bound uncertainty: -30.92 Elo
Upper bound uncertainty: 31.39 Elo
Average error: +/- 31.15 Elo
K = (average error)*[sqrt(n)] = 766.93
Elo interval: ] -2.19, 60.12[
---------------------------------------
Number of games of the match: 606
Score: 54.13 %
Elo rating difference: 28.73 Elo
Draw ratio: 46.20 %
**********************************************
1 sigma: 1.4803 % of the points of the match.
2 sigma: 2.9605 % of the points of the match.
3 sigma: 4.4408 % of the points of the match.
**********************************************
End of the calculations.
Thanks for using Elo_uncertainties_calculator. Press Enter to exit.
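In case anybody wants to reproduce these error bars: I compute the standard deviation of the score ratio from the numbers of wins, draws and losses, and then I convert score ± k*sigma into Elo with the same logistic formula. This rough sketch is not the exact code of my programme, but it gives the same 1-sigma numbers of this update:
Code:
program match_error_bars
implicit none
integer, parameter :: dp = kind(1d0)
real(dp) :: w, l, d, n, mu, sigma

! Houdini 2.0c x64 vs. Engine 40x(2) after 606 games: +188/-138/=280.
w = 188d0; l = 138d0; d = 280d0
n = w + l + d
mu = (w + 5d-1*d)/n                              ! score ratio: ~ 54.13%
sigma = sqrt((w + 25d-2*d)/n - mu*mu)/sqrt(n)    ! standard deviation of the score ratio

! 1-sigma interval in Elo (use 2*sigma and 3*sigma for the other confidence levels).
write(*,'(A,F5.2,A,F5.2,A)') 'Elo interval for 1-sigma confidence: ] ', &
      4d2*log10((mu - sigma)/(1d0 - mu + sigma)), ', ', &
      4d2*log10((mu + sigma)/(1d0 - mu - sigma)), ' ['
write(*,'(A,F6.4,A)') '1 sigma: ', 1d2*sigma, ' % of the points of the match.'
end program match_error_bars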
I have just refined the last part of this programme's code (the part that prints 2 sigma ~ 2.9605 %, etc.) because I realized that:
Code:
2d-4*nint(1d6*sigma,KIND=3)   ! rounds sigma first, then doubles it (doubling the rounding error too).
3d-4*nint(1d6*sigma,KIND=3)   ! rounds sigma first, then triples it.
is not the same as:
Code:
1d-4*nint(2d6*sigma,KIND=3)   ! doubles sigma first, then rounds.
1d-4*nint(3d6*sigma,KIND=3)   ! triples sigma first, then rounds.
The correct one is the last code box (which is now in the code of my programme); the first one was a bad rounding (one more!), which is why I noticed some slightly strange things; but I think that everything is finally OK.
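The difference can be seen with the sigma of this very update (~ 1.4803% of the points of the match); the value of sigma below is only an approximation of mine:
Code:
program rounding_check
implicit none
integer, parameter :: dp = kind(1d0)
real(dp) :: sigma

sigma = 1.48027d-2                          ! ~ sigma of this update (1.4803% of the points).

! Bad rounding: sigma is rounded first and then doubled/tripled.
write(*,'(2F7.4)') 2d-4*nint(1d6*sigma), 3d-4*nint(1d6*sigma)   ! 2.9606  4.4409
! Good rounding: sigma is doubled/tripled first and then rounded.
write(*,'(2F7.4)') 1d-4*nint(2d6*sigma), 1d-4*nint(3d6*sigma)   ! 2.9605  4.4408
end program rounding_check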
Regarding the minimum score needed to avoid negative Elo gains at a given confidence level, this is what I get for this update (using my imperfect model):
Code:
90% confidence: 318 points for Houdini.
95% confidence: 321 points for Houdini.
98% confidence: 324 points for Houdini.
99% confidence: 326.5 points for Houdini.
99.5% confidence: 328.5 points for Houdini.
So, Houdini is better with more than 99% confidence and less than 99.5% confidence after these 606 games! I guess that Houdini is too much Houdini...
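For the curious: these minimum scores can be reproduced with a simple normal approximation (minimum score = 50% + z*sigma of this match, rounded up to the nearest half point, with two-sided values of z). It is not necessarily the exact code of my programme, but this sketch gives the same five values for this update:
Code:
program minimum_scores
implicit none
integer, parameter :: dp = kind(1d0)
real(dp) :: w, l, d, n, mu, sigma, z(5), c(5)
integer :: i

! Standard deviation of the score ratio of this match (the 1.4803% printed above).
w = 188d0; l = 138d0; d = 280d0
n = w + l + d
mu = (w + 5d-1*d)/n
sigma = sqrt((w + 25d-2*d)/n - mu*mu)/sqrt(n)

! Two-sided z values for each confidence level; the minimum score is
! 50% + z*sigma, rounded up to the nearest half point.
c = [9d1, 95d0, 98d0, 99d0, 995d-1]
z = [1.6449d0, 1.96d0, 2.3263d0, 2.5758d0, 2.807d0]
do i = 1, 5
  write(*,'(F4.1,A,F5.1,A)') c(i), '% confidence: ', &
        5d-1*ceiling(2d0*n*(5d-1 + z(i)*sigma)), ' points for Houdini.'
end do
end program minimum_scores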
In another update I posted the following info:
Code:
Write down the confidence level (in percentage) between 75% and 99.9%:
95
Calculating...
Theoretical minimum score for no regression: 53.5564 %
Theoretical standard deviation in this case: 1.8145 %
This standard deviation is just one standard deviation (roundings included); as 95% confidence is more or less 1.96-sigma confidence, you can see that (53.5564 - 50)/1.8145 ~ 1.96... But I should print 3.5564% directly instead of 1.8145% to avoid confusion, just as I do in Elo_uncertainties_calculator, where I found my latest rounding error (my bad). So I go for a very short fix: only two characters (adding k*)!
Code:
1d-4*nint(1d6*sigma(5),KIND=3) ! The one that can bring confusion.
1d-4*nint(1d6*k*sigma(5),KIND=3) ! The best choice.
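With the numbers of that update (sigma = 1.8145% and k ~ 1.96 for 95% confidence), the change in the printed output is simply this (I use a scalar sigma here instead of my sigma(5), just for the illustration):
Code:
program k_sigma_check
implicit none
integer, parameter :: dp = kind(1d0)
real(dp) :: k, sigma

k = 1.96d0           ! ~ z value of 95% confidence.
sigma = 1.8145d-2    ! the standard deviation of that update.

write(*,'(F6.4)') 1d-4*nint(1d6*sigma)     ! prints 1.8145 (the one that can bring confusion).
write(*,'(F6.4)') 1d-4*nint(1d6*k*sigma)   ! prints 3.5564 (the best choice).
end program k_sigma_check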
Thank you very much for this match. I'll stay tuned for the next update!
Regards from Spain.
Ajedrecista.