George Speight

Joined: 10 Mar 2006 Posts: 4636
|
Posted: Wed Jun 13, 2012 3:10 am Post subject: TUESDAY UPDATE- 40x(2) v Houdini 2.0c! |
|
|
Houdini 2.0c x64 v Engine 40x(2) - UPDATE 6
This update takes us through game 606, meaning 83 games have been added since yesterday's update. Houdini increased its lead by 5 games, to +50 games.
Since I could not access the databases, and it would have been 3 crosstables anyway, I went ahead and computed the Elo difference myself.
| Code: |
Houdini 2.0c x64 +29 +188/-138/=280 54.13% 328.0/606
Engine 40x(2) -29 +138/-188/=280 45.87% 278.0/606 |
Interestingly, before I started this match, 606 games and counting ago, I told a couple of people what I expected the Elo difference between these 2 engines to be, with an error margin of plus or minus 5 Elo. With a change of a couple of Elo in a particular direction over the next 394 games, I could easily be dead-on the mark. But that is certainly nothing to brag about; it isn't like this is brain surgery.
Until tomorrow-
george |
|
Jesús Muñoz

Joined: 13 Jul 2011 Posts: 707 Location: Madrid, Spain.
|
Posted: Wed Jun 13, 2012 9:51 am Post subject: Re: TUESDAY UPDATE - 40x(2) vs. Houdini 2.0c! |
|
|
Hi George:
I guess that you expected Houdini to be (+25 ± 5) Elo ahead of 40x(2), that is, between +20 Elo and +30 Elo. If I am right, those Elo advantages correspond to final scores between 529 - 471 and 543 - 457 after 1000 games (Houdini wins, of course). In other words, Houdini should score between 201/394 (~ 51.02%) and 215/394 (~ 54.57%) in the remaining games to be between +20 Elo and +30 Elo ahead after 1000 games... who knows?
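The arithmetic above can be checked with a short sketch. It assumes the usual logistic Elo formula, expected score = 1/(1 + 10^(-D/400)); the function names are hypothetical and only for illustration.

```python
def expected_score(elo_diff):
    """Expected score fraction for a given Elo advantage (logistic model)."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

def points_needed(elo_diff, total_games, points_so_far):
    """Final point total for the given Elo advantage, and points still missing."""
    target = round(expected_score(elo_diff) * total_games)
    return target, target - points_so_far

# Houdini has 328 points after 606 games; 394 games remain out of 1000.
for elo in (20, 30):
    target, needed = points_needed(elo, 1000, 328)
    print(f"+{elo} Elo: {target} - {1000 - target} overall, "
          f"{needed}/394 ({100 * needed / 394:.2f}%) in the remaining games")
```

The final totals are rounded to whole points here, which is why the remaining-game percentages are only approximate.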
Here are my error bars for this match update:
| Code: |
Elo_uncertainties_calculator, ® 2012.
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------
(The input and output data is referred to the first engine).
Please write down non-negative integers.
Write down the number of wins:
188
Write down the number of loses:
138
Write down the number of draws:
280
***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************
---------------------------------------
Elo interval for 1-sigma confidence:
Elo rating difference: 28.73 Elo
Lower rating difference: 18.40 Elo
Upper rating difference: 39.12 Elo
Lower bound uncertainty: -10.33 Elo
Upper bound uncertainty: 10.39 Elo
Average error: +/- 10.36 Elo
K = (average error)*[sqrt(n)] = 255.02
Elo interval: ] 18.40, 39.12[
---------------------------------------
Elo interval for 2-sigma confidence:
Elo rating difference: 28.73 Elo
Lower rating difference: 8.10 Elo
Upper rating difference: 49.57 Elo
Lower bound uncertainty: -20.64 Elo
Upper bound uncertainty: 20.84 Elo
Average error: +/- 20.74 Elo
K = (average error)*[sqrt(n)] = 510.51
Elo interval: ] 8.10, 49.57[
---------------------------------------
Elo interval for 3-sigma confidence:
Elo rating difference: 28.73 Elo
Lower rating difference: -2.19 Elo
Upper rating difference: 60.12 Elo
Lower bound uncertainty: -30.92 Elo
Upper bound uncertainty: 31.39 Elo
Average error: +/- 31.15 Elo
K = (average error)*[sqrt(n)] = 766.93
Elo interval: ] -2.19, 60.12[
---------------------------------------
Number of games of the match: 606
Score: 54.13 %
Elo rating difference: 28.73 Elo
Draw ratio: 46.20 %
**********************************************
1 sigma: 1.4803 % of the points of the match.
2 sigma: 2.9605 % of the points of the match.
3 sigma: 4.4408 % of the points of the match.
**********************************************
End of the calculations.
Thanks for using Elo_uncertainties_calculator. Press Enter to exit.
|
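For readers who want to check these error bars, here is a minimal sketch in Python. It is not the Fortran programme itself; it assumes a normal approximation in which each game counts as 1, 0.5 or 0 points, the standard deviation of the mean score is taken from the observed results, and the score bounds are converted to Elo with the logistic formula. The function names are hypothetical.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, losses, draws, k=1.0):
    """Elo estimate and k-sigma interval under a normal approximation."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins + 0.25 * draws) / n - mu * mu
    sigma = math.sqrt(var / n)  # standard deviation of the mean score
    return (elo_from_score(mu),
            elo_from_score(mu - k * sigma),
            elo_from_score(mu + k * sigma),
            sigma)

for k in (1, 2, 3):
    elo, low, high, sigma = elo_interval(188, 138, 280, k)
    print(f"{k}-sigma: {elo:.2f} Elo, interval ]{low:.2f}, {high:.2f}[, "
          f"{k} sigma = {100 * k * sigma:.4f}% of the points")
```

Under these assumptions the sketch agrees with the figures printed above to within rounding, which suggests (but does not prove) that the programme works along similar lines.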
I have just refined this programme in the last part of the code (2 sigma ~ 2.9605%, etc.) because I realized that:
| Code: |
2d-4*nint(1d6*sigma,KIND=3)
3d-4*nint(1d6*sigma,KIND=3) |
Is not the same as:
| Code: |
1d-4*nint(2d6*sigma,KIND=3)
1d-4*nint(3d6*sigma,KIND=3) |
The correct version is the second code box (it is now in the code of my programme); the first one rounds badly (one more rounding error!), which is why I noticed some strange little discrepancies. I think that everything is finally OK.
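The difference between the two code boxes can be illustrated with a small Python sketch. Rounding sigma first and then multiplying by 2 or 3 also multiplies the rounding error, while multiplying first and rounding once does not. The helper `nint` below mimics Fortran's NINT for non-negative values, and sigma is recomputed from the match result under the assumption that it is the standard deviation of the mean score; both are assumptions made for illustration.

```python
import math

def nint(x):
    """Nearest integer for non-negative x, halves rounded up (like Fortran NINT)."""
    return math.floor(x + 0.5)

wins, losses, draws = 188, 138, 280
n = wins + losses + draws
mu = (wins + 0.5 * draws) / n
# Assumed definition: standard deviation of the mean score.
sigma = math.sqrt(((wins + 0.25 * draws) / n - mu * mu) / n)

for k in (2, 3):
    round_then_scale = k * 1e-4 * nint(1e6 * sigma)  # first code box
    scale_then_round = 1e-4 * nint(k * 1e6 * sigma)  # second code box
    print(f"{k} sigma: {round_then_scale:.4f}% vs {scale_then_round:.4f}%")
```

For this match the two variants differ in the last printed digit, and the second one matches the 2.9605% and 4.4408% shown in the output above.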
Regarding the minimum score needed to avoid a negative Elo gain at a given confidence level, this is what I get for this update (using my imperfect model):
| Code: |
90% confidence: 318 points for Houdini.
95% confidence: 321 points for Houdini.
98% confidence: 324 points for Houdini.
99% confidence: 326.5 points for Houdini.
99.5% confidence: 328.5 points for Houdini. |
So, Houdini is better with more than 99% confidence and less than 99.5% confidence after these 606 games! I guess that Houdini is too much Houdini...
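The post does not spell out the model behind these thresholds, so the following is only a sketch of how such numbers could be obtained. It assumes a two-sided normal quantile, a standard deviation estimated from the observed results of this match, and rounding up to the next half point; the real programme may well do something different. The function name `min_points` is hypothetical.

```python
import math
from statistics import NormalDist

def min_points(wins, losses, draws, confidence):
    """Smallest half-point total above 50% by the assumed two-sided margin."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Assumption: sigma is taken from the observed results, not from the threshold.
    sigma = math.sqrt(((wins + 0.25 * draws) / n - mu * mu) / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # two-sided quantile
    raw = n * (0.5 + z * sigma)
    return math.ceil(2.0 * raw) / 2.0  # scores move in half-point steps

for c in (0.90, 0.95, 0.98, 0.99, 0.995):
    print(f"{100 * c:g}% confidence: {min_points(188, 138, 280, c):g} points")
```

With these particular assumptions the sketch gives the same five thresholds as the table above, so Houdini's 328 points fall between the 99% and 99.5% lines.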
In another update I posted the following info:
| Code: |
Write down the confidence level (in percentage) between 75% and 99.9%:
95
Calculating...
Theoretical minimum score for no regression: 53.5564 %
Theoretical standard deviation in this case: 1.8145 % |
The figure printed there is just one standard deviation (roundings included). Since 95% confidence is more or less 1.96-sigma confidence, you can see that (53.5564 - 50)/1.8145 ~ 1.96. To avoid confusion I should directly print 3.5564% instead of 1.8145%, just as I do in Elo_uncertainties_calculator, where I found my last rounding error (my bad). So I am going for a very short fix of only two characters (add k*)!
| Code: |
1d-4*nint(1d6*sigma(5),KIND=3) ! The one that can bring confusion.
1d-4*nint(1d6*k*sigma(5),KIND=3) ! The best choice. |
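The relation between the two printed values can be checked in a few lines of Python. This sketch assumes that k is the two-sided normal quantile for the chosen confidence level; the variable names are hypothetical.

```python
from statistics import NormalDist

confidence = 0.95
sigma_percent = 1.8145  # one standard deviation, as printed by the programme
k = NormalDist().inv_cdf(0.5 + confidence / 2.0)  # about 1.96 for 95%

# k * sigma is the margin above 50% that the fixed line would print.
print(f"k = {k:.4f}, k*sigma = {k * sigma_percent:.4f}%")
```

The product is the 3.5564% margin, that is, the distance between the 53.5564% minimum score and 50%.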
Thank you very much for this match. I will stay tuned for the next update!
Regards from Spain.
Ajedrecista. |
|
George Speight

Joined: 10 Mar 2006 Posts: 4636
|
Posted: Wed Jun 13, 2012 9:45 pm Post subject: Re: TUESDAY UPDATE - 40x(2) vs. Houdini 2.0c! |
|
|
Thank you for your work. I won't pretend I understand your methods, but knowing your ability I am quite sure they are accurate. I really do appreciate your interest. All I really said before the match was that, from what I had seen of 40x, I felt 40x(2), its upgrade, would be 32 Elo weaker than Houdini, with an error bar of plus or minus 5 Elo. From my point of view, the biggest question was exactly how much of an improvement 40x(2) would be over 40x. If I turn out to be correct, my observations could easily have been just a lucky guess.
I am not sure whether there will be a Wed. update here. Windows wanted to restart to install its updates, and even though those security updates can be very important at times, I put it off about as long as I could. I finally clicked on "hold off for 4 more hours" and went to bed. Naturally, when I got back no games were playing. I haven't yet checked how many have been played since the 606-game mark, but quite likely not enough for an update. I shall see. And again, thank you.
Best,
george |
|