Hi George!
geots wrote:
Houdini 2.0c x64 vs Rainbow 1.0 beta
I thought there was a possibility that if I gave Rainbow another generic book and a gui other than Fritz- it might possibly make a difference. But at the 240 game mark- it was obvious to me that Houdini was not going to be caught in this match. Core time is valuable to me- so I called the coroner:
Intel i5 w/4TCs
Shredder 11 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
HS-Book 2.0 bkt. w/12-move limit
10'+10"
Match=240 game stoppage*
Code:
Houdini 2.0c x64 +26 +71/-53/=116 53.75% 129.0/240
Rainbow 1.0 beta -26 +53/-71/=116 46.25% 111.0/240
So goes this match- now to the next one:
Houdini 2.0c x64 vs Rainbow Limited- beta 2
Now this was the match where I supposedly put it all on the line. And to be honest- I was mistaken. I thought "Limited- beta 2" had a better than even chance of taking Houdini 2.0c. I was wrong- another call to the coroner:
Intel i5 w/4TCs
Fritz 13 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit
5'+5"
Match=330 game stoppage*
Code:
Houdini 2.0c x64 +28 +101/-74/=155 54.09% 178.5/330
Rainbow Limited- beta 2 -28 +74/-101/=155 45.91% 151.5/330
And if you will notice, after 240 games and 330 games- the Elo differences in both matches are pretty much the same.
So this brings us up to date- and here is where hopefully there may be a surprise. Notice I said "may". Do not misunderstand me- my life does not hinge on beating Houdini. Rather I want to provide you with the most exciting match possible. And what could be more exciting than an opponent taking Houdini right down to the wire in a 300 or 400 game match. I would consider an opponent that could "just keep it very interesting" a success.
Which brings us to the next subject. Stay tuned- don't leave us yet!!
george
Thank you very much for your efforts. Good try by Rainbow! But Houdini is still the king. I ran LOS_and_Elo_uncertainties_calculator and obtained these results:
For Rainbow 1.0 beta, after 240 games:
Code:
LOS_and_Elo_uncertainties_calculator, ® 2012.
----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------
(The input and output data is referred to the first engine).
Please write down non-negative integers.
Write down the number of wins:
53
Write down the number of loses:
71
Write down the number of draws:
116
Write down the clock rate of the CPU (in GHz), only for timing the elapsed time
of the calculations:
3
(Only 1, 2 and 3-sigma confidence error bars are calculated, if possible).
***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************
---------------------------------------
Elo interval for 1-sigma confidence:
Elo rating difference: -26.11 Elo
Lower rating difference: -42.30 Elo
Upper rating difference: -10.03 Elo
Lower bound uncertainty: -16.19 Elo
Upper bound uncertainty: 16.08 Elo
Average error: +/- 16.13 Elo
K = (average error)*[sqrt(n)] = 249.96
Elo interval: ] -42.30, -10.03[
---------------------------------------
Elo interval for 2-sigma confidence:
Elo rating difference: -26.11 Elo
Lower rating difference: -58.67 Elo
Upper rating difference: 6.01 Elo
Lower bound uncertainty: -32.57 Elo
Upper bound uncertainty: 32.11 Elo
Average error: +/- 32.34 Elo
K = (average error)*[sqrt(n)] = 501.02
Elo interval: ] -58.67, 6.01[
---------------------------------------
Elo interval for 3-sigma confidence:
Elo rating difference: -26.11 Elo
Lower rating difference: -75.31 Elo
Upper rating difference: 22.07 Elo
Lower bound uncertainty: -49.21 Elo
Upper bound uncertainty: 48.18 Elo
Average error: +/- 48.69 Elo
K = (average error)*[sqrt(n)] = 754.31
Elo interval: ] -75.31, 22.07[
---------------------------------------
Number of games of the match: 240
Score: 46.25 %
Elo rating difference: -26.11 Elo
Draw ratio: 48.33 %
**********************************************
1 sigma: 2.3072 % of the points of the match.
2 sigma: 4.6145 % of the points of the match.
3 sigma: 6.9217 % of the points of the match.
**********************************************
Error bars were calculated with two-sided tests; values are rounded up to 0.01
Elo, or 0.01 in the case of K.
-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------
LOS: 5.20 %
This value of LOS is rounded up to 0.01%
End of the calculations. Approximated elapsed time: 49 ms.
Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
More or less -26 ± 32 Elo with ~ 95.45% confidence; the LOS value calculated by my programme is ~ 5.2%. Using the model for LOS that does not count draws, proposed by Rémi Coulom in this post (the last equation), I get a LOS value of ~ 5.35% using Derive 6 (I used Rémi's method to check the validity of my calculation).
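For readers who want to check these figures by hand, here is a minimal Python sketch. It is not the source of LOS_and_Elo_uncertainties_calculator; it assumes the usual logistic Elo formula, a standard error taken from the observed win/draw/loss frequencies, and a normal approximation for LOS. The function names `elo_from_score` and `match_stats` are made up for this illustration.

```python
import math

def elo_from_score(s):
    """Logistic Elo difference for a score fraction s, with 0 < s < 1."""
    return 400.0 * math.log10(s / (1.0 - s))

def match_stats(wins, losses, draws, k=2.0):
    """Return (elo, lower, upper, los) from the first engine's point of view.

    k is the number of sigmas of the two-sided interval (assumed normal).
    """
    n = wins + losses + draws
    s = (wins + 0.5 * draws) / n                      # score fraction
    # Per-game variance of the results (1, 0.5, 0) around the mean score.
    var = (wins * (1.0 - s) ** 2
           + losses * s ** 2
           + draws * (0.5 - s) ** 2) / n
    sigma = math.sqrt(var / n)                        # standard error of s
    elo = elo_from_score(s)
    lower = elo_from_score(s - k * sigma)             # needs 0 < s - k*sigma
    upper = elo_from_score(s + k * sigma)             # needs s + k*sigma < 1
    # One-sided likelihood of superiority under the normal approximation.
    los = 0.5 * (1.0 + math.erf((s - 0.5) / (sigma * math.sqrt(2.0))))
    return elo, lower, upper, los

if __name__ == "__main__":
    elo, lower, upper, los = match_stats(53, 71, 116)
    print(f"Elo: {elo:.2f}, 2-sigma interval: ]{lower:.2f}, {upper:.2f}[, "
          f"LOS: {100 * los:.2f} %")
```

With +53 -71 =116 the numbers should come out close to the -26 ± 32 Elo and LOS ~ 5.2% quoted above; small differences in the last digit can come from rounding.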
------------------------
For Rainbow Limited - beta 2, after 330 games:
Code:
LOS_and_Elo_uncertainties_calculator, ® 2012.
----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------
(The input and output data is referred to the first engine).
Please write down non-negative integers.
Write down the number of wins:
74
Write down the number of loses:
101
Write down the number of draws:
155
Write down the clock rate of the CPU (in GHz), only for timing the elapsed time
of the calculations:
3
(Only 1, 2 and 3-sigma confidence error bars are calculated, if possible).
***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************
---------------------------------------
Elo interval for 1-sigma confidence:
Elo rating difference: -28.49 Elo
Lower rating difference: -42.48 Elo
Upper rating difference: -14.60 Elo
Lower bound uncertainty: -13.99 Elo
Upper bound uncertainty: 13.89 Elo
Average error: +/- 13.94 Elo
K = (average error)*[sqrt(n)] = 253.24
Elo interval: ] -42.48, -14.60[
---------------------------------------
Elo interval for 2-sigma confidence:
Elo rating difference: -28.49 Elo
Lower rating difference: -56.60 Elo
Upper rating difference: -0.75 Elo
Lower bound uncertainty: -28.11 Elo
Upper bound uncertainty: 27.74 Elo
Average error: +/- 27.93 Elo
K = (average error)*[sqrt(n)] = 507.31
Elo interval: ] -56.60, -0.75[
---------------------------------------
Elo interval for 3-sigma confidence:
Elo rating difference: -28.49 Elo
Lower rating difference: -70.91 Elo
Upper rating difference: 13.10 Elo
Lower bound uncertainty: -42.42 Elo
Upper bound uncertainty: 41.59 Elo
Average error: +/- 42.01 Elo
K = (average error)*[sqrt(n)] = 763.08
Elo interval: ] -70.91, 13.10[
---------------------------------------
Number of games of the match: 330
Score: 45.91 %
Elo rating difference: -28.49 Elo
Draw ratio: 46.97 %
**********************************************
1 sigma: 1.9917 % of the points of the match.
2 sigma: 3.9833 % of the points of the match.
3 sigma: 5.9750 % of the points of the match.
**********************************************
Error bars were calculated with two-sided tests; values are rounded up to 0.01
Elo, or 0.01 in the case of K.
-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------
LOS: 2.00 %
This value of LOS is rounded up to 0.01%
End of the calculations. Approximated elapsed time: 51 ms.
Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
More or less -28 ± 28 Elo with ~ 95.45% confidence; the LOS value calculated by my programme is ~ 2%. Using Rémi's method again, I get a LOS value of ~ 2.08%. IMHO both models give similar results.
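The draw-free LOS can also be checked without a computer algebra system. The sketch below assumes that the draw-free model amounts to a uniform prior on the probability of winning a decisive game, so that LOS is the posterior probability that this probability exceeds 0.5; I have not copied the equation from the linked post, so treat that reading as an assumption. For integer counts the integral reduces to an exact binomial sum. The name `los_without_draws` is made up for this illustration.

```python
from math import comb

def los_without_draws(wins, losses):
    """P(p > 0.5) for p ~ Beta(wins + 1, losses + 1); draws are ignored.

    For integer parameters the Beta tail equals a binomial tail:
    P(p > 0.5) = P(X <= wins) with X ~ Binomial(wins + losses + 1, 0.5).
    """
    n = wins + losses + 1
    # Exact integer arithmetic; one division at the end.
    return sum(comb(n, j) for j in range(wins + 1)) / 2 ** n

if __name__ == "__main__":
    for name, w, l in (("Rainbow 1.0 beta", 53, 71),
                       ("Rainbow Limited - beta 2", 74, 101)):
        print(f"{name}: LOS = {100 * los_without_draws(w, l):.2f} %")
```

Under that assumption the two values should land near the ~ 5.35% and ~ 2.08% obtained with Derive 6, which is why both models give similar results here: the matches have enough decisive games for the normal approximation to be reasonable.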
------------------------
I wish you very good luck with your new project. I hope that more people can help you! I also wish good luck to the programmers. I will stay tuned to your posts.
Regards from Spain.
Ajedrecista.