Hello!
geots wrote: Houdini 2.0c x64 vs Rainbow 1.0 beta
At the 200-game mark, Houdini has increased its lead by 3 games. Interestingly enough, the Elo difference remains the same. So either Houdini has increased its advantage or beta 1 is still holding steady, whichever way of looking at it floats your boat.
Intel i5 w/4TCs
Shredder 11 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
HS-Book 2.0 bkt. w/12-move limit
10'+10"
Match=500 games
[after 200 games]
Engine             Elo   +W/-L/=D       Score    Points
Houdini 2.0c x64   +33   +59/-40/=101   54.75%   109.5/200
Rainbow 1.0 beta   -33   +40/-59/=101   45.25%   90.5/200
Now, I am not quite sure the next update will take us to the halfway mark. Maybe, maybe not. So there is way more than enough time left for beta 1 to make a couple of nice runs. OTOH, making up 19 games against Houdini can be problematic.
Hopefully Jesus will stop by and give us some insight into possibilities.
In search of another match update-
george
Rainbow is getting into a bit of trouble... another way of saying the same thing is that Houdini is too much Houdini! And the new Houdini 3 is getting closer, and it will not be worse than Houdini 2.0c... Here are my results (remember to take them with a lot of care!):
LOS_and_Elo_uncertainties_calculator, ® 2012.
----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------
(The input and output data is referred to the first engine).
Please write down non-negative integers.
Write down the number of wins:
59
Write down the number of loses:
40
Write down the number of draws:
101
Write down the clock rate of the CPU (in GHz), only for timing the elapsed time
of the calculations:
3
(Only 1, 2 and 3-sigma confidence error bars are calculated, if possible).
***************************************
1-sigma confidence ~ 68.27% confidence.
2-sigma confidence ~ 95.45% confidence.
3-sigma confidence ~ 99.73% confidence.
***************************************
---------------------------------------
Elo interval for 1-sigma confidence:
Elo rating difference: 33.11 Elo
Lower rating difference: 15.89 Elo
Upper rating difference: 50.49 Elo
Lower bound uncertainty: -17.22 Elo
Upper bound uncertainty: 17.38 Elo
Average error: +/- 17.30 Elo
K = (average error)*[sqrt(n)] = 244.62
Elo interval: ] 15.89, 50.49[
---------------------------------------
Elo interval for 2-sigma confidence:
Elo rating difference: 33.11 Elo
Lower rating difference: -1.25 Elo
Upper rating difference: 68.12 Elo
Lower bound uncertainty: -34.35 Elo
Upper bound uncertainty: 35.01 Elo
Average error: +/- 34.68 Elo
K = (average error)*[sqrt(n)] = 490.49
Elo interval: ] -1.25, 68.12[
---------------------------------------
Elo interval for 3-sigma confidence:
Elo rating difference: 33.11 Elo
Lower rating difference: -18.39 Elo
Upper rating difference: 86.11 Elo
Lower bound uncertainty: -51.50 Elo
Upper bound uncertainty: 53.00 Elo
Average error: +/- 52.25 Elo
K = (average error)*[sqrt(n)] = 738.90
Elo interval: ] -18.39, 86.11[
---------------------------------------
Number of games of the match: 200
Score: 54.75 %
Elo rating difference: 33.11 Elo
Draw ratio: 50.50 %
**********************************************
1 sigma: 2.4647 % of the points of the match.
2 sigma: 4.9294 % of the points of the match.
3 sigma: 7.3941 % of the points of the match.
**********************************************
Error bars were calculated with two-sided tests; values are rounded up to 0.01
Elo, or 0.01 in the case of K.
-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------
LOS: 97.30 %
This value of LOS is rounded up to 0.01%
End of the calculations. Approximated elapsed time: 56 ms.
Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
An LOS value of ~97.3% means that Houdini would fail to be better than Rainbow in only about 1 in 37 cases! So Houdini is very likely to be better than Rainbow, although the match is only 40% complete (200 of 500 games). A very difficult task for Rainbow...
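For anyone who wants to check the LOS figure by hand, here is a minimal Python sketch. It is not the calculator's source code: it assumes a normal approximation in which the standard error of the score comes from the per-game win/draw/loss variance, and the function name `likelihood_of_superiority` is just an illustrative choice. Under those assumptions it lands very close to the 97.30 % shown in the output.

```python
import math

def likelihood_of_superiority(wins, losses, draws):
    """One-sided LOS under a normal approximation (an assumed model)."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n  # score fraction of the first engine
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - mu) ** 2
           + draws * (0.5 - mu) ** 2
           + losses * (0.0 - mu) ** 2) / n
    sigma = math.sqrt(var / n)     # standard error of the score
    z = (mu - 0.5) / sigma         # distance from an even score, in sigmas
    # Standard normal CDF evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

los = likelihood_of_superiority(59, 40, 101)
print(f"LOS = {100 * los:.2f} %")
print(f"Houdini not better in about 1 of {1 / (1 - los):.0f} cases")
```

The second line of output is where the "1 in 37" figure comes from: it is simply 1/(1 - LOS).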
Regarding error bars, Houdini has the lead with ~ +33 ± 35 Elo (with 2-sigma confidence), so it is still a little early to make predictions, but Houdini clearly has the edge in my inexperienced POV. I take this occasion to wish good luck to all the programmers, testers, opening book makers...
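The +33 ± 35 Elo figure can be reproduced in a similar way. The sketch below is again only an assumption about how such an interval may be computed, not the calculator's actual code: it takes the score plus or minus k standard errors and converts each bound to Elo with the usual logistic formula. The names `elo_from_score` and `elo_interval` are hypothetical.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, losses, draws, k_sigma):
    """Return (elo, lower, upper) for a k-sigma band around the score."""
    n = wins + losses + draws
    mu = (wins + 0.5 * draws) / n
    # Same per-game variance model as assumed for the LOS estimate.
    var = (wins * (1.0 - mu) ** 2
           + draws * (0.5 - mu) ** 2
           + losses * mu ** 2) / n
    sigma = math.sqrt(var / n)
    lower = elo_from_score(mu - k_sigma * sigma)
    upper = elo_from_score(mu + k_sigma * sigma)
    return elo_from_score(mu), lower, upper

for k in (1, 2, 3):
    elo, lower, upper = elo_interval(59, 40, 101, k)
    print(f"{k}-sigma: {elo:+.2f} Elo, interval ]{lower:.2f}, {upper:.2f}[")
```

The interval is slightly asymmetric because the score-to-Elo conversion is not linear, which is why the output lists separate lower and upper uncertainties. The 2-sigma lower bound dips just below zero, so at that confidence level a tiny Rainbow lead is still not excluded.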
Regards from Spain.
Ajedrecista.