Hello Uri:
Uri Blass wrote:
geots wrote:
Engine 40x(2) vs Houdini 2.0c x64 - 2nd UPDATE
The following update took place on an odd number of games, as the computer restarted itself, also giving me the opportunity to grab the games played so far from the database. Running at a control of 10'+10", time losses are more frequent than in repeating controls, but after checking all the games, they were all in order and fine.
Intel i5 w/4TCs
Fritz 11 gui
1CPU/64bit
128MB hash
Bases=NONE
Ponder_Learning=OFF
Perfect 12.32 book w/12-move limit
10'+10"
Match=1000 games
1 Houdini 2.0c x64 +48/-30/=63 56.00% 79.5/141
2 Engine 40x(2) +30/-48/=63 44.00% 61.5/141
I guess I may never find out why I received 40x and then an update to 40x- 40x(2), and will very likely never get a chance to see their "big daddy"- the main engine.
But this match is not over quite yet.
george
The result seems convincing.
48-30 not including draws
If I flip a coin 78 times the standard deviation is sqrt(19)<4.5
so Houdini scored more than 2 standard deviation above 50% that is 39.
I am almost sure that houdini is stronger.
Uri
I agree with you; I think the standard deviation is sqrt(78 × 0.25) = sqrt(19.5) rather than sqrt(19), although the difference is small.
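The coin-flip arithmetic can be checked with a minimal Python sketch. The variable names are mine, and it deliberately ignores the 63 draws, as Uri's estimate does:

```python
import math

# Decisive games only: 48 Houdini wins and 30 losses, draws ignored.
wins, losses = 48, 30
flips = wins + losses                 # 78 "coin flips"
expected = flips / 2                  # 39 wins expected from a fair coin

# Standard deviation of the number of heads in n fair flips: sqrt(n * 0.5 * 0.5).
sd = math.sqrt(flips * 0.25)          # sqrt(19.5)

# How many standard deviations Houdini's win count lies above 39.
z = (wins - expected) / sd

print(f"sd = {sd:.3f}, z = {z:.3f}")
```

This should print a standard deviation of about 4.42 and a z value slightly above 2.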
Houdini's 48 wins are then a little more than two standard deviations above 39 in the case of the coin. Regarding this match: draws are not meaningless in the model I use. The result I get is very curious:
Minimum_score_for_no_regression, ® 2012.
Calculation of the minimum score for no regression in a match between two engines:
Write down the number of games of the match (it must be a positive integer, up to 1073741823):
141
Write down the draw ratio (in percentage):
44.68085106
Write down k (for making confidence intervals of (mu) +/- (k*sigma) in a normal distribution); k must be positive:
1.96
Theoretical minimum score for no regression: 56.0564 %
Minimum number of won points for the engine in this match: 79.5 points.
Minimum Elo advantage, which is also the negative part of the error bar:
44.5968 Elo
End of the calculations.
Thanks for using Minimum_score_for_no_regression. Press Enter to exit.
I get a minimum of 79.5 points with 1.96-sigma confidence (~ 95% confidence) to reach a conclusion, exactly the number of points that Houdini has scored! What a coincidence... life is full of them.
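For readers who want to experiment with these numbers, here is a rough Python sketch. It is not the source of Minimum_score_for_no_regression; the formula is an assumption on my side, chosen because it appears to agree with the printed output. It takes the per-game variance as mu*(1 - mu) - D/4 (D = draw ratio), sets mu - 0.5 = k*sigma with sigma = sqrt(variance/n), and solves for mu. The function names are hypothetical.

```python
import math

def minimum_score(games, draw_ratio, k):
    # Assumed model: solve mu - 0.5 = k * sqrt((mu*(1 - mu) - D/4) / n) for mu.
    # The closed form is mu = 0.5 + k * sqrt((1 - D) / (4 * (n + k^2))).
    return 0.5 + k * math.sqrt((1.0 - draw_ratio) / (4.0 * (games + k * k)))

def minimum_points(games, score):
    # Round up to the next half point, since results come in steps of 0.5.
    return math.ceil(2 * games * score) / 2

def elo_from_score(score):
    # Usual logistic Elo relation between expected score and rating difference.
    return 400.0 * math.log10(score / (1.0 - score))

games = 141
draw_ratio = 63 / 141          # 44.68085106 %
k = 1.96

score = minimum_score(games, draw_ratio, k)
points = minimum_points(games, score)
elo = elo_from_score(points / games)   # assumption: Elo taken from the rounded points

print(f"Minimum score:  {100 * score:.4f} %")
print(f"Minimum points: {points}")
print(f"Minimum Elo:    {elo:.4f}")
```

With the inputs of this match it should give roughly 56.06 %, 79.5 points and about 44.6 Elo, in line with the output above. Note that the Elo figure only agrees if it is computed from the rounded 79.5/141 rather than from the theoretical score, which is again my guess.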
@George: thank you very much for this great test. This Engine 40x(2) is a mystery: single-core or multi-core (I know that the test is running on a single core), whether or not it can use EGTB and/or bitbases (I know that the engines use no bases in this match), the typical depths and speeds that Engine 40x(2) reaches on your computer... What a pity if Engine 40x(2) ends up remaining private. Please keep up the good work!
Regards from Spain.
Ajedrecista.