I am testing Stockfish vs. Stockfish in the hope of answering the question: what is the best setting for Stockfish? Do 8 logical cores really outperform 4 real cores when Stockfish is playing Stockfish, as they do against other programs? Is Stockfish better because of better MP?
I will run this test for an open-ended number of games, until one setting gets outside the error bars of the other setting at 99.7% and shows clear superiority.
i7 840 cpu 4 cores
512MB hash
5 stone TB
GM book to 8 moves, same book for both stockfish versions.
Stockfish 070114 (a) 8CPU Idle sleeping threads enabled.
Stockfish 070114 4CPU Idle sleeping threads enabled.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
ernest wrote: Whether you like it or not, I find this test interesting!
That is ok, I find it interesting also.
There is no problem between us.
Best setting for Stockfish on an i7 4-core system: all of my testing indicates that Stockfish can use logical cores successfully, with measurable gains up to 8 threads.
I have never understood how the Fritz GUI arrives at such indications!
This 99.7%->[ +6, +83] or 3SD error-bar is completely skewed with respect to its center, which is +27.
Actually, my calculation
from +55/=235/-30 53.91% 172.5/320
is:
3.91 x 7 = +27 Elo indeed
and SD = [sqrt (55+30)]/2/320 = 1.44% or 10 Elo
So for me, the 3SD error-bar is: [-3, +57] of course symmetric with respect to +27
Am I wrong?
Note: the approximations used in my calculation are valid because the score is not far from 50%
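The back-of-the-envelope calculation above can be sketched in a few lines of Python. This is only an illustration of the near-50% approximation described in the post: the function name and the `elo_per_percent` parameter are hypothetical, and the factor 7 is the rounded slope of the logistic Elo curve at a 50% score (about 6.95 Elo per percentage point).

```python
import math

def approx_elo_and_sd(wins, draws, losses, elo_per_percent=7.0):
    """Near-50% approximation: Elo ~ 7 x (score% - 50), SD ~ sqrt(W + L) / 2 / N."""
    games = wins + draws + losses
    score_pct = 100.0 * (wins + 0.5 * draws) / games
    elo = elo_per_percent * (score_pct - 50.0)
    # Standard deviation of the match score, in percent, assuming a 50% score
    sd_pct = 100.0 * math.sqrt(wins + losses) / 2.0 / games
    sd_elo = elo_per_percent * sd_pct
    return score_pct, elo, sd_pct, sd_elo

score_pct, elo, sd_pct, sd_elo = approx_elo_and_sd(55, 235, 30)
low, high = elo - 3.0 * sd_elo, elo + 3.0 * sd_elo
print(f"score {score_pct:.2f}%  Elo {elo:+.1f}  SD {sd_pct:.2f}% ({sd_elo:.1f} Elo)")
print(f"3SD interval: [{low:+.1f}, {high:+.1f}]")
```

Without intermediate rounding this gives roughly +27.3 Elo with an SD of about 10.1 Elo; rounding to +27 and 10 before taking three SDs gives the [-3, +57] quoted above.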
I have also never understood how the ChessBase GUI reaches those results... it probably does not use a normal distribution but some other one. I get the following result with my own tool:
LOS_and_Elo_uncertainties_calculator, ® 2012-2013.
----------------------------------------------------------------
Calculation of Elo uncertainties in a match between two engines:
----------------------------------------------------------------
(The input and output data is referred to the first engine).
Please write down non-negative integers.
Maximum number of games supported: 2147483647.
Write down the number of wins (up to 1825361100):
55
Write down the number of loses (up to 1825361100):
30
Write down the number of draws (up to 2147483562):
235
Write down the confidence level (in percentage) between 65% and 99.9% (it will be rounded up to 0.01%):
99.73
Write down the clock rate of the CPU (in GHz), only for timing the elapsed time of the calculations:
3
---------------------------------------
Elo interval for 99.73 % confidence:
Elo rating difference: 27.20 Elo
Lower rating difference: -2.59 Elo
Upper rating difference: 57.39 Elo
Lower bound uncertainty: -29.78 Elo
Upper bound uncertainty: 30.19 Elo
Average error: +/- 29.99 Elo
K = (average error)*[sqrt(n)] = 536.43
Elo interval: ] -2.59, 57.39[
---------------------------------------
Number of games of the match: 320
Score: 53.91 %
Elo rating difference: 27.20 Elo
Draw ratio: 73.44 %
************************************************************************
Sample standard deviation: 1.4261 % of the points of the match.
3.0000 sample standard deviations: 4.2784 % of the points of the match.
(Corresponding to 99.73 % confidence).
************************************************************************
Error bars were calculated with two-sided tests; values are rounded up to 0.01 Elo, or 0.01 in the case of K.
-------------------------------------------------------------------
Calculation of likelihood of superiority (LOS) in a one-sided test:
-------------------------------------------------------------------
LOS (taking into account draws) is always calculated, if possible.
LOS (not taking into account draws) is only calculated if wins + loses < 16001.
LOS (average value) is calculated only when LOS (not taking into account draws) is calculated.
______________________________________________
LOS: 99.69 % (taking into account draws).
LOS: 99.67 % (not taking into account draws).
LOS: 99.68 % (average value).
______________________________________________
These values of LOS are rounded up to 0.01%
End of the calculations. Approximated elapsed time: 97 ms.
Thanks for using LOS_and_Elo_uncertainties_calculator. Press Enter to exit.
That is, circa (+27.2 ± 30) Elo for 3-sigma confidence. I get a little less than your 1.44% of sigma, surely due to a score of near 54%-46% and not 50%-50%. But I agree with your result: if I round my bounds to the closest integers, our bounds match perfectly (-3 and +57).
Regards from Spain.
Ajedrecista.
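For readers who want to check the interval in the output above, here is a minimal Python sketch. It assumes, purely from the numbers matching, that the bounds are the score plus or minus z sample standard deviations converted to Elo with the logistic formula; the tool's actual method is not shown in the thread, and the function names here are hypothetical.

```python
import math

def score_to_elo(score):
    """Logistic Elo difference for an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval(wins, draws, losses, z=3.0):
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Sum of squared deviations of the per-game results (1, 0.5, 0) from the mean
    sum_sq = (wins * (1.0 - score) ** 2
              + draws * (0.5 - score) ** 2
              + losses * score ** 2)
    # Sample variance (n - 1 divisor), then standard deviation of the mean score
    sd = math.sqrt(sum_sq / (n - 1) / n)
    return (score_to_elo(score),
            score_to_elo(score - z * sd),
            score_to_elo(score + z * sd),
            sd)

center, lower, upper, sd = elo_interval(55, 235, 30)
print(f"Elo {center:.2f}, interval ]{lower:.2f}, {upper:.2f}[, SD {100 * sd:.4f}%")
```

Because the logistic conversion is not linear, the interval is slightly asymmetric in Elo even though it is symmetric in score, which is why the lower and upper uncertainties differ a little.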
I noticed you have different LOS values with and without draws. Draws don't affect LOS at all, so your calculation with draws is probably wrong.
The exact value of 1 SD is 1.423907%, so you also have a small error in its calculation.
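Both points can be checked with a short Python sketch. It uses the common normal approximation for LOS, which depends only on wins and losses, and computes the SD with either divisor; the difference between 1.423907% and 1.4261% appears to come down to dividing by n rather than n - 1. The function names are hypothetical, chosen for this illustration.

```python
import math

def sd_of_score(wins, draws, losses, sample=False):
    """Standard deviation of the match score; sample=True uses the n - 1 divisor."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    sum_sq = (wins * (1.0 - score) ** 2
              + draws * (0.5 - score) ** 2
              + losses * score ** 2)
    divisor = (n - 1) if sample else n
    return math.sqrt(sum_sq / divisor / n)

def los_without_draws(wins, losses):
    """Normal approximation to LOS; the number of draws does not enter."""
    z = (wins - losses) / math.sqrt(wins + losses)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

sd_pop = sd_of_score(55, 235, 30)
sd_sample = sd_of_score(55, 235, 30, sample=True)
los = los_without_draws(55, 30)
print(f"SD (n divisor):     {100 * sd_pop:.6f}%")
print(f"SD (n - 1 divisor): {100 * sd_sample:.4f}%")
print(f"LOS: {100 * los:.2f}%")
```

With +55/=235/-30 the n divisor reproduces 1.423907%, the n - 1 divisor reproduces the 1.4261% in the tool's output, and the LOS agrees with the 99.67% reported for the calculation without draws.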