Hello:
I have compared two formulæ for calculating the standard deviation. The first is the one I usually use:
sd = sqrt{(1/n)·[mu·(1 - mu) - D/4]}
The other is one that I found thanks to a recent post by user Ruxy Sylwyka:
http://u.cs.biu.ac.il/~koppel/papers/expertga-oct21.pdf
s = sqrt{[1/(n - 1)]·[W·(1 - mu)² + D·(1/2 - mu)² + L·mu²]}
It can be found on the last page of that 16-page PDF. It also appears here (second post):
http://www.open-aurec.com/wbforum/viewtopic.php?t=949
If we set aside the factors (1/n) and [1/(n - 1)] (which are nearly equal for large n), here is my comparison:
n = number of games
w = number of won games
d = number of drawn games
l = number of lost games
n = w + d + l
W = w/n ; D = d/n ; L = l/n
W + D + L = 1
mu = (w + d/2)/n = W + D/2
1 - mu = (d/2 + l)/n = D/2 + L
Comparison without square roots, (1/n) and [1/(n - 1)]:
W·(1 - mu)² + D·(1/2 - mu)² + L·mu² = mu·(1 - mu) - D/4
W·(1 - 2·mu + mu²) + D·(1/4 - mu + mu²) + L·mu² = mu - mu² - D/4
W - 2·mu·W + mu²·W + D/4 - mu·D + mu²·D + L·mu² = mu - mu² - D/4
mu²·(W + D + L + 1) + mu·(-2·W - D - 1) + W + D/2 = 0
mu²·(1 + 1) + mu·(-2·W - D - 1) + mu = 0
2·mu² - mu·(2W + D) = 0
2·mu² - 2·mu² = 0
If I am not mistaken, the only difference between these two formulæ is the factor (1/n) versus [1/(n - 1)].
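The identity derived above can also be checked numerically. The following is only a small sketch: the function name variance_terms and the sample results are my own choices for illustration, and exact fractions are used so that rounding cannot hide a discrepancy.

```python
from fractions import Fraction

def variance_terms(w, d, l):
    """Return both bracketed expressions (without 1/n or 1/(n - 1)).

    w, d, l are the numbers of won, drawn and lost games.
    The name of this helper is hypothetical, chosen for illustration.
    """
    n = w + d + l
    W, D, L = Fraction(w, n), Fraction(d, n), Fraction(l, n)
    mu = W + D / 2
    # Left-hand side: the expression from the paper, with ratios W, D, L.
    lhs = W * (1 - mu) ** 2 + D * (Fraction(1, 2) - mu) ** 2 + L * mu ** 2
    # Right-hand side: the expression I usually use.
    rhs = mu * (1 - mu) - D / 4
    return lhs, rhs

# Arbitrary sample results (assumed values, not taken from any real match).
for w, d, l in [(40, 30, 30), (10, 0, 5), (1, 98, 1), (7, 13, 29)]:
    lhs, rhs = variance_terms(w, d, l)
    assert lhs == rhs
    print(w, d, l, lhs)
```

Of course, a few test cases do not replace the algebra; they only guard against a slip in the expansion.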
In the paper, W is the number of won games, whereas I have used the win ratio (and likewise for D and L). I did this because the standard deviation obtained with the paper's definition is abnormally large for this purpose, so I suppose it was a small error. An example:
n = 100 (+40 =30 -30)
mu = 0.55
s = sqrt{(1/99)·[40·(0.45)² + 30·(-0.05)² + 30·(0.55)²]} ~ 0.41742 ; (1.96)·s ~ 0.81815
95% confidence ~ 1.96-sigma: mu ± (1.96)·s ~ [-0.26815, 1.36815] (this looks strange to me, since a score must lie between 0 and 1).
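To make the comparison concrete, here is a short sketch that evaluates the example above in both ways: with the game counts, as the paper writes it, and with the ratios, as I do. The function names sd_counts and sd_ratios are hypothetical, introduced only for this illustration.

```python
import math

def sd_counts(w, d, l):
    """Paper's formula taken literally: W, D, L are game counts."""
    n = w + d + l
    mu = (w + d / 2) / n
    total = w * (1 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * mu ** 2
    return math.sqrt(total / (n - 1))

def sd_ratios(w, d, l):
    """My usual formula: sd = sqrt{(1/n)·[mu·(1 - mu) - D/4]}, D = d/n."""
    n = w + d + l
    mu = (w + d / 2) / n
    D = d / n
    return math.sqrt((mu * (1 - mu) - D / 4) / n)

w, d, l = 40, 30, 30          # the +40 =30 -30 example
mu = (w + d / 2) / (w + d + l)
k = 1.96                      # ~95% confidence

for name, sd in [("counts", sd_counts(w, d, l)), ("ratios", sd_ratios(w, d, l))]:
    low, high = mu - k * sd, mu + k * sd
    print(f"{name}: sd = {sd:.5f}, interval = [{low:.5f}, {high:.5f}]")
```

With the counts the value is about 0.41742, as computed by hand above; with the ratios it is about 0.04153, which keeps the interval inside [0, 1]. The two values differ by a factor close to sqrt(n), as expected from replacing counts by ratios.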
-------------------------------------------------
sd = sqrt{(1/n)·[mu·(1 - mu) - D/4]}
rd(+) = 400·log{[mu + k·(sd)]/[1 - mu - k·(sd)]}
rd(-) = 400·log{[mu - k·(sd)]/[1 - mu + k·(sd)]}
e(+) = [rd(+)] - rd > 0
e(-) = [rd(-)] - rd < 0
<e> = ±[|e(+)| + |e(-)|]/2 = ±{[e(+)] - [e(-)]}/2
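These are part of my calculations. Here log is the base-10 logarithm and rd is the rating difference for the score mu, i.e. rd = 400·log[mu/(1 - mu)]. Regarding <e>, it can also be calculated directly in this way (just using the properties of logarithms):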
<e> = ± 200·log{[mu + k·(sd)]·[1 - mu + k·(sd)]/([mu - k·(sd)]·[1 - mu - k·(sd)])}
Where k gives the confidence level (k ~ 1.96 for 95% confidence, k = 2 for ~ 95.45% confidence...). Comments and/or corrections are welcome.
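As an illustration, the formulas above can be put together in a few lines. This is only a sketch under some assumptions: the function name elo_interval is hypothetical, log is taken as base 10, rd is taken as 400·log[mu/(1 - mu)], and mu ± k·sd must stay strictly between 0 and 1 or the logarithms are undefined.

```python
import math

def elo_interval(w, d, l, k=1.96):
    """Rating difference and error bars from won/drawn/lost game counts.

    Assumes base-10 logarithms and rd = 400*log10[mu/(1 - mu)].
    Raises ValueError if mu +/- k*sd leaves the open interval (0, 1).
    """
    n = w + d + l
    mu = (w + d / 2) / n
    D = d / n
    sd = math.sqrt((mu * (1 - mu) - D / 4) / n)
    if not (0 < mu - k * sd and mu + k * sd < 1):
        raise ValueError("mu +/- k*sd must lie strictly between 0 and 1")

    rd = 400 * math.log10(mu / (1 - mu))
    rd_plus = 400 * math.log10((mu + k * sd) / (1 - mu - k * sd))
    rd_minus = 400 * math.log10((mu - k * sd) / (1 - mu + k * sd))
    e_plus = rd_plus - rd      # > 0
    e_minus = rd_minus - rd    # < 0
    e_mean = (e_plus - e_minus) / 2

    # Closed form of <e>, obtained from the properties of logarithms.
    e_closed = 200 * math.log10(
        ((mu + k * sd) * (1 - mu + k * sd))
        / ((mu - k * sd) * (1 - mu - k * sd))
    )
    return rd, e_plus, e_minus, e_mean, e_closed

rd, e_plus, e_minus, e_mean, e_closed = elo_interval(40, 30, 30)
print(f"rd = {rd:.2f}, e(+) = {e_plus:.2f}, e(-) = {e_minus:.2f}")
print(f"<e> = ±{e_mean:.2f} (closed form: ±{e_closed:.2f})")
```

The last line prints <e> computed both ways, so any disagreement between the averaged form and the closed form would be visible at once.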
Regards from Spain.
Ajedrecista.