Chess Statistics

Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

Laskos wrote:
Edmund wrote:...

Let's calculate an example, and please tell me where I am going wrong:

Test A: 1/1/0
Test B: 1/2/0

A)
win_prob = 1 / 4
draw_prob = 1 / 2

P(score_dif = 0) = N! / (W! D! L!) win_prob^W draw_prob^D win_prob^L
= 2! / (0! 2! 0!) 0.25^0 * 0.5^2 * 0.25^0 = 0.5^2 = 0.25
P(score_dif = 1) = 2! / (1! 1! 0!) 0.25^1 * 0.5^1 * 0.25^0 = 2 * 0.25 * 0.5 = 0.25

LOS ( A ) = 1 - (P(score_dif = 0) + P(score_dif = 1) /2) = 62.5%

B)
win_prob = 1 / 6
draw_prob = 2 / 3

P(score_dif = 0) = 3! / (0! 3! 0!) (1/6)^0 * (2/3)^3 * (1/6)^0 = (2/3)^3 = 0.296296296
P(score_dif = 1) = 3! / (1! 2! 0!) (1/6)^1 * (2/3)^2 * (1/6)^0 = 3 * (1/6) * (2/3)^2 = 0.222222222

LOS ( B ) = 1 - (P(score_dif = 0) + P(score_dif = 1) /2) = 59.26%

---

So the LOS for Test B is lower although the score difference is equal.
Sorry, I didn't quite get your win_prob and draw_prob. Besides that, your
P(score_dif = 0) = N! / (W! D! L!) win_prob^W draw_prob^D win_prob^L already assumes a normal distribution. If you took a sum over trinomials, that would get you to the correct result. Try comparing it with the precise formula.

Kai
Let's take Example B)
W = 1
D = 2
L = 0

So, having just this prior information, one would assume a draw ratio of 2/3 between those two engines.
That means 1/3 of the games have to be non-draws.
As the calculation assumes two equally strong opponents, the win ratio for each of them is the remaining 1/3 divided by 2 = 1/6.

In what sense does P(score_dif = 0) assume a normal distribution? I was giving the formula for a trinomial distribution.


Regarding Test A, where N = 2:

I see now that I made a mistake in the calculation. The correct version should be:

P(score_dif = -2) =
P(W=0 D=0 L=2) = 2! / (0! 0! 2!) * 0.25^2 = 0.25^2

P(score_dif = -1) =
P(W=0 D=1 L=1) = 2! / (0! 1! 1!) * 0.5 * 0.25 = 0.25

P(score_dif = 0) =
P(W=0 D=2 L=0) = 2! / (0! 2! 0!) * 0.5^2 = 0.25
+P(W=1 D=0 L=1) = 2! / (1! 0! 1!) * 0.25^2 = 2 * 0.25^2

P(score_dif = 1) =
P(W=1 D=1 L=0) = 2! / (1! 1! 0!) * 0.25 * 0.5 = 0.25

P(score_dif = 2) =
P(W=2 D=0 L=0) = 2! / (2! 0! 0!) * 0.25^2 = 0.25^2

the sum of all those = 1:
0.25^2 + 0.25 + 0.25 + 2 * 0.25^2 + 0.25 + 0.25^2 = 1

and to get the LOS =
P(score_dif = -2) + P(score_dif = -1) + P(score_dif = 0) + P(score_dif = 1) / 2 = 0.25^2 + 0.25 + 0.25 + 2 * 0.25^2 + 0.25 / 2 = 81.25%
Last edited by Edmund on Fri Jun 18, 2010 12:04 am, edited 1 time in total.
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

And now the same for Example B)

P(score_dif = -3) =
P(W=0 D=0 L=3) = 3! / (0! 0! 3!) * (1/6)^0 * (2/3)^0 * (1/6)^3 = 1 * (1/6)^3

P(score_dif = -2) =
P(W=0 D=1 L=2) = 3! / (0! 1! 2!) * (1/6)^0 * (2/3)^1 * (1/6)^2 = 3 * (2/3) * (1/6)^2

P(score_dif = -1) =
P(W=1 D=0 L=2) = 3! / (1! 0! 2!) * (1/6)^1 * (2/3)^0 * (1/6)^2 = 3 * (1/6)^3
+P(W=0 D=2 L=1) = 3! / (0! 2! 1!) * (1/6)^0 * (2/3)^2 * (1/6)^1 = 3 * (2/3)^2 * (1/6)

P(score_dif = 0) =
P(W=1 D=1 L=1) = 3! / (1! 1! 1!) * (1/6)^1 * (2/3)^1 * (1/6)^1 = 6 * (1/6)^2 * (2/3)
+P(W=0 D=3 L=0) = 3! / (0! 3! 0!) * (1/6)^0 * (2/3)^3 * (1/6)^0 = 1 * (2/3)^3

P(score_dif = 1) = P(score_dif = -1)
P(score_dif = 2) = P(score_dif = -2)
P(score_dif = 3) = P(score_dif = -3)

and the sum of all these again:
2 * ((1/6)^3 + 3 * (2/3) * (1/6)^2 + 3 * (1/6)^3 + 3 * (2/3)^2 * (1/6)) + 6 * (1/6)^2 * (2/3) + (2/3)^3 = 1

and the LOS = P(score_dif = -3) + P(score_dif = -2) + P(score_dif = -1) + P(score_dif = 0) + P(score_dif = 1) / 2
= 82.17592 %
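
The two hand calculations above can be reproduced with a short enumeration. This is only a sketch of the method as described in these posts: it assumes two equally strong engines, fixes the draw probability at D/N, and counts half of the probability of the observed score difference. The function name `trinomial_los` is made up for illustration.

```python
from fractions import Fraction
from math import factorial

def trinomial_los(wins, draws, losses):
    """LOS as computed in the posts above: equal engines, draw rate fixed at D/N."""
    n = wins + draws + losses
    draw_prob = Fraction(draws, n)
    win_prob = (1 - draw_prob) / 2          # wins and losses share the non-draw mass
    observed = wins - losses                # observed score difference
    los = Fraction(0)
    for w in range(n + 1):
        for l in range(n + 1 - w):
            d = n - w - l
            coef = factorial(n) // (factorial(w) * factorial(d) * factorial(l))
            p = coef * win_prob**w * draw_prob**d * win_prob**l
            if w - l < observed:
                los += p                    # outcomes strictly below the observed difference
            elif w - l == observed:
                los += p / 2                # half weight on the observed difference itself
    return los

print(float(trinomial_los(1, 1, 0)))  # 0.8125
print(float(trinomial_los(1, 2, 0)))  # about 0.82176
```

Exact fractions are used so that the results can be compared digit for digit with the hand-computed 81.25% and 82.17592%.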
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

Edmund wrote:
Laskos wrote:
Edmund wrote:...

Let's calculate an example, and please tell me where I am going wrong:

Test A: 1/1/0
Test B: 1/2/0

A)
win_prob = 1 / 4
draw_prob = 1 / 2

P(score_dif = 0) = N! / (W! D! L!) win_prob^W draw_prob^D win_prob^L
= 2! / (0! 2! 0!) 0.25^0 * 0.5^2 * 0.25^0 = 0.5^2 = 0.25
P(score_dif = 1) = 2! / (1! 1! 0!) 0.25^1 * 0.5^1 * 0.25^0 = 2 * 0.25 * 0.5 = 0.25

LOS ( A ) = 1 - (P(score_dif = 0) + P(score_dif = 1) /2) = 62.5%

B)
win_prob = 1 / 6
draw_prob = 2 / 3

P(score_dif = 0) = 3! / (0! 3! 0!) (1/6)^0 * (2/3)^3 * (1/6)^0 = (2/3)^3 = 0.296296296
P(score_dif = 1) = 3! / (1! 2! 0!) (1/6)^1 * (2/3)^2 * (1/6)^0 = 3 * (1/6) * (2/3)^2 = 0.222222222

LOS ( B ) = 1 - (P(score_dif = 0) + P(score_dif = 1) /2) = 59.26%

---

So the LOS for Test B is lower although the score difference is equal.
Sorry, I didn't quite get your win_prob and draw_prob. Besides that, your
P(score_dif = 0) = N! / (W! D! L!) win_prob^W draw_prob^D win_prob^L already assumes a normal distribution. If you took a sum over trinomials, that would get you to the correct result. Try comparing it with the precise formula.

Kai
Let's take Example B)
W = 1
D = 2
L = 0

So, having just this prior information, one would assume a draw ratio of 2/3 between those two engines.
That means 1/3 of the games have to be non-draws.
As the calculation assumes two equally strong opponents, the win ratio for each of them is the remaining 1/3 divided by 2 = 1/6.
I do not see a reason why you shouldn't use the information that 1/3 were wins and 0 were losses. If you do not take this into account, your score is 0.5 with different error margins.

In what sense does P(score_dif = 0) assume a normal distribution? I was giving the formula for a trinomial distribution.
In the sense that you are adding or subtracting probabilities for related results.

Regarding Test A, where N = 2:

I see now that I made a mistake in the calculation. The correct version should be:

P(score_dif = -2) =
P(W=0 D=0 L=2) = 2! / (0! 0! 2!) * 0.25^2 = 0.25^2

P(score_dif = -1) =
P(W=0 D=1 L=1) = 2! / (0! 1! 1!) * 0.5 * 0.25 = 0.25

P(score_dif = 0) =
P(W=0 D=2 L=0) = 2! / (0! 2! 0!) * 0.5^2 = 0.25
+P(W=1 D=0 L=1) = 2! / (1! 0! 1!) * 0.25^2 = 2 * 0.25^2

P(score_dif = 1) =
P(W=1 D=1 L=0) = 2! / (1! 1! 0!) * 0.25 * 0.5 = 0.25

P(score_dif = 2) =
P(W=2 D=0 L=0) = 2! / (2! 0! 0!) * 0.25^2 = 0.25^2

the sum of all those = 1:
0.25^2 + 0.25 + 0.25 + 2 * 0.25^2 + 0.25 + 0.25^2 = 1

and to get the LOS =
P(score_dif = -2) + P(score_dif = -1) + P(score_dif = 0) + P(score_dif = 1) / 2 = 0.25^2 + 0.25 + 0.25 + 2 * 0.25^2 + 0.25 / 2 = 81.25%

Sorry, I did not quite follow, but if you made the number of draws the only sure thing, then I do not know what to tell you.

You seem to have added P(score_dif = 1)/2, whereas you should have added P(score_dif = 0)/2 only. I understand, the score was +1, but you are using this "a posteriori" rather than "a priori", when all the probabilities are already calculated for equal engines. By the score they are not equal, the probabilities are not equal, the error intervals are not equal, the LOS is not equal.

Try to check with my precise formula, I am even curious, but have to go to sleep (sometime in the future :) )

I am very skeptical about your 81.25% LOS for 1/1/0. That is really odd.

Kai
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

The precise formula for 1/1/0 gives

1 - (Binomial[3, 0] + Binomial[2, 0])/2^3=0.75 LOS

More reasonable.

Kai

P.S. We really should not do tests on 2 games. As my old university professor of Thermodynamics and Statistics told me, there are no statistics below 9.

Kai
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Chess Statistics

Post by Milos »

Laskos wrote:The precise formula for 1/1/0 gives

1 - (Binomial[3, 0] + Binomial[2, 0])/2^3=0.75 LOS
OK, enough of that "precise" formula mumbo jumbo.
There cannot be a precise formula for trinomial statistics using binomial formulas. The draws do have an impact, even though it is very small, and that evidently makes you believe your formula is correct. It's not.
It's just a simplified case where you assume there are no draws and the probability of a win or a loss is exactly 0.5.

Just to demonstrate that your "precise" formula is actually inaccurate, take the following example +38/=32/-30.
Using your formula you get LOS=83.22%
By using Edmund's normal distribution approximation, you get LOS=83.40%, while the exact value (from the LOS table accurately calculated following the only true trinomial formula for the fixed draw rate of 32%) is LOS=83.34%.

As you can see, your "precise" formula is less accurate than Edmund's normal approximation.
Now, if you still don't believe it, run a couple of million MC (Monte Carlo) iterations to get the appropriate accuracy. I really have nothing more to say...
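
A Monte Carlo run of the kind suggested here might look like the sketch below. It assumes the same model as the trinomial calculation earlier in the thread (equal engines, draw rate fixed at the observed D/N, half weight on ties with the observed score difference). The function name `mc_los`, the seed and the trial count are illustrative choices, and the trial count is kept far below "a couple of million" so that it finishes quickly.

```python
import random

def mc_los(wins, draws, losses, trials=20_000, seed=1):
    """Estimate LOS by simulating matches between two equal engines."""
    rng = random.Random(seed)               # fixed seed for repeatable runs
    n = wins + draws + losses
    draw_prob = draws / n
    win_prob = (1 - draw_prob) / 2
    observed = wins - losses
    hits = 0.0
    for _ in range(trials):
        # Each game contributes +1 (win), 0 (draw) or -1 (loss) to the score difference.
        diff = sum(rng.choices((1, 0, -1),
                               weights=(win_prob, draw_prob, win_prob),
                               k=n))
        if diff < observed:
            hits += 1.0
        elif diff == observed:
            hits += 0.5                     # half weight on the observed difference
    return hits / trials

print(mc_los(38, 32, 30))
```

With only 20,000 trials the standard error is roughly 0.3 percentage points, so separating values that differ in the second decimal place would indeed need millions of iterations.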
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Chess Statistics

Post by Milos »

Just to give a little secret, your precise formula is nothing but:
(w+l)!/w!/l!*0.5^w*0.5^l where you take values for which w-l is bigger than the score.
For the case when w-l is exactly equal to the score you take 1/2 of the value making the simplest interpolation.
So your "precise" formula is exactly equivalent to:
LOS = 1 - (binomial(a+c,0) + binomial(a+c,1) + ... + binomial(a+c,c-1) + 0.5*binomial(a+c,c)) / 2^(a+c)

Please try to come up with something more original next time ;).
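
The binomial expression in this post can be evaluated directly. A minimal sketch, assuming a is the number of wins and c the number of losses as in the +a =b -c notation used elsewhere in the thread; the function name `binomial_los` is invented for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_los(a, c):
    """1 - (C(n,0) + ... + C(n,c-1) + 0.5*C(n,c)) / 2^n with n = a + c."""
    n = a + c
    # Full weight for k < c, half weight for k == c.
    tail = sum(comb(n, k) for k in range(c)) + Fraction(comb(n, c), 2)
    return 1 - tail / 2**n

print(binomial_los(1, 0))   # 3/4
print(binomial_los(5, 5))   # 1/2
```

For 1/1/0 this gives the same 0.75 that was posted earlier for that result, and equal win and loss counts give exactly one half.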
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Chess Statistics

Post by Edmund »

Laskos wrote:
Edmund wrote:
Laskos wrote:
Edmund wrote:...

Let's calculate an example, and please tell me where I am going wrong:

Test A: 1/1/0
Test B: 1/2/0

A)
win_prob = 1 / 4
draw_prob = 1 / 2

P(score_dif = 0) = N! / (W! D! L!) win_prob^W draw_prob^D win_prob^L
= 2! / (0! 2! 0!) 0.25^0 * 0.5^2 * 0.25^0 = 0.5^2 = 0.25
P(score_dif = 1) = 2! / (1! 1! 0!) 0.25^1 * 0.5^1 * 0.25^0 = 2 * 0.25 * 0.5 = 0.25

LOS ( A ) = 1 - (P(score_dif = 0) + P(score_dif = 1) /2) = 62.5%

B)
win_prob = 1 / 6
draw_prob = 2 / 3

P(score_dif = 0) = 3! / (0! 3! 0!) (1/6)^0 * (2/3)^3 * (1/6)^0 = (2/3)^3 = 0.296296296
P(score_dif = 1) = 3! / (1! 2! 0!) (1/6)^1 * (2/3)^2 * (1/6)^0 = 3 * (1/6) * (2/3)^2 = 0.222222222

LOS ( B ) = 1 - (P(score_dif = 0) + P(score_dif = 1) /2) = 59.26%

---

So the LOS for Test B is lower although the score difference is equal.
Sorry, I didn't quite get your win_prob and draw_prob. Besides that, your
P(score_dif = 0) = N! / (W! D! L!) win_prob^W draw_prob^D win_prob^L already assumes a normal distribution. If you took a sum over trinomials, that would get you to the correct result. Try comparing it with the precise formula.

Kai
Let's take Example B)
W = 1
D = 2
L = 0

So, having just this prior information, one would assume a draw ratio of 2/3 between those two engines.
That means 1/3 of the games have to be non-draws.
As the calculation assumes two equally strong opponents, the win ratio for each of them is the remaining 1/3 divided by 2 = 1/6.
I do not see a reason why you shouldn't use the information that 1/3 were wins and 0 were losses. If you do not take this into account, your score is 0.5 with different error margins.
It seems we are talking about completely different things, then. For me, as stated, LOS is the likelihood that two equally strong engines reach a certain score.

Laskos wrote:

In what sense does P(score_dif = 0) assume a normal distribution? I was giving the formula for a trinomial distribution.
In the sense that you are adding or subtracting probabilities for related results.

Regarding Test A, where N = 2:

I see now that I made a mistake in the calculation. The correct version should be:

P(score_dif = -2) =
P(W=0 D=0 L=2) = 2! / (0! 0! 2!) * 0.25^2 = 0.25^2

P(score_dif = -1) =
P(W=0 D=1 L=1) = 2! / (0! 1! 1!) * 0.5 * 0.25 = 0.25

P(score_dif = 0) =
P(W=0 D=2 L=0) = 2! / (0! 2! 0!) * 0.5^2 = 0.25
+P(W=1 D=0 L=1) = 2! / (1! 0! 1!) * 0.25^2 = 2 * 0.25^2

P(score_dif = 1) =
P(W=1 D=1 L=0) = 2! / (1! 1! 0!) * 0.25 * 0.5 = 0.25

P(score_dif = 2) =
P(W=2 D=0 L=0) = 2! / (2! 0! 0!) * 0.25^2 = 0.25^2

the sum of all those = 1:
0.25^2 + 0.25 + 0.25 + 2 * 0.25^2 + 0.25 + 0.25^2 = 1

and to get the LOS =
P(score_dif = -2) + P(score_dif = -1) + P(score_dif = 0) + P(score_dif = 1) / 2 = 0.25^2 + 0.25 + 0.25 + 2 * 0.25^2 + 0.25 / 2 = 81.25%

Sorry, I did not quite follow, but if you made the number of draws the only sure thing, then I do not know what to tell you.
First step: describe the curve P(score).
Second step: calculate P(score < x).
Laskos wrote:
You seem to have added P(score_dif = 1)/2, whereas you should have added P(score_dif = 0)/2 only. I understand, the score was +1, but you are using this "a posteriori" rather than "a priori", when all the probabilities are already calculated for equal engines. By the score they are not equal, the probabilities are not equal, the error intervals are not equal, the LOS is not equal.
If you take half the area under the curve of a multinomial distribution, the result will be 0.5.
Laskos wrote:

Try to check with my precise formula, I am even curious, but have to go to sleep (sometime in the future :) )

I am very skeptical about your 81.25% LOS for 1/1/0. That is really odd.

Kai
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

Milos wrote:Just to give a little secret, your precise formula is nothing but:
(w+l)!/w!/l!*0.5^w*0.5^l where you take values for which w-l is bigger than the score.
For the case when w-l is exactly equal to the score you take 1/2 of the value making the simplest interpolation.
So your "precise" formula is exactly equivalent to:
LOS = 1 - (binomial(a+c,0) + binomial(a+c,1) + ... + binomial(a+c,c-1) + 0.5*binomial(a+c,c)) / 2^(a+c)

Please try to come up with something more original next time ;).
Your final formula for LOS is not exactly the same as my precise one; please check. Until you understand that LOS does not depend on the number of draws (as your own results already hint!!!), the discussion is useless.

A little more math will help you. Until then, good luck with your 0.32 draws, 0.5 draws or 0 draws.

Kai
Last edited by Laskos on Fri Jun 18, 2010 10:27 am, edited 1 time in total.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

Edmund wrote: It seems we are talking about completely different things, then. For me, as stated, LOS is the likelihood that two equally strong engines reach a certain score.
???
Yes, we are talking about different things. Your thing seems odd to me.
If you take half the area under the curve of a multinomial distribution, the result will be 0.5.
Yes, if 0 means average of the scores with w=l (without a priori knowledge, again, strange), again a misunderstanding. My trinomials are not centered on 0 and my engines are not equal. By the way, are your trinomials symmetric? If w=l, they are.

Kai
Last edited by Laskos on Fri Jun 18, 2010 10:38 am, edited 1 time in total.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Chess Statistics

Post by Laskos »

I will repost in bold the

PRECISE FORMULA FOR LOS

For any match +a =b -c, the formula that gives the exact LOS is:

LOS = 1 - ( binomial( a+c+2,0) + binomial( a+c+2,1) + ... + binomial( a+c+2, c) + binomial(a+c+1,c) ) / 2^(a+c+2)
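
The formula above can be evaluated as written. The sketch below is a direct transcription with exact fractions; the function names are invented for illustration, and the second function is the half-weighted binomial sum quoted earlier in the thread, included only so the two can be compared numerically.

```python
from fractions import Fraction
from math import comb

def laskos_los(a, c):
    """LOS for a match +a =b -c per the formula above; b does not appear in it."""
    n = a + c
    tail = sum(comb(n + 2, k) for k in range(c + 1)) + comb(n + 1, c)
    return 1 - Fraction(tail, 2 ** (n + 2))

def half_weighted_los(a, c):
    """1 - (C(n,0) + ... + C(n,c-1) + 0.5*C(n,c)) / 2^n, as quoted in the thread."""
    n = a + c
    tail = sum(comb(n, k) for k in range(c)) + Fraction(comb(n, c), 2)
    return 1 - tail / 2 ** n

print(laskos_los(1, 0))   # 3/4
for a, c in [(1, 0), (2, 1), (38, 30)]:
    same = laskos_los(a, c) == half_weighted_los(a, c)
    print(a, c, float(laskos_los(a, c)), same)
```

On these inputs the two expressions return identical fractions, which is what applying Pascal's rule to the binomial terms predicts.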


For those without prejudices, it will help a lot.

Kai