Perft(13) betting pool

hgm · Post by **hgm** » Fri Jul 15, 2011 10:13 am

Although I have no doubt that starting sampling from the root will have the correct expectation value, I don't think it will have optimal variance. If I take 8000 random paths from the root, they will sample each perft(3) position on average once. But because not every perft(3) position has the same size sub-tree, sampling here will cause some extra error on top of the errors in the sub-tree counts. While it does not buy you anything: you might as well have used each perft(3) position exactly once, if you are doing 8000 samples anyway.

For this reason I started with the 119M perft(6) positions when I want to do ~100M samples, and then sample those (on average) only once.

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 2:35 pm

Yes. I think sampling each perft(3) position is a little better than sampling 8000 root positions due to the extra error introduced by sampling those three plies. This is very clear with the heavy LMR since if the last move at the root get sampled it will be searched with a depth of 1 for a perft(20) and if it was the first one it will have depth=20.
But for regular perft with uniform tree this method seems to be so accurate as to be a potential mood killer for perft generators. If it is really an unbiased estimate and with a good PRNG, it should give very close estimates as seems to be the case with perft(12) result of Peter.

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 2:45 pm

perft(20) with LMR

Code: Select all

perft 20
nodes 1.27406e+007
time 3.09 sec
perft 20
nodes 1.28171e+007
time 3.13 sec
perft 20
nodes 1.3428e+007
time 3.20 sec

Starting from depth 10 with uniform PRNG

Code: Select all

perft 20
nodes 9.21666e+006
time 1.81 sec
perft 20
nodes 9.47911e+006
time 1.84 sec
perft 20
nodes 9.84022e+006
time 1.91 sec

Starting from depth 10 with biased PRNG

Code: Select all

perft 20
nodes 1.07089e+007
time 1.84 sec
perft 20
nodes 1.07698e+007
time 1.88 sec
perft 20
nodes 1.11844e+007
time 1.91 sec

And the terrible result with depth=0 and biased PRNG

perft 20
nodes 8400
time 0.00 sec
perft 20
nodes 7600
time 0.00 sec
perft 20
nodes 400
time 0.00 sec

depth = 3

Code: Select all

nodes 2.31382e+006
time 0.00 sec
perft 20
nodes 793583
time 0.00 sec
perft 20
nodes 9.4891e+006
time 0.00 sec

depth = 5

Code: Select all

perft 20
nodes 1.45449e+007
time 0.06 sec
perft 20
nodes 3.17009e+006
time 0.06 sec
perft 20
nodes 5.00636e+006
time 0.06 sec
perft 20
nodes 3.46153e+006
time 0.06 sec

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 4:02 pm

The question is if we can use these sequences to have an unbiased estimate for perft(13)

E(p(1)*p(2)*p(3)*p(4)...*p(13)) or in the specific case the derived claim is that
sum of perft(3) values of (p(4)*...p(13) for all the possible first 3 plies) is a variable that get expected value of perft(13) and we can use many samples to get a good estimate for the standard deviation of it.

Note that I found that the expected value of the multiplication on a random game is a biased estimate but practically we do not generate a random game by generating random moves and it seems at least based on one example that I tried that it is unbiased estimate but we need to prove it.

I haven't looked into your method Or peter's proof carefully but you are saying your estimate is biased? If p(1),p(2)..p(n) are all independent then E(XYZ..) = E(X)E(Y)... . Does the proof by induction proof that this relation remains valid even for perft(13) when we have dependent p(i) values ? I am just beginning to look at the proof so excuse my late jump in...
Edit: My doubt is if your estimate is biased ,i.e the one you get by doing random games, this one should also be biased because we are doing unfinished games of 13 ply.

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 5:04 pm

My attempt to understand the proof.

Test for smallest perft(1) :
Good since we have have independent p(1) and p(2)

Induction: Lets do this with 3 ply perft
Assume E(XYZ) = E(X)E(Y)E(Z) i.e X , Y, Z are independent

Prove E(XYZD) = E(X)E(Y)E(Z)E(D) where D is dependent .

E(XYZD) = E(XYZ_1) + E(XYZ_2) + E(XYZ_3) + ..... + E(XYZ_D)
where E(XYZ_i) repersents perft(3) after making the i_th move

To finish the proof E(XYZ_1) = E(XYZ_2) = E(XYZ_3) ...should all be equal.
But that may not be the case since each move may produce significantly different sub-trees...
The fact that each of the perft(3) values are independent "with in" i.e E(XYZ) = E(X)E(Y)E(Z) didn't help us here.
If there is independence between D and XYZ's (perft(3)s) then E(XYZD) = E(XYZ)E(D) = D* E(XYZ).
What am I missing ?

Edit:I think if it is indeed an unbiased estimate we should also get the same thing for an LMR perft.
Peter got a 30% off expected value for perft(20) after taking many samples.
I don't think this will improve much with more number of samples.

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 7:48 pm

I am now convinced the it is a biased estimate in principle. For perft it probably is very close to being unbiased because of the uniform tree where
E(XYZ_1), E(XYZ_2)... are more or less equal and since we are using a uniform PRNG to sample games from each subtree.
In general, the PRNG used should produce random numbers proportional to the size of the subtree i.e E(XYZ_i) at each node to get an unbiased estimate. So we can only prove that there exists a set of weights for the PRNG for each move in the tree that will give us an unbiased estimate. The LMR version will definitely give a biased estimate if we still use uniform PRNG.

Michel · Post by **Michel** » Fri Jul 15, 2011 8:15 pm

Edit:I think if it is indeed an unbiased estimate we should also get the same thing for an LMR perft.
Peter got a 30% off expected value for perft(20) after taking many samples.
I don't think this will improve much with more number of samples.

Isnt it clear that the estimate is unbiased? Inductively you are trying to estimate a sum X=X_1+...+X_n (the X_i are constant here) given unbiased estimators x_i for each of the X_i (the x_i are true random variables). If I have read correctly the method is to take i random between 1 and n and to take x=nx_i as the estimate. So the average estimate will be

(1/n)(nx_1+...+nx_n)=x_1+...+x_n

which is an unbiased estimator for X_1+...+X_n.

The standard deviation of x can be computed similarly. It contains
a part given by the standard deviation of (X_1,...,X_n) as
well as a part given by the standard deviation of each of the x_i.

In a non uniform tree the standard deviation of (X_1,...,X_n)
will dominate and this will propagate recursively. So one needs more
samples.

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 8:28 pm

I admit that I am getting a little confused. It seems the uniform PRNG is unbiased but can give poor results. The biased PRNG on the other hand gives better estimate but could be biased ??

Eg. Lets take subtrees of size 1,1,1,1,1,40 for each of our 6 moves. We should use 1/6 for sampling each move.

What is causing confusion (for me was that), perft counts are average values but with min-max and alpha-beta, we are taking
Max(1,1,1,1,1,40) = 40 as expected result, if those are scores instead of node counts. So the 40 should be sampled more often than others

hgm · Post by **hgm** » Fri Jul 15, 2011 8:40 pm

When you use biased sampling, you should of course correct for the bias to get an unbiased estimate. E.g. if you would sample the 1,1,1,1,1,40 node with probabilities 5 x 1/12 and 1 x 1/2, you should not multiply the sampled value always by 6 (i.e. use 6,6,6,6,6,480 as estimate for the node total, as you would with uniform sampling), but by 12 or 2 (i.e. use 12,12,12,12,12,80 as estimate for the node total).

Daniel Shawul · Post by **Daniel Shawul** » Fri Jul 15, 2011 8:52 pm

Indeed, since it is a sum we should use uniform PRNG. All that discussion we had with an LMR perft is moot

Somehow the way it works in alpha-beta with maximizing and minizing the scores got stuck in my head. Anyway glad to know it is unbiased for all kinds of perft estimation.
I still have to look into why Uri's method gives a biased estimate.

Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool

Re: Perft(13) betting pool