Perft(13) betting pool

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
Daniel Shawul
Posts: 3498
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: Perft(13) betting pool

Post by Daniel Shawul » Sun Jul 17, 2011 11:16 am

Michel wrote:
If it was another operation such as maximizing, minimizing or a product then it wouldn't have worked.
Max and min are unfortunately extremely tricky functions to treat statistically. Even if X_1,...,X_n are uniformly distributed (and independent), the distribution of max(X_1,....,X_n) and min(X_1,....,X_n) will be very skew.
Yes but a seemingly non-harmful operation such as averaging the branching factors at the _root_ instead of the nodes themselves gives very bad results. This does not add skeweness although I suspect even the regular nodes count will be a little skewed due to the possiblity of being log-normal (didn't test it yet).
I made the following test keeping everything inside the tree the same. The difference is which quantity I average : EBF or nodes count. In the EBF case , I determine the nodes count by raising to 13. And in the second case I do the opposite. You can see the EBF average gives different results and I don't think it will change with more games.

Code: Select all

perft 13
rounds 1000000
EBFAverage    24.2935 1.02652e+018
DirectAverage 25.5528 1.9802e+018
time 51.05 sec
So I doubt averaging the branching factors inside tree in anyway will ever give me satisfactory results, if averaging only at the root gives bad results.

Daniel Shawul
Posts: 3498
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: Perft(13) betting pool

Post by Daniel Shawul » Sun Jul 17, 2011 4:26 pm

Here is the probability distribution curve. It is far from normal as I suspected.
It is closer to a skewed log-normal or maybe even Pareto distribution.
This is from a sample of 1000000 games. The arithmetic mean gives 1.98e18 and geometric mean is about 1e18. For a log-normal distribution, the geometric mean is equal to the median but not the mean. The mean and median are close but not the same unlike the case of a normal distribution. I will add mean and median vertical lines to show this.
Image
Edit:
And here is the cute normal distribution of the log of the nodes count.
Well it is clear now it is indeed log-normal.
Image

Michel
Posts: 1965
Joined: Sun Sep 28, 2008 11:50 pm

Re: Perft(13) betting pool

Post by Michel » Mon Jul 18, 2011 2:31 am

Very nice.

Knowing that the estimator is log normally distributed around
the true node count should make it possible to give more accurate confidence intervals.

The fact that the distribution is log normal is not entirely
unexpected since we are looking at a product of random variables (BF's).
These branching factors are apparently sufficiently independent
so that the law or large numbers is valid.

EDIT

I still have to implement a perft function in GnuChess. That will be
a good opportunity to implement a mcperft (Monte Carlo Perft) function
as well.

Daniel Shawul
Posts: 3498
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: Perft(13) betting pool

Post by Daniel Shawul » Mon Jul 18, 2011 11:38 am

Yes the confidence interval should be skewed to the right as well and not given as a +/-. Well it is small anyway so it
may not matter much. Also I was getting the exact median of the distribution by taking geometric average of the BF's,
that had a value of exp(mean of logs). _Unlike_ most lognormal distributions the mean happens to be important than the median in
this particular case. So the calculation should have been exp(log-mean + 0.5 * log-variance). I missed the mean-median mapping before but now at least I am getting
close results after using the second formula at the _root_. I am trying to propagate the variance within the tree but so far
I am getting bad results. It is clear the variance of BFs increases at each ply since equal number of samples are taken for an
exponentially growing tree.

IMO the proof by induction hides some stuff.A pure MC simulation always averages the reward at the tips of the tree.
In this case the reward happens to be exactly the inverse of the probablity of that path being selected (which is very
unique!). I did another MC simulation with a Material + PST reward at the tips and then min-maxed it, the result was the
expected outcome of a random play i.e average of scores at the tip. The pdf follows the score distribution at the tips, which
for the opening position is "somewhat" normal with a mean close to 0.3. If I start biasing the move selection, then
I get more hits towards the better scores at the tips, with a very skewed distribution.

Michel
Posts: 1965
Joined: Sun Sep 28, 2008 11:50 pm

Re: Perft(13) betting pool

Post by Michel » Mon Jul 18, 2011 1:21 pm

I wonder if this nice log normal law persists for positions from other game stages (middle game, end game). Also does it hold for longer searches, like perft(20), perft(30) etc...?

Daniel Shawul
Posts: 3498
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: Perft(13) betting pool

Post by Daniel Shawul » Mon Jul 18, 2011 2:49 pm

I tested perft(30) and it is still log-normal. The only problem is I used 1 million games and this gives big variance. I can not use more than that because I am using excel for data analysis . And this puts the mean far from the median that I can not make a plot of the log-normal curve which shows the mean (about 2.79e44). Other than that the log transformed data still shows a good normal distribution. I don't think it will change for middle game and endgame positions either but I will try anyway.

User avatar
Ajedrecista
Posts: 1367
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.
Contact:

Re: Perft(13) betting pool

Post by Ajedrecista » Wed Jul 20, 2011 4:44 pm

Hi everyone!

I also made my own estimate on Perft(13) because it seems that everybody has made his/her own one!

I am not so smart for thinking in methods like Muller, Blass/Schüle, Shawul or Österlund (for example), so my estimate is another Hocus-Pocus: my number is, more less: 1.980465·10^(18) ± 0.013%. Here is the PDF file where I explain my 'method'... if it can be named in this way (I guess no):

http://www.megaupload.com/?d=RH1CQTPD

I previously did another estimate which I think it is wrong... but was the start of my second estimate. Here is the PDF, just for the record:

http://www.megaupload.com/?d=ZUJWWQWF

In the wrong estimate, I saw a pattern involving my parameters alpha and beta, but for some reason I did not care that alpha < 0 and beta < 0 when n grows (well, I saw it but did not work on it). This is why I made an improved estimate that now fits more less in your intervals or given values.

My 'method' is not a good one because it is useful only with odd values for n (it is somewhere incomplete, but I am not so clever for thinking a better one). Besides, it is Hocus-Pocus because I adjusted alpha and beta with a cubic function... but why cubic? Because I wanted a polynomial of high grade, but the maximum grade possible with four known points (n = 5, 7, 9 and 11) was 4 - 1 = 3... this is why I chose a cubic function.

If I am not wrong, dates of estimates were 6th July and 19th July; I published them at LiChess site before my account was activated in TalkChess:

http://es.lichess.org/forum/general-che ... n/perft-13

I say this because I linked in LiChess forum to some threads here in TalkChess... I hope this crossposting is not illegal; if not, I am sorry. You can comment on my second estimate, or LOL with my 'no-method'... whatever you want.

Regards from Spain.

Ajedrecista.

User avatar
Ajedrecista
Posts: 1367
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.
Contact:

Re: Perft(13) betting pool

Post by Ajedrecista » Fri Jul 22, 2011 11:15 am

Hi again:

There is a typo in both PDF's I uploaded: 'alpha(n) > beta(n)' is wrong; right is: '|alpha(n)| > |beta(n)|' (with absolute values). This typo does not change my final estimate:

Perft(13) ~ 1.980465e+18 ± 0.013%.

I will not upload another PDF (two are enough, IMO). I include this correction in a Notepad in the same folder as PDF's... you can do the same if you want.

Thanks for read it and sorry for the typo.

Regards from Spain.

Ajedrecista.

Daniel Shawul
Posts: 3498
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: Perft(13) betting pool

Post by Daniel Shawul » Fri Jul 22, 2011 2:00 pm

Hi Ajedrecista
Thanks for the detailed explanation of your method.
I took a quick look at it and it seems that your arithmetic and geometric averages gives close results. My estimate of the geometric average is usually very far from the arithmetic one. Also how did you come up with the
formula you used to calculate the ratio of BFs at odd and even plies. I understand it gives you a decreasing trend for both odd and even plies, so I was wondering if there is a theory behind it.
Thanks
Daniel

Daniel Shawul
Posts: 3498
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: Perft(13) betting pool

Post by Daniel Shawul » Fri Jul 22, 2011 2:21 pm

I tested other positions as well and it seems the trend is there all the time.
Anyway here is a method to estimate the perft using this lognormal _assumption_. It can give very bad results when this assumption does not hold like in perft(3)/perft(4) which is better represented by a normal distribution. In that case it gives twice the exact value! The reason is that the upper two branching factors are constant i.e 20, so perft(3) will be the average of the BF at ply 3.

Anyway here is a method which beats around the bush instead of calculating the arithmetic mean directly. It is clearly inferior and only reason I am doing this is to finish off what I started. It gives an estimate 2x less than correct value for perft(20) and above..
perft 13
20 +/- 1.000 20 [1000000 : 1000000]
20 +/- 1.000 400 [1000000 : 1000000]
22.0222 +/- 1.151 8808.89 [1000000 : 1000000]
21.9018 +/- 1.171 192930 [1000000 : 1000000]
23.8293 +/- 1.218 4.59739e+006 [999961 : 1000000]
23.5948 +/- 1.254 1.08475e+008 [999909 : 1000000]
25.2617 +/- 1.275 2.74025e+009 [999834 : 1000000]
24.9833 +/- 1.309 6.84606e+010 [999734 : 1000000]
26.401 +/- 1.318 1.80743e+012 [999633 : 1000000]
26.1184 +/- 1.348 4.72072e+013 [999501 : 1000000]
27.3091 +/- 1.356 1.28919e+015 [999357 : 1000000]
27.0343 +/- 1.380 3.48522e+016 [999209 : 1000000]
28.0532 +/- 1.385 9.77716e+017 [999061 : 1000000]
SUM 25.6741 2.10595e+018
nodes 2.10595e+018 YAE (yet another estimate)
time 49.34 sec
Proceduce:

a) Take natuaral log of legal moves at each ply and keep mean and variance:

Code: Select all

        log&#40;nmoves&#41;;
        mean
        variance
        
b) For each ply

Code: Select all

    Mean BF&#58;  exp&#40;mean&#41;
    Std BF&#58; exp&#40;sqrt&#40;variance&#41;)
    
c) Multiply and get geometric mean of perft(13)
The relative error will be _added_ when multiplying so.

Code: Select all

    error += &#40;std / mean&#41;
    
Uing lognormal assumption calculate a multiplying factor
and multiply the geometric mean to get the arithmetic average.

Code: Select all

    double mm = &#40;1 + sqrt&#40;1 + 4 * error * error&#41;) / 2.0;
    return mm * nodes;
    

Post Reply