@Steven,
Did you compute perft(12) by breaking them down to perft(8)? Can I have the perft(8) results if that is the case? I want to use them for my perft estimations.
Perft(13) betting pool
Moderators: hgm, Rebel, chrisw
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: A few more records
I think the line for the variance must be like this:Sven Schüle wrote:Hmm, it seems it is close to impossible to detect that equivalence since my implementation is wrong I ignored the fact that variance(n) refers to mean(n) but variance(n+1) must refer completely to mean(n+1). Therefore it does not work the way I did it.Sven Schüle wrote:Hi Michel,Michel wrote:GNU Chess 5.07.172b http://emis.uhasselt.be/cgi-bin/gitweb. ... ;a=summary Peter1 method with fw=4, 501e6 nodes.
sdn=sd*sqrt(nodes) is the normalized standard deviation
mean=1.540774e+14 sd=8.298100e+09 [ 1.540564e+14 <= Perft(10) <= 1.540985e+14 (99% conf.) ] nodes=501262105 sdn=1.857852e+14 time=1931.75s
in the latest snapshot of GNU Chess I see this code in cmd.c (function cmd_perftmc) for incremental update of mean + variance:While I perfectly agree to the mean calculation I am not sure about the variance, could you please explain how that works? One point is that I don't know how you get around the division by 0 in the "variance=..." line for the first sample, i.e. when "n++" sets n to 1. But I am also asking since I get different (significantly higher) values for the same "Peter1" method in my program. I thought that the variance of a set X of N samples were defined by a formula like this (note the division by N-1, not N):Code: Select all
n++; delta=perft-mean; mean=mean+delta/n; M2+=delta*(perft-mean); variance=M2/((n-1)*n);
and therefore I use the following code for incremental update of mean and variance (translated to GNU Chess variable names):Code: Select all
variance(X, N) := sum(i=1..N) ((X(i) - mean(X, N))^2) / (N-1)
I have been unable so far to detect an equivalence between your implementation and mine. So may I ask you what you think is right, and what is wrong?Code: Select all
mean = (n * mean + perft) / (n + 1); variance = (n == 0) ? 0.0 : ((n - 1) * variance + pow(perft - mean, 2)) / n; ++n;
Sven
I also have an idea why your code will not crash with "division by 0". It is due to the first value of M2 being 0, and the compiler might optimize 0/0 simply into 0 (which is mathematically incorrect, though).
I am still struggling to understand why your formula works ...
Sven
Code: Select all
variance=M2/(n-1);
Your algorithm looks very much like the one under "On-line algorithm" in Wikipedia which refers to D. Knuth and Welford.
With the above correction I get almost identical results for your implementation and for the "perfect" variance that I get if I store all sample values in a vector and calculate mean and variance once in the end.
Sven
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: A few more records
This would be the variance for a single sample (that is an estimator for thevariance=M2/(n-1);
variance of the probability distribution of a sample). To get the variance for the mean we have to make an extra division by n.
You are right about the division by zero. I thought that floating point divisions by zero were always silently ignored. But apparently this is not guaranteed. So I put the test for n>1 a bit earlier in the code to avoid this.
And yes the code comes from Wikipedia!
-
- Posts: 4675
- Joined: Mon Mar 13, 2006 7:43 pm
Re: A few more records
Alas, I tossed the draft 8 records for that run; there were over 70 thousand of them. I do have all of the draft 10 and 11 records, though.Daniel Shawul wrote:@Steven,
Did you compute perft(12) by breaking them down to perft(8)? Can I have the perft(8) results if that is the case? I want to use them for my perft estimations.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: A few more records
I did not find your version of that algorithm in Wikipedia, can you give me a link?Michel wrote:This would be the variance for a single sample (that is an estimator for thevariance=M2/(n-1);
variance of the probability distribution of a sample). To get the variance for the mean we have to make an extra division by n.
You are right about the division by zero. I thought that floating point divisions by zero were always silently ignored. But apparently this is not guaranteed. So I put the test for n>1 a bit earlier in the code to avoid this.
And yes the code comes from Wikipedia!
I also did not find a source that explains that "variance for the mean" := "variance for a single sample" / "number of samples". I would like to understand it, and also why we need the "variance for the mean" in our case, so could you please give me also a link for that part?
Sven
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: A few more records
http://en.wikipedia.org/wiki/Algorithms ... g_varianceI did not find your version of that algorithm in Wikipedia, can you give me a link?
The mean of a sample is an unbiased estimator for the mean (also calledwhy we need the "variance for the mean"
expectation value) of a probability distribution. This is basically
E((X1+...+Xn)/n)=(E(X1)+...+E(Xn))/n=(n*E(X1))/n=E(X1)
To know how good an estimator it is we have to compute its variance.
To this end we use the following rules:
Var(Z/n)=Var(Z)/n^2
and if U V are independent
Var(U+V)=Var(U)+Var(V)
With these rules we can compute the variance of the mean of a sample
Var((X1+...+Xn)/n)=Var(X1+....+Xn)/n^2=n*Var(X1)/n^2=Var(X1)/n.
EDIT:
To clear up some possible confusion. In prior discussions the term "sample"
has sometimes been used for a single measurement. This usage
is correct in digital signal processing but not in statistics where
a sample means a set of measurements. For this reason Peter refers
to a single run of his algorithm as a "cycle".
EDIT2:
Another possible confusion is the formula
Var((X1+...+Xn)/n)=Var(X1)/n.
We don't know Var(X1) but we may estimate it by taking the variance
of the same sample. This involves a division by n-1. So in total we
divide by n*(n-1).
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: A few more records
Thanks Michel, now I understand. But the Wikipedia link you gave is the same from which I quoted some posts above, and there the algorithm only divides by (n-1), not by (n-1)*n. However, after your explanation it is roughly clear for me where the additional division by n comes from.Michel wrote:http://en.wikipedia.org/wiki/Algorithms ... g_varianceI did not find your version of that algorithm in Wikipedia, can you give me a link?
The mean of a sample is an unbiased estimator for the mean (also calledwhy we need the "variance for the mean"
expectation value) of a probability distribution. This is basically
E((X1+...+Xn)/n)=(E(X1)+...+E(Xn))/n=(n*E(X1))/n=E(X1)
To know how good an estimator it is we have to compute its variance.
To this end we use the following rules:
Var(Z/n)=Var(Z)/n^2
and if U V are independent
Var(U+V)=Var(U)+Var(V)
With these rules we can compute the variance of the mean of a sample
Var((X1+...+Xn)/n)=Var(X1+....+Xn)/n^2=n*Var(X1)/n^2=Var(X1)/n.
EDIT:
To clear up some possible confusion. In prior discussions the term "sample"
has sometimes been used for a single measurement. This usage
is correct in digital signal processing but not in statistics where
a sample means a set of measurements. For this reason Peter refers
to a single run of his algorithm as a "cycle".
EDIT2:
Another possible confusion is the formula
Var((X1+...+Xn)/n)=Var(X1)/n.
We don't know Var(X1) but we may estimate it by taking the variance
of the same sample. This involves a division by n-1. So in total we
divide by n*(n-1).
Sven
-
- Posts: 689
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Re: Summary of Monte Carlo results for perft(12) so far
I agree with this. In retrospect, it would have been better to measure the effect of increased full-width search in the HGM and Peter1 methods before spending too much time on the UCT/weighted sampling methods.Daniel Shawul wrote:I think the reason for the sudden jump in performance of full_width +MC is because up to now for depth <= 3, it was same as _uniform_ allocation. But for depth = 5 and more it becomes proportioning by sub-tree nodes count up to leaf nodes. It was a sub-tree count proportioning all along but we didn't notice it because sub-trees are uniform for depth<=3. So with that in mind it can be beaten f.i if I do full-width expansion to depth=5. Depth=6 would require too much memory though , which is a disadvantage for any breadth first searcher. Having said that it can perform bad for a selective tree where the shape of the upper portion of the tree is not related with the sub-tree size. Another idea applicable for the perft(13) competition would be to take the exact perft(12) results for 2-plies from the root and use that to determine the proportions for the monte-carlo simulations. The depth-first search does a static allocation, so a static allocation based on the real sub-tree size may perform better.
-
- Posts: 689
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Re: A few more records
sje wrote:The perft(13) run has been going for just over 12 days and has produced 210 draft ten records. Here are the most recent ten of these:How well do your estimation algorithms perform on the above?Code: Select all
rnbqkbnr/1ppppppp/8/p7/1P6/2P5/P2PPPPP/RNBQKBNR b KQkq - 0 2 10 154068837107809 rnbqkbnr/1ppppppp/8/p7/2P5/7N/PP1PPPPP/RNBQKB1R b KQkq - 0 2 10 126452584569476 rnbqkbnr/1ppppppp/8/p7/8/1P1P4/P1P1PPPP/RNBQKBNR b KQkq - 0 2 10 197935747435700 rnbqkbnr/1ppppppp/8/p7/8/P2P4/1PP1PPPP/RNBQKBNR b KQkq - 0 2 10 163777451315240 rnbqkbnr/1ppppppp/8/p7/2P5/5N2/PP1PPPPP/RNBQKB1R b KQkq - 0 2 10 161971799868266 rnbqkbnr/1ppppppp/8/p7/P7/3P4/1PP1PPPP/RNBQKBNR b KQkq - 0 2 10 166336655168294 rnbqkbnr/1ppppppp/8/p7/8/N3P3/PPPP1PPP/R1BQKBNR b KQkq - 0 2 10 290223899535804 rnbqkbnr/1ppppppp/8/p7/1PP5/8/P2PPPPP/RNBQKBNR b KQkq - 0 2 10 185603264146369 rnbqkbnr/1ppppppp/8/p7/8/3P3N/PPP1PPPP/RNBQKB1R b KQkq - 0 2 10 181649457773953 rnbqkbnr/1ppppppp/8/p7/8/2N1P3/PPPP1PPP/R1BQKBNR b KQkq - 0 2 10 406608149434105
Using "Peter1" with fw=4:sje wrote:The average time for calculating a draft 10 record in the current run is about 5,013 seconds. It would be interesting to see the quality of estimations that can be computed in, say, one percent of the actual full calculation time.
Code: Select all
pos m relerr movGens cycles t
1 154093378234301 1.593e-04 31099978 17 53.073
2 126467517170116 1.181e-04 33530832 21 50.747
3 197915769847456 -1.009e-04 34568046 15 52.250
4 163820486561991 2.628e-04 34776633 17 52.521
5 161965379089192 -3.964e-05 35094578 19 51.665
6 166368914541236 1.939e-04 34560150 17 52.240
7 290318396860821 3.256e-04 32154093 12 52.511
8 185597417904519 -3.150e-05 33747973 17 52.674
9 181579829695344 -3.833e-04 34778534 16 52.431
10 406543544328447 -1.589e-04 31748598 10 52.491
-
- Posts: 134
- Joined: Mon May 16, 2011 6:58 pm
- Location: Denmark
Re: Summary of Monte Carlo results for perft(12) so far
Here is a idea to create upper limit:
Make an engine only using AB , and then search to depth 13, call it x
Then because AB can prune down to sqrt(n) , where n is the number of nodes searched in minimax
So , the upper limit is x*x !!
Now, i only need to find lower limit, then i will place my bet.
I dont know if this idea was written before, i didnt look through the 51 pages
Make an engine only using AB , and then search to depth 13, call it x
Then because AB can prune down to sqrt(n) , where n is the number of nodes searched in minimax
So , the upper limit is x*x !!
Now, i only need to find lower limit, then i will place my bet.
I dont know if this idea was written before, i didnt look through the 51 pages