Made a mistake with bulk node counting re-running now ...
But I don't think it is making much of a difference but who knows.
Perft(13) betting pool
Moderators: hgm, Rebel, chrisw
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
Here is the new result. The SD is slightly higher
I don't know if there are still mistakes in my counting so this result should be confirmed by someone else and taken with a grain of salt till then...
SD = (7.111e12 / 6.287e16) * 100 = 0.0113% much smaller than 0.017% for the same full width expansion. It also beats the depth=5 results.
And log file here
Code: Select all
499M 6.286812e+016 +- 7.128241e+012 499590698 5.92sec
501M 6.286848e+016 +- 7.111823e+012 501374891 5.91sec
nodes 6.28685e+016
time 1668.55 sec
SD = (7.111e12 / 6.287e16) * 100 = 0.0113% much smaller than 0.017% for the same full width expansion. It also beats the depth=5 results.
And log file here
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
Are you sure the mistake is not still there?
Your error in the first case is
62854969236701747-6.286812e+016=-1.315076e+13
which is twice your sd.
The second example is similar. The probability of this happening twice is extremely small.
Anyway, what method is this? A fullDepth 4 ply expansion followed by
uniform random walks in Peter's sense?
Your error in the first case is
62854969236701747-6.286812e+016=-1.315076e+13
which is twice your sd.
The second example is similar. The probability of this happening twice is extremely small.
Anyway, what method is this? A fullDepth 4 ply expansion followed by
uniform random walks in Peter's sense?
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
Like I said I am not sure of anything. Some body else needs to confirm it. About the difference between the sd and the error, HG had also a higher error than the sd in the last calculation you did. That is about 1.46. And mine is 1.85x larger... Maybe just luck maybe a mistake I don't know. Yes it is perft(4) = 197281 full width then MC. Why don't you do the experiment yourself since you probably have that now? For the second test I turned off bulk counting the last ply and relied on MC, since I noticed HG don't have that. That is the only difference.
Edit: I don't think I made a counting mistake because
197281 leaf positions with a depth of 8 MC = 197281 * 8 = 1578248
After one visit , the number of internal nodes = 197281 + 8902 + 400 + 40 = 206623
So total = 1578248 + 206623 = 1784871. If you take the difference of two consecutive iterations in the second log-file that is about what you get.
And in total there are 281 iterations = 281 * 1784871 = 501548751 nodes
Edit: I don't think I made a counting mistake because
197281 leaf positions with a depth of 8 MC = 197281 * 8 = 1578248
After one visit , the number of internal nodes = 197281 + 8902 + 400 + 40 = 206623
So total = 1578248 + 206623 = 1784871. If you take the difference of two consecutive iterations in the second log-file that is about what you get.
Code: Select all
1 1M 6.29E+16 +- 0.00E+00 1784217 5.91sec 1784217
2 3M 6.29E+16 +- 1.05E+11 3568383 5.92sec 1784166
3 5M 6.29E+16 +- 8.14E+12 5352655 5.92sec 1784272
4 7M 6.29E+16 +- 2.78E+13 7136898 5.92sec 1784243
5 8M 6.29E+16 +- 2.88E+13 8921182 6.05sec 1784284
6 10M 6.29E+16 +- 2.55E+13 10705408 5.92sec 1784226
7 12M 6.28E+16 +- 2.54E+13 12489611 5.92sec 1784203
8 14M 6.28E+16 +- 2.22E+13 14273907 5.92sec 1784296
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Well, d=4 full-width cannot possibly beat d=5, whatever the MC method is used to guess the sub-tree sizes. There must be some extra variance due to sampling the 5th ply rather than calculating it exactly. (Unless all sub-trees hanging from the same d=4 node have exactly the same size.)Daniel Shawul wrote:Here is the new result. The SD is slightly higherI don't know if there are still mistakes in my counting so this result should be confirmed by someone else and taken with a grain of salt till then...Code: Select all
499M 6.286812e+016 +- 7.128241e+012 499590698 5.92sec 501M 6.286848e+016 +- 7.111823e+012 501374891 5.91sec nodes 6.28685e+016 time 1668.55 sec
SD = (7.111e12 / 6.287e16) * 100 = 0.0113% much smaller than 0.017% for the same full width expansion. It also beats the depth=5 results.
And log file here
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
I haven't computed from depth=5 yet, so the comparison was against your depth=5. I am sure the counting is correct but may be there are problems with calculating variance with small number of samples. When I did tests of limited depth UCT , I just back propagated the variances and got 6.96e12 for uniform sampling from depth=3. I am trying to figure out what is going on...
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
I think the incremental mean and variance calculation is also correct. I did the computation again to get the perft() estimates for each of the 281 samples which were missing previously. Then I computed the mean and SD by excel which matches exactly to previous incrementally computed mean and SD. You can compute that your selves
I think that one sample that we consider here has about 197281 leaf nodes. Do we consider it as one sample or as many samples ? I remember a discussion here where the 8902 leaf nodes of perft(3) are to be considered separately ?? Either that or we got back-propagation of variance in UCT wrong because of CLT assumption which may not be acceptable. Ideas ?
Log file here
Excel calculation here
I think that one sample that we consider here has about 197281 leaf nodes. Do we consider it as one sample or as many samples ? I remember a discussion here where the 8902 leaf nodes of perft(3) are to be considered separately ?? Either that or we got back-propagation of variance in UCT wrong because of CLT assumption which may not be acceptable. Ideas ?
Log file here
Excel calculation here
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
Well I am getting an even lower variance than Daniel which is weird.
The result is below. The "cost" is defined as sd*sqrt(nodes).
The result is below. The "cost" is defined as sd*sqrt(nodes).
6.283206e+16 <= Perft(12) <= 6.286568e+16 (99% conf.) sd=6.619414e+12 nodes=507882288 cost=1.491767e+17 time=1382.81s
-
- Posts: 689
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Summary of Monte Carlo results for perft(12) so far
OK, here is an attempt to summarize all monte-carlo based results for perft(12) so far:
References:
Peter1: http://talkchess.com/forum/viewtopic.ph ... 43&t=39678
Peter2: http://talkchess.com/forum/viewtopic.ph ... 37&t=39678
Daniel: http://talkchess.com/forum/viewtopic.ph ... 77&t=39678
HGM: http://talkchess.com/forum/viewtopic.ph ... 03&t=39678
Michel: http://talkchess.com/forum/viewtopic.ph ... 01&t=39678
Explanation of the columns:
fw is the number of full-width plys. Not applicable for all methods. For the Peter2 method, the maximum size of the in-memory tree is reported instead.
movGen is the number of calls to the routine that generates all valid moves in a position. I think this number is well defined for all methods, but I realize that some implementations may not count this.
cycles is the number of generated independent estimates of the final result. Most methods calculate the final result as the mean value of all generated independent estimates, and also compute the standard deviation estimates from the independent estimates. I don't think this is applicable to Daniels method though, as I imagine the final estimate (both mean and variance) is the sum of the leaf node estimates. So even though "cycles" is technically equal to 1, you can still get a very good estimate of the mean and variance.
m is the estimated perft value. st is the estimated standard deviation. dev is (m-perft(12))/st, that is how many standard deviations away from the true result the estimated value is.
st*sqrt(movGen) is one measure of algorithm efficiency (lower values are better of course), which should be a good estimate if the major cost is generating legal move lists.
t is the run time in seconds.
st^2*t is a measure of algorithm+implementation+CPU efficiency (lower values are better). The obvious caveats apply if you try to compare results from different persons.
Comments
Please fill in the blanks. I have tried to locate the latest results from all contributors, but this thread is getting really hard to follow now, so I have probably missed some results.
Note that I have run some of my tests more than once with the same parameters, in order to get some feeling of how repeatable they are. The Peter2 method has very repeatable results, except that the run time varied a bit more than I would have expected.
For the Peter1 method with fw=5, I let the test run for 2e9 moveGens, but also reported the result after the first 5e8 moveGens. It looks like 15 cycles is too little to estimate the standard deviation, and the 5e8 result was probably a bit lucky.
The best method, of the ones where results are available, if there is a limit of 5e8 moveGens is Peter2. However, it is unknown to me what Daniel's method would give for this case. Also, if the HGM method were to be used, one would obviously use fw=6, which has an unknown standard deviation.
The best method, of the ones where results are available, if there is a limit of 5e8 moveGens, but CPU time is the metric, is Peter1 with fw=5.
The HGM and Peter1 methods has the advantage of requiring almost no memory, so you will typically pick fw as large as possible, after you have decided how much time you want to spend on the computation. In contrast, the Peter2 method scales badly, because you will typically run out of memory before you run out of CPU time.
I am currently running Peter1 with fw=6. I will report on that later when I have enough data. Note that if the limit is 5e8 moveGens, fw=6 will not produce any result for Peter1, but will just barely make it for HGM.
@Daniel, I don't understand how you use the second sequence of random numbers to get unbiased results, so I can't compute st*sqrt(movGen) for your data.
@Michel, regarding doing several random walks each time a leaf node is reached: The Peter2 method stores moves in the tree nodes, which means that I don't have to generate any move lists to get from the root position to a leaf node.
@Daniel, I hope my hands are dirty enough now.
Code: Select all
Full width + uniform random walk (Peter1)
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 3 5.0e8 6210 6.285323e16 8.8445e12 -0.20 1.9777e17 746 5.8356e28
12 3 5.0e8 6210 6.284147e16 8.5806e12 -1.57 1.9187e17 828 6.0963e28
12 4 5.0e8 316 6.285847e16 6.8600e12 0.51 1.5339e17 768 3.6142e28
12 5 5.1e8 15 6.285046e16 4.9581e12 -0.91 1.1239e17 886 2.1778e28
12 5 2.0e9 59 6.285395e16 2.8230e12 -0.36 1.2691e17 3496 2.7862e28 # Continuation of previous run
Tree growth with weighted random walk + uniform random walk (Peter2)
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 4M 5.0e8 69715 6.285204e16 4.6363e12 -0.63 1.0367e17 1357 2.9169e28
12 4M 5.0e8 69711 6.285458e16 4.6198e12 -0.08 1.0330e17 1442 3.0775e28
Daniels UCT based method
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 - - - 6.284728e16 3.8377e12 -2.00 - - -
HGMs full width + uniform move acceptance probability 1/32
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 3 5e8 14022 6.286672e16 1.4147e13 0.83 3.1635e17 - -
12 4 5e8 588 6.284380e16 1.0986e13 -1.02 2.4565e17 - -
12 5 5e8 24 6.286130e16 7.4099e12 0.85 1.6569e17 - -
12 6 5e8 1 6.284515e16 - - - - -
Michel (Daniel's method?)
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 - - - 6.284887e16 6.6194e12 -0.92 - 1383 6.0598e28
Peter1: http://talkchess.com/forum/viewtopic.ph ... 43&t=39678
Peter2: http://talkchess.com/forum/viewtopic.ph ... 37&t=39678
Daniel: http://talkchess.com/forum/viewtopic.ph ... 77&t=39678
HGM: http://talkchess.com/forum/viewtopic.ph ... 03&t=39678
Michel: http://talkchess.com/forum/viewtopic.ph ... 01&t=39678
Explanation of the columns:
fw is the number of full-width plys. Not applicable for all methods. For the Peter2 method, the maximum size of the in-memory tree is reported instead.
movGen is the number of calls to the routine that generates all valid moves in a position. I think this number is well defined for all methods, but I realize that some implementations may not count this.
cycles is the number of generated independent estimates of the final result. Most methods calculate the final result as the mean value of all generated independent estimates, and also compute the standard deviation estimates from the independent estimates. I don't think this is applicable to Daniels method though, as I imagine the final estimate (both mean and variance) is the sum of the leaf node estimates. So even though "cycles" is technically equal to 1, you can still get a very good estimate of the mean and variance.
m is the estimated perft value. st is the estimated standard deviation. dev is (m-perft(12))/st, that is how many standard deviations away from the true result the estimated value is.
st*sqrt(movGen) is one measure of algorithm efficiency (lower values are better of course), which should be a good estimate if the major cost is generating legal move lists.
t is the run time in seconds.
st^2*t is a measure of algorithm+implementation+CPU efficiency (lower values are better). The obvious caveats apply if you try to compare results from different persons.
Comments
Please fill in the blanks. I have tried to locate the latest results from all contributors, but this thread is getting really hard to follow now, so I have probably missed some results.
Note that I have run some of my tests more than once with the same parameters, in order to get some feeling of how repeatable they are. The Peter2 method has very repeatable results, except that the run time varied a bit more than I would have expected.
For the Peter1 method with fw=5, I let the test run for 2e9 moveGens, but also reported the result after the first 5e8 moveGens. It looks like 15 cycles is too little to estimate the standard deviation, and the 5e8 result was probably a bit lucky.
The best method, of the ones where results are available, if there is a limit of 5e8 moveGens is Peter2. However, it is unknown to me what Daniel's method would give for this case. Also, if the HGM method were to be used, one would obviously use fw=6, which has an unknown standard deviation.
The best method, of the ones where results are available, if there is a limit of 5e8 moveGens, but CPU time is the metric, is Peter1 with fw=5.
The HGM and Peter1 methods has the advantage of requiring almost no memory, so you will typically pick fw as large as possible, after you have decided how much time you want to spend on the computation. In contrast, the Peter2 method scales badly, because you will typically run out of memory before you run out of CPU time.
I am currently running Peter1 with fw=6. I will report on that later when I have enough data. Note that if the limit is 5e8 moveGens, fw=6 will not produce any result for Peter1, but will just barely make it for HGM.
@Daniel, I don't understand how you use the second sequence of random numbers to get unbiased results, so I can't compute st*sqrt(movGen) for your data.
@Michel, regarding doing several random walks each time a leaf node is reached: The Peter2 method stores moves in the tree nodes, which means that I don't have to generate any move lists to get from the root position to a leaf node.
@Daniel, I hope my hands are dirty enough now.
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Summary of Monte Carlo results for perft(12) so far
I filled in the blanks for the data point contributed by me. It
is just an instance of the Peter1 method. Daniel was
surprised that Peter1 performed so well so he asked for
confirmation.
I also completed HGM's entry.
is just an instance of the Peter1 method. Daniel was
surprised that Peter1 performed so well so he asked for
confirmation.
I also completed HGM's entry.
Code: Select all
Full width + uniform random walk (Peter1)
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 3 5.0e8 6210 6.285323e16 8.8445e12 -0.20 1.9777e17 746 5.8356e28
12 3 5.0e8 6210 6.284147e16 8.5806e12 -1.57 1.9187e17 828 6.0963e28
12 4 5.0e8 316 6.285847e16 6.8600e12 0.51 1.5339e17 768 3.6142e28
12 5 5.1e8 15 6.285046e16 4.9581e12 -0.91 1.1239e17 886 2.1778e28
12 5 2.0e9 59 6.285395e16 2.8230e12 -0.36 1.2691e17 3496 2.7862e28 # Continuation of previous run
Tree growth with weighted random walk + uniform random walk (Peter2)
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 4M 5.0e8 69715 6.285204e16 4.6363e12 -0.63 1.0367e17 1357 2.9169e28
12 4M 5.0e8 69711 6.285458e16 4.6198e12 -0.08 1.0330e17 1442 3.0775e28
Daniels UCT based method
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 - - - 6.284728e16 3.8377e12 -2.00 - - -
HGMs full width + uniform move acceptance probability 1/32
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 3 5e8 14022 6.286672e16 1.4147e13 0.83 3.1635e17 - -
12 4 5e8 588 6.284380e16 1.0986e13 -1.02 2.4565e17 - -
12 5 5e8 24 6.286130e16 7.4099e12 0.85 1.6569e17 - -
12 6 5e8 1 6.284515e16 6.7244e12 -1.46 1.5036e17 - -
Michel (Peter1)
perf fw movGen cycles m st dev st*sqrt(movGen) t (s) st^2*t (s)
12 4 5.08e8 ? 6.284887e16 6.6194e12 -0.92 1.4917e17 1383 6.0598e28