Well I already stated my confusion earlier than you did but got no reply... I tried to compare the values directly with a 500 mil number of simulation result and found out it lies somewhere in the 4-ply and 5-ply result. HG did not correct me until your post today.. Yes we all understand what is going on now obviously.Michel wrote:Well not really. The meaning of "nodes" is quite standard in computer chess. I just wanted to be absolutely sure. That's why I asked.Michel also got confused..
I should have read back the relevant post where it is unambiguously stated that nodes refer to move generations. I had overlooked it.
Perft(13) betting pool
Moderators: hgm, Rebel, chrisw
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
-
- Posts: 27790
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
I have got the final verdict on the accuracy of the 6-ply full width:
I calculated the mean and std deviation of terminal sub-trees of different depth of the perft(12) tree. For this I used the same sampling method as I use in the perft estimation, i.e. attenuate by a factor 32 each ply after the 6th. But I only take about 1000 of the thus selected sub-trees for complete expansion. (Which should be more than enough for an accurate std deviation estimate.)
I then divide this std deviation of sub-tree size by the number of sub-trees I sample in the perft(12) estimation (which is the nr of nodes at that level, divided by the applicable power of 32). This gives the std deviation introduced by the sampling of that level.
Finally I quadratically add the standard deviations of all levels together, which leads to a total of 0.00107%.
Note that there is also an odd-even effect in the variances: the difference between a 2- and 3-ply tree-size variance is much larger than between 1- and 2-ply. This is of course because positions with high mobility (= perft(1)) tend to have daughters with high mobility as well, but not necessarily above-average opponent mobility.
Due to this depth-correlation of mobilities, it would make sense to select moves from nodes with high mobility with a larger probability. By how much would be dependent on the remaining depth. This could be implemented, but the accounting would become more complex: in stead of single accepted/rejected counters per level, you would have to keep such a counter for every mobility (ignoring checks). And you would have to weight the branches relative to each other, as Peter does, by their inverse acceptance probability).
Another remark is that the attenuation 1/32 seems pretty optimal: it more or less flattens out the increase in variance towards lower plies, because it puts more samples there. Perhaps a little bit could be gained by reducing the number of samples at the 10-th ply, and devoting them to the 7th or 9th. (Although samples at the 10th ply are cheaper than at the 7th, because the remaining path is shorter.)
Code: Select all
perft(5) tot = 1499 mean = 19851211.416278 std = 59.598497% / sqrt(99.9M) = 0.00596% (cumulative 0.0107%)
perft(4) tot = 1282 mean = 736316.900936 std = 47.666721% / sqrt(83.0M) = 0.00524% (cumulative 0.00900%)
perft(3) tot = 1150 mean = 25486.833043 std = 41.495897% / sqrt(74.4M) = 0.00581% (cumulative 0.00732%)
perft(2) tot = 1025 mean = 906.915122 std = 28.374558% / sqrt(66.1M) = 0.00349% (cumulative 0.00456%)
perft(1) tot = 975 mean = 29.978462 std = 21.894053% / sqrt(62.5M) = 0.00277%
I then divide this std deviation of sub-tree size by the number of sub-trees I sample in the perft(12) estimation (which is the nr of nodes at that level, divided by the applicable power of 32). This gives the std deviation introduced by the sampling of that level.
Finally I quadratically add the standard deviations of all levels together, which leads to a total of 0.00107%.
Note that there is also an odd-even effect in the variances: the difference between a 2- and 3-ply tree-size variance is much larger than between 1- and 2-ply. This is of course because positions with high mobility (= perft(1)) tend to have daughters with high mobility as well, but not necessarily above-average opponent mobility.
Due to this depth-correlation of mobilities, it would make sense to select moves from nodes with high mobility with a larger probability. By how much would be dependent on the remaining depth. This could be implemented, but the accounting would become more complex: in stead of single accepted/rejected counters per level, you would have to keep such a counter for every mobility (ignoring checks). And you would have to weight the branches relative to each other, as Peter does, by their inverse acceptance probability).
Another remark is that the attenuation 1/32 seems pretty optimal: it more or less flattens out the increase in variance towards lower plies, because it puts more samples there. Perhaps a little bit could be gained by reducing the number of samples at the 10-th ply, and devoting them to the 7th or 9th. (Although samples at the 10th ply are cheaper than at the 7th, because the remaining path is shorter.)
Last edited by hgm on Fri Aug 12, 2011 5:10 pm, edited 1 time in total.
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
which iswhich leads to a total of 0.00107%.
6.726159e+11
which seems quite spectacular.
However the actual error in your result is
6.284515e+16-62854969236701747=-9.819237e+12
which is 15 times bigger.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
I believe it is 0.0107% not 0.00107% as shown in the table.
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
Yes that makes sense. In that case UCT is still a bit aheadI believe it is 0.0107% not 0.00107% as shown in the table.
4e12 * sqrt(12 * 100e6)=1.4e17
against
6.726159e+12 * sqrt(510055595 )=1.52e+17
but not by much.
As Sven was pointing out, perhaps the important factor 12 can be reduced by organizing the computations more cleverly.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
Besides that, we are comparing apples and oranges. The UCT part has been compared with uniform selection and shown to give a significant improvement. Now if HG's selective method catches up with UCT it only means Peter's MC is hurting it since I use that. Also my test with UCT never managed to expand all leaf nodes at depth=6 so the comparison should be done at depth=5 ...
Why don't we do a direct comparison of HG's selective method with Peter's MC for a depth=0 and depth=5 expansion. We can then clearly see which one is better. We can do comparison based on nodes count only inside the selective (MC) part i.e excluding all nodes in the full width section since both do it the same way there...
Why don't we do a direct comparison of HG's selective method with Peter's MC for a depth=0 and depth=5 expansion. We can then clearly see which one is better. We can do comparison based on nodes count only inside the selective (MC) part i.e excluding all nodes in the full width section since both do it the same way there...
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Defect of a node?
I don't think so, the 0.0107% is from the table HG posted and refers to perft(5) subtree results but the 0.00107% is from this part of his text:Michel wrote:Yes that makes sense.I believe it is 0.0107% not 0.00107% as shown in the table.
where the matching of the digits 107 seems to be coincidental.Finally I quadratically add the standard deviations of all levels together, which leads to a total of 0.00107%.
My stronger proposal was to simply add a node counter instead of estimating its value with a "hocus pocus" formulaMichel wrote:As Sven was pointing out, perhaps the important factor 12 can be reduced by organizing the computations more cleverly.
Sven
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
The table lists the cumulative totals. I think the extra zero is simply a typo.Finally I quadratically add the standard deviations of all levels together, which leads to a total of 0.00107%.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Defect of a node?
O.k., might be true.Michel wrote:The table lists the cumulative totals. I think the extra zero is simply a typo.Finally I quadratically add the standard deviations of all levels together, which leads to a total of 0.00107%.
Sven
-
- Posts: 27790
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Indeed, that was a typo in the text.Daniel Shawul wrote:I believe it is 0.0107% not 0.00107% as shown in the table.
This is my proposal for an improved method, which now becomes a cross-over between my old method and Peter's:
Code: Select all
double accepted[MAXPLY], total[MAXPLY];
double perft(int depth, double w)
{
int res=0;
double weight;
n = MoveGen(); // returns number of legal moves (ignoring existing checks)
weight = f(n,depth);
for(ALL_MOVES) {
if(depth == 1) res += w; else
if(AcceptWithProbability(1./weight)) {
accepted[depth += weight;
MakeMove();
res += perft(depth-1, w*weight);
UnMake();
}
total[depth]++;
}
return res;
}
The finesse is in the function f(n, depth), which determines the weights. In my original method f(n, depth) would always be 32. To make it more like Peter's original method it would be n, so that on average one move per node will be accepted. (Of course it still differs, because he uses exactly one move per node.) But this is a correction in the wrong direction. What I would do in stead is take f(n, d) = 32*n^((d-1)/2), where (d-1)/2 (integer division) is the number of ply the current stm has to move in future nodes within the depth.