Perft(13) betting pool

Discussion of chess software programming and technical issues.


Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

A better estimate, at the 80th iteration:
Perft 6.156842e+016 +- 3.598160e+012 6.285619e+016 +- 1.237483e+012
I now see that Peter uses a depth limit of ply < 5 for tree expansion. That was one of the two solutions I proposed for the problem. I tested with ply < 3 and got unbiased estimates, so maybe I should recalculate with a depth limit of 5. But it is also not clear that this is a real solution, because when I used a formula that forces the SD for each move to be equal, the estimate became biased! So there might still be bias in there even if you use depth limits.
But I really want to understand what is going on, because I can lower the variance just by adjusting a couple of parameters, and there is no obvious explanation for it.
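For reference, here is the textbook identity behind the bias question (a sketch, not a description of either program). Suppose the exact child counts at a node are N_1, ..., N_n, a walk picks child i with probability p_i > 0 (the p_i summing to 1), and the walk returns that child's unbiased estimate divided by p_i. Then

E[ \widehat{N}_I / p_I ] = \sum_{i=1}^{n} p_i \cdot \frac{E[ \widehat{N}_i ]}{p_i} = \sum_{i=1}^{n} N_i = N,

where I is the index of the child actually picked. The weights can be anything positive, and they may even be updated between walks as long as the p_i applied to a walk depend only on earlier walks; the conditional expectation is then still exact. What can break the cancellation is letting the weights depend on the very walks they are applied to, which is one possible way a formula that forces every move to the same SD could end up biased.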

The 3.33 was for the version which uses the wasted nodes.
Edit: Well, it ended up much worse, so I am going back to a depth limit of 5. It is not clear that this solves the problem completely, even though it gives an acceptable result; I can still come up with a formula that makes it biased.
Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

To give you some perspective on what is happening, look at this result from a depth-limited version with depth = 4. The expansion is already finished within a couple of iterations, maybe even within the first 1M simulations. This is only the 19th iteration out of 100 and it already beats the best result I ever had; it is on its way to a variance of 1.1e12. That is with the perft from which the ratio of visits to each child is determined being computed 10x less often, which you would expect to increase the variance since I was consuming 2x the amount of nodes, but the opposite happened.
Either variance back-propagation has problems or this method significantly reduces variance. I am badly in need of confirmation.
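As a quick consistency check on the back-propagation (a sketch with hypothetical names, not code from either program): when a node's estimate is the sum of independently estimated children, its variance is just the sum of the child variances, so the reported SD at any internal node can be checked against its children.

Code:

// Sketch: combine independent child estimates into the parent estimate.
// Returns { parent mean, parent SD }.
final class VarianceCheck {
    record ChildStat(double mean, double sd) {}

    static double[] combine(ChildStat[] children) {
        double mean = 0.0, var = 0.0;
        for (ChildStat c : children) {
            mean += c.mean;
            var  += c.sd * c.sd;   // variances of independent estimates add
        }
        return new double[] { mean, Math.sqrt(var) };
    }
}

Applied to the 20 root rows in the table below, this reproduces the quoted "Perft ... +-" totals in both columns to within rounding, so if something is off it has to be in how the per-node SDs further down the tree are produced, not in the final combination step.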

I will redo the experiment starting from a fully expanded depth = 4 tree to see if the _little_ amount of splitting done here is the problem, but I doubt it.

Oh, by the way, with a 100M simulation limit, using ply < 5 is like not using a limit at all; I am getting almost the same result.

Code:

------------ 19 --------------
0 h2h3   1.789764e+015 +- 1.545989e+012 1.799388e+015 +- 6.372870e+011    562047      8457
0 h2h4   2.597051e+015 +- 1.838673e+012 2.627847e+015 +- 7.750630e+011    795004      9329
0 g2g3   2.474903e+015 +- 1.788144e+012 2.499154e+015 +- 7.428794e+011    751911      9345
0 g2g4   2.195935e+015 +- 1.655045e+012 2.209957e+015 +- 6.872392e+011    644141      9328
0 f2f3   1.531590e+015 +- 1.383416e+012 1.544761e+015 +- 5.679566e+011    450055      8457
0 f2f4   2.027310e+015 +- 1.643477e+012 2.079814e+015 +- 6.873335e+011    635168      8929
0 e2e3   7.137330e+015 +- 3.074284e+012 7.131212e+015 +- 1.238727e+012   2222535     13134
0 e2e4   7.243093e+015 +- 3.081155e+012 7.269241e+015 +- 1.238179e+012   2232483     13160
0 d2d3   4.566226e+015 +- 2.292394e+012 4.586694e+015 +- 9.377974e+011   1235774     11959
0 d2d4   6.307939e+015 +- 2.841795e+012 6.347772e+015 +- 1.149333e+012   1899092     12435
0 c2c3   2.734648e+015 +- 1.902546e+012 2.737344e+015 +- 7.814243e+011    851199      9272
0 c2c4   3.099784e+015 +- 1.997836e+012 3.114230e+015 +- 8.166066e+011    938599      9744
0 b2b3   2.381318e+015 +- 1.765490e+012 2.431944e+015 +- 7.361962e+011    732980      9345
0 b2b4   2.388839e+015 +- 1.767280e+012 2.395950e+015 +- 7.280229e+011    734465      9332
0 a2a3   1.801649e+015 +- 1.556979e+012 1.826304e+015 +- 6.556620e+011    570068      8457
0 a2a4   2.552336e+015 +- 1.824559e+012 2.564241e+015 +- 7.459759e+011    782845      9329
0 g1h3   2.109354e+015 +- 1.653995e+012 2.134619e+015 +- 6.822970e+011    643323      8881
0 g1f3   2.681181e+015 +- 1.811460e+012 2.702902e+015 +- 7.531710e+011    771646      9748
0 b1c3   2.703905e+015 +- 1.936624e+012 2.726999e+015 +- 7.965778e+011    881964      9755
0 b1a3   2.078523e+015 +- 1.681251e+012 2.108154e+015 +- 6.982210e+011    664701      8885
Perft 6.240268e+016 +- 8.988691e+012  6.283853e+016 +- 3.685740e+012
wasted 0.00
6.240545e+016 +- 8.966540e+012  6.283846e+016 +- 3.676064e+012
6.240805e+016 +- 8.944016e+012  6.283841e+016 +- 3.666372e+012
6.241133e+016 +- 8.922104e+012  6.283838e+016 +- 3.656735e+012
6.241562e+016 +- 8.900580e+012  6.283823e+016 +- 3.647164e+012
6.241830e+016 +- 8.878802e+012  6.283805e+016 +- 3.637569e+012
6.242108e+016 +- 8.856980e+012  6.283855e+016 +- 3.628065e+012
Michel

Re: Summary of Monte Carlo results for perft(12) so far

Post by Michel »

Either variance back-propagation has problems or this method significantly reduces variance.
Is there a moveGen count somewhere in your tables? Without this it is impossible to compare results.

As to the correctness of the standard deviation: I would suggest simply repeating the test a number of times and comparing with the empirical standard deviation (and checking the z value to make sure the estimator is unbiased).
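A minimal sketch of such a check (hypothetical names, not code from either program; it uses the known exact value perft(12) = 62,854,969,236,701,747):

Code:

// Sketch of the suggested check: compare the reported SD with the
// empirical spread over repeated runs, and compute z values against
// the exact perft(12) to test for bias.
final class EstimatorCheck {
    static final double TRUE_PERFT12 = 62_854_969_236_701_747.0;

    // z value of a single run against its reported standard deviation.
    static double zValue(double estimate, double reportedSd) {
        return (estimate - TRUE_PERFT12) / reportedSd;
    }

    // Repeat the run k times: the empirical SD should match the reported
    // one, and the z value of the mean should stay within roughly +-2..3.
    static void check(double[] estimates, double reportedSd) {
        int k = estimates.length;
        double mean = 0.0;
        for (double e : estimates) mean += e;
        mean /= k;
        double ss = 0.0;
        for (double e : estimates) ss += (e - mean) * (e - mean);
        double empiricalSd = Math.sqrt(ss / (k - 1));
        double zMean = (mean - TRUE_PERFT12) / (empiricalSd / Math.sqrt(k));
        System.out.printf("reported SD %.3e, empirical SD %.3e, z of mean %.2f%n",
                          reportedSd, empiricalSd, zMean);
    }
}

For the single result quoted in the EDIT below, zValue(6.283855e16, 3.628065e12) is about -4.5, which is the figure given there.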

EDIT

I did a random check with your last result:
6.283855e+016 +- 3.628065e+012
This has a z value of 4.5, so surely something is wrong with either the way you compute the standard deviation or the bias of your estimator.
Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

OK, I will add a counter for the number of move generator calls. If I had to guess, it would be about 100M * (12 - 4 + 1) = 900M move-gen calls, with the full-width part at fw = 4.
The visit proportions don't seem suspicious, do they?

The above result goes all the way through to the 80th iteration with the mean still at 6.284, so it is no wonder the variance is lower: the mean just did not change that much. But it also means it is biased. But why? It seems I have to update the weights frequently to get an unbiased estimate. Peter can also experiment with the same thing I think. It would be interesting if we observe the same phenomenon.
Michel

Re: Summary of Monte Carlo results for perft(12) so far

Post by Michel »

Peter can also experiment with the same thing I think.
But why? Peter's estimator is unbiased by _design_. We have been through this a hundred times.

If you cannot _prove_ your estimator is unbiased then it is _very likely biased_ and you should rethink the design.
Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

The problem is that you guys are stuck on competition and forget that what I am talking about applies to you too! Peter's design is UCT as far as I am concerned. He _splits_ and uses _dynamic_ adjustments, had observed the same problem before etc... And you still didn't answer my question about why it would be biased. If it is an implementation problem it will be fixed soon.
I would like to concentrate on working out the details now. Then you can shout that I am wrong until you run the test yourself and see that I am not :)
cheers
Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

And just when you start shooting at me, I may have found the bug. It was simply that I needed a duplicate visits count when I decided to update one of the values less frequently; its count should have been visits / 10 for the less frequently calculated perft. It is an obscure bug.
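One way to read that fix (purely an illustration with hypothetical fields, not the actual code): a value that is only recomputed on every 10th visit needs its own counter, otherwise its average is divided by the wrong number of samples.

Code:

// Hypothetical sketch: separate counters for values updated at different rates.
final class DualStats {
    long   visits;      // bumped on every walk
    long   slowVisits;  // bumped only when the rarely updated value is recomputed
    double fastSum;
    double slowSum;

    void updateFast(double x) { visits++;     fastSum += x; }
    void updateSlow(double y) { slowVisits++; slowSum += y; }

    double fastMean() { return visits     == 0 ? 0.0 : fastSum / visits; }
    double slowMean() { return slowVisits == 0 ? 0.0 : slowSum / slowVisits; }
}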

The bias test remains valid, though, because I didn't start using a second sequence of random numbers for nothing. You and Peter need to explain how you solve that problem. It doesn't make sense that I am forced to use it while you guys are not, when what you do is exactly the same. So I ask you here to show me how you solve the problem.
Michel

Re: Summary of Monte Carlo results for perft(12) so far

Post by Michel »

Well, I am not participating in any competition. I just have an academic interest in knowing which method performs best.
had observed the same problem before etc...
WITH A DIFFERENT METHOD
And you still didn't answer my question about why it would be biased.
The real question is: why would it be unbiased? An estimator is far more likely to be biased than to be unbiased. So if you cannot prove your estimator is unbiased then it is very likely biased.

Other people have explained their methods in detail, but you never have, so you cannot expect people to help you.
Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

Well then, explain the method and we will see how different it is from UCT.
What caused the previous problem and how it got solved will follow next...
Daniel Shawul

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

This is what I got from Peter about his method, with my reply to each of his steps. Now is the time for you to speak, Michel. He even said his algorithm is quite similar to UCT...
Well, what you describe below is exactly UCT with Michel's formula, as far as I am concerned. I had difficulty testing his formula because it lacked the "visits" count for a node. Your step (3) decides whether to split a node without comparing it to its sibling nodes; to achieve that you needed more heuristics, since you do not compare with the other siblings to pick the best, you just decide on its own whether a node is worth splitting.
The basic idea of UCT tree search over plain MC (without a tree) is to investigate (split) the nodes worth splitting, i.e. the bigger nodes for perft. When you do that you gain much of what UCT is meant for: the tree dynamically grows towards the bigger nodes. For tree search, for instance in an LMR tree, it grows towards the best-looking move. There are many methods to control how aggressively the tree is allowed to grow.
I will go through your steps one by one:
Quote:


My algorithm is quite similar to UCT but it is not exactly like UCT. Here is what I do:
1. Start with an in-memory tree with only one node representing the root position.
I start with the root node expanded (20 moves), but that makes no difference. For a UCT search you eventually have to return a move to play, so it is usual to start with a split root node.
Quote:
2. Do a random walk starting from the root, until a leaf node is reached. The probabilities controlling which child to visit are calculated from data in the child nodes. This probably causes some cache misses, but as a compensation I store moves in the tree nodes too, so I don't have to call the move generator until I reach a leaf node.
This is the Monte Carlo part. A light MC visits all children uniformly, while a heavy MC uses different heuristics that usually make it slower but could be better overall.
BTW, "heavy" and "light" are not my terminology; they are accepted terms in the Go programming field.
Quote:
3. When a leaf node is reached, I decide if it should be split. A node is split if there are fewer than 4M nodes in the tree, the node has been visited at least 20 times, the remaining depth is > 1, and the "cost" for this node is at least 90% of the largest cost for a leaf node seen during the last 1000 random walks. "cost" is defined as sqrt(mean^2 + variance).
This is the formula part that you do differently: you do not compare to the other siblings.
-> Your first criterion (fewer than 4M nodes) is there so you don't run out of memory. There are two solutions for this when it is used for tree search: (a) store the less visited parts of the tree on disk, or (b) prune parts of the tree. The latter is not applicable to perft because we need exact estimates.
-> Your second criterion (a minimum visit count) UCT has too. Mine is, I think, at 1000 right now; lowering it should expand the tree more.
-> Your third criterion is Michel's formula, a heuristic that gives an idea of the cost of the node: sqrt(mean^2 + variance). A sketch of the whole split test, read literally, follows below.
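Here is that sketch; the constants are a literal reading of step 3 and the names are hypothetical, so this is not Peter's actual implementation.

Code:

// Hypothetical sketch of the split test described in step 3.
final class SplitRule {
    static final long   MAX_TREE_NODES = 4_000_000;
    static final int    MIN_VISITS     = 20;
    static final double COST_FRACTION  = 0.90;

    // cost = sqrt(mean^2 + variance), compared against the largest leaf
    // cost seen during the last 1000 random walks.
    static boolean shouldSplit(long treeNodes, int visits, int remainingDepth,
                               double mean, double variance, double maxRecentCost) {
        double cost = Math.sqrt(mean * mean + variance);
        return treeNodes < MAX_TREE_NODES
            && visits >= MIN_VISITS
            && remainingDepth > 1
            && cost >= COST_FRACTION * maxRecentCost;
    }
}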
Quote:
4. If a leaf node is split, child nodes are created for all legal moves.
Same as UCT, and a major component, because here you decide to visit the bigger nodes; the tree will be guided towards them.
Quote:
5. If a leaf node is not split, I continue with a uniform probability random walk until the "perft" depth is reached. No hashing or tree node lookup is performed in this step. Just move generation and make/unmake moves.
Same. A light MC search. If you start using your previous static heuristic for captures or the HT method, it will become a heavy MC.
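For completeness, here is what such a light playout looks like for perft (a sketch; the Position interface is a hypothetical stand-in, not code from either program): walk one uniformly random line down to the remaining depth and multiply the move counts seen along the way; averaged over many walks, the product is an unbiased estimate of the leaf count below the starting position.

Code:

import java.util.List;
import java.util.Random;

// Hypothetical sketch of a light (uniform) playout for perft.
final class LightPlayout {
    interface Position {
        List<Integer> legalMoves();   // hypothetical board API
        Position doMove(int move);    // hypothetical: returns the child position
    }

    // One walk: product of branching factors along a uniformly random line.
    static double oneWalk(Position pos, int depth, Random rng) {
        double product = 1.0;
        for (int d = 0; d < depth; d++) {
            List<Integer> moves = pos.legalMoves();
            int n = moves.size();
            if (n == 0) return 0.0;   // mate/stalemate before the target depth
            product *= n;             // each child is reached with probability 1/n
            pos = pos.doMove(moves.get(rng.nextInt(n)));
        }
        return product;
    }
}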
Quote:
6. As the recursion unwinds, the mean and variance estimates are updated for all visited in-memory tree nodes.
Same. I do exactly that, and it is easy for perft, but this back-propagation step is more difficult for tree search; Rémi Coulom had a complex backup operator before UCT was invented.
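A minimal sketch of that update (again hypothetical names): each in-memory node keeps running sums from which the mean and variance of its estimate follow, and every node on the path gets the walk's estimate for its own subtree as the recursion unwinds.

Code:

// Hypothetical sketch of per-node statistics updated as the recursion unwinds.
final class NodeStats {
    long   visits;
    double sum;    // sum of the per-walk subtree estimates seen at this node
    double sumSq;  // sum of their squares

    void update(double walkEstimate) {
        visits++;
        sum   += walkEstimate;
        sumSq += walkEstimate * walkEstimate;
    }

    double mean() { return visits == 0 ? 0.0 : sum / visits; }

    // Squared standard error of the mean estimate at this node.
    // (For numbers this large a Welford-style update is numerically safer.)
    double varianceOfMean() {
        if (visits < 2) return Double.POSITIVE_INFINITY;
        double m = mean();
        double sampleVar = (sumSq - visits * m * m) / (visits - 1);
        return Math.max(sampleVar, 0.0) / visits;
    }
}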
Quote:
7. A single perft estimate computed from the just completed random walk is logged. If 1000 such values have been accumulated, print the average to standard output.

8. Go back to step 2.

Quote:

I optimized my implementation a bit, and now it takes around 57 minutes on a 2.4 GHz Core2 CPU (using only one core) to generate 100e6 tree walks for perft(12). The code currently spends over 60% of the CPU time in a function that removes illegal moves from a pseudo-legal move list. I think this function can be optimized quite a lot though. Currently it just plays all moves and tests which moves leave the king in check.

The source code is here: http://web.comhem.se/petero2home/PerfTE ... 4.java.txt. If you actually want to run it, you also need BitBoard.java from CuckooChess.

I think comparison by number of simulations should be fine. I was under the impression that it was the HT version you tested, in which case time tests would be more appropriate.