Perft(13) betting pool
Moderators: hgm, Rebel, chrisw
-
- Posts: 27814
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
I still believe it will be very hard to beat the "fixed move-acceptance probability approach" (if that can be beaten at all).
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
hgm wrote:
I still believe it will be very hard to beat the "fixed move-acceptance probability approach" (if that can be beaten at all).

I think it can be beaten asymptotically. It is not clear if it can be beaten for realistic perft values, given the trivial implementation.
A while ago I computed the propagation of (nodes visited)*variance for both your
method and Peter's method with uniform sampling for a uniform tree and the
propagation in Peter's case was better.
This was based on a pure implementation of your method though. If I understood correctly, you keep track in your computation of the number of accepted moves at each ply. I don't know how this affects the result.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
Well, the variance reduction methods in the full-width part have clearly been improved compared to uniform depth expansion. They are _complementary_ to the variance reduction methods in the selective (Monte Carlo) part of the tree, so everyone can benefit from them. You may be right with your claim for the selective part, as it is not undoubtedly proven that pure MC (Peter's approach) is better than what you do.
What I am planning to do in the MC part is slightly different. I am going to run two or more perfts for a leaf node, using a variance reduction idea known as "antithetic variates" (I am sure I screwed up the spelling there). The idea is that for a monotone function, using negatively correlated sequences of random numbers can reduce variance by up to 70% or so, I read. For an ordered tree, like in LMR, this may work if we have good move ordering based on tree size.
I don't know much about the method, so I'd welcome any input on its workability.
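The antithetic-variates trick can be sketched in a few generic lines (an illustration of the idea, not a perft implementation; the test function and sample counts are made up for the demo):

```python
import random

def plain_estimate(f, n, seed=0):
    # Ordinary Monte Carlo: average f over n independent uniforms.
    rng = random.Random(seed)
    return sum(f(rng.random()) for _ in range(n)) / n

def antithetic_estimate(f, n_pairs, seed=0):
    # Antithetic variates: evaluate f at both u and 1-u.  For a
    # monotone f the two values are negatively correlated, so each
    # pair average has lower variance than two independent draws.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        u = rng.random()
        total += 0.5 * (f(u) + f(1.0 - u))
    return total / n_pairs
```

For a monotone f such as f(u) = u^2, both estimators converge to the same mean, but the antithetic one does so with noticeably lower variance at the same number of function evaluations.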
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
"What is wrong with calculating the weights based on the back propagated mean and variance, instead of back propagating the weights directly (unless I misunderstood you)?"

Nothing. Back propagating the weights seemed more elegant. But apparently it is hard to do for a directed graph which is not a tree.
-
- Posts: 27814
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Michel wrote:
This was based on a pure implementation of your method though. If I understood correctly you keep track in your computation of the number of accepted moves at each ply. I don't know how this affects the result.

Well, that makes a large difference, because it effectively changes the process into deciding which of the moves you pick, rather than whether you pick a move. So the variance of a single probe becomes the intrinsic variance of the sub-tree sizes. In the unnormalized case it would be the variance of the sum of N draws with 1/N acceptance probability, which is N times the variance of a single draw with distribution
P(0) = 1 - 1/N
P(i) = 1/N * P(x=i)
The latter is 1/N SUM_i { P(x=i)*i^2 } - (1/N SUM_i { P(x=i)*i } )^2
= 1/N ( SUM_i { P(x=i)*i^2 } - (SUM_i { P(x=i)*i })^2 + (1 - 1/N) * (SUM_i { P(x=i)*i })^2 )
= 1/N var(x) + (1/N - 1/N^2) E^2(x).
So for the sum of N draws we get var(x) + (1 - 1/N)*E^2(x), which for large N means the variance gets dominated by the E^2(x) term. So if the frontier nodes have 20-30 moves, i.e. SD = 5 and E(x) = 25, the unnormalized result has about a 5 times larger SD than the normalized result, for the ply leading to them.
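This one-ply variance gap is easy to check by simulation. A sketch with toy numbers (11 frontier subtrees of sizes 20..30, so E(x) = 25; the helper name and inputs are made up for the demo):

```python
import random
import statistics

def one_ply_sds(subtree_sizes, trials, seed=0):
    """Compare the SDs of the two one-ply estimators discussed above
    on a node whose moves lead to subtrees of the given sizes."""
    rng = random.Random(seed)
    n = len(subtree_sizes)
    unnormalized, normalized = [], []
    for _ in range(trials):
        # Unnormalized: accept each move independently with
        # probability 1/n, multiply the accepted total by n.
        s = sum(x for x in subtree_sizes if rng.random() < 1.0 / n)
        unnormalized.append(n * s)
        # Normalized: pick exactly one move, multiply by n.
        normalized.append(n * rng.choice(subtree_sizes))
    return statistics.stdev(unnormalized), statistics.stdev(normalized)
```

Both estimators are unbiased, but for these inputs (SD(x) ≈ 3.2, E(x) = 25) the normalized SD comes out near n*SD(x) ≈ 35, while the unnormalized one lands near n*sqrt(var(x) + (1 - 1/n)*E^2(x)) ≈ 264, matching the formula above; the "5 times" figure in the text corresponds to SD = 5.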
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
Yes, I had noticed that for the pure method the result you get is badly influenced by E(x).
So what do you do precisely?
Instead of multiplying the end count by 32 for each selective ply, you multiply by (total moves/accepted moves) for every selective ply?
-
- Posts: 27814
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Defect of a node?
Exactly.
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
I think it is easier to resolve this issue by running tests. Can you run two tests, one with your default acceptance probability of 1/32, and another with Peter's modification of selecting exactly one random move and multiplying by the number of legal moves? The tests should use a fixed time control, chosen to be equivalent to the time it takes you to run 100 million simulations. One test starting from depth=0 and another one from depth=5. Is that alright?
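Peter's modification is simple enough to sketch on a toy tree (the Node/build_tree helpers are hypothetical stand-ins for a real chess position; the 1/32 scheme would replace the single random choice with independent acceptance tests):

```python
import random

class Node:
    """Toy game-tree node (stand-in for a real position)."""
    def __init__(self, children):
        self.children = children

def build_tree(depth, rng):
    # Random tree with branching factor 2..6 at every internal node.
    if depth == 0:
        return Node([])
    return Node([build_tree(depth - 1, rng) for _ in range(rng.randint(2, 6))])

def exact_perft(node, depth):
    # Reference full-width count.
    if depth == 0:
        return 1
    return sum(exact_perft(c, depth - 1) for c in node.children)

def perft_probe(node, depth, full_plies, rng):
    """One probe: full width for the first full_plies plies, then, per
    Peter's modification, pick exactly one random move and multiply by
    the number of legal moves at every selective ply."""
    if depth == 0:
        return 1.0
    if full_plies > 0:
        return sum(perft_probe(c, depth - 1, full_plies - 1, rng)
                   for c in node.children)
    child = rng.choice(node.children)
    return len(node.children) * perft_probe(child, depth - 1, 0, rng)
```

Averaging many probes converges to the exact count, since each selective ply's multiplier exactly compensates the 1/(number of moves) selection probability.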
-
- Posts: 2272
- Joined: Mon Sep 29, 2008 1:50 am
Re: Defect of a node?
Ok I see. Yes this makes a big difference.
It seems a little bit tricky to analyze this recursively when there are several selective plies in a row (because of the non local character of e.g. "accepted moves").
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Defect of a node?
Michel wrote:
Nothing. Back propagating the weights seemed more elegant. But apparently hard to do for a directed graph which is not a tree.

Yes, it is very difficult since there is a square root term in there. I am relying on the central limit theorem when I back propagate variances. Maybe there is something that can be improved there, so I am re-reading Remi's CG 2006 paper.