Perft(13) betting pool

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27814
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Defect of a node?

Post by hgm »

I still believe it will be very hard to beat the "fixed move-acceptance probability approach", (if that can be beaten at all).
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Defect of a node?

Post by Michel »

I still believe it will be very hard to beat the "fixed move-acceptance probability approach", (if that can be beaten at all).
I think it can be beaten asymptotically. It is not clear if it can be beaten
for realistic perrft values, given the trivial implementation.

A while ago I computed the propagation of (nodes visited)*variance for both your
method and Peter's method with uniform sampling for a uniform tree and the
propagation in Peter's case was better.

This was based on a pure implementation of your method though. If I understood correctly you keep track in your computation of the number of accepted moves at each ply
I don't know how this affects the result.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Defect of a node?

Post by Daniel Shawul »

Well the variance reduction methods in the full width part has clearly been improved compared to uniform depth expansion. It is _complimentary_ with the variance reduction methods in the selection (Monte carlo) part of the tree so everyone can benefit from it. You may be right with your claim for the selective part as it is not undoubtedly proven a pure MC (Peter's approach) is better than what you do.

What I am planning to do in the MC part is slightly different. I am going to run two or more perfts for a leaf node. A variance reduction idea I got known as "using anthitheic variables" (am sure screwed up the spelling there). The idea is for a monotone function, using negatively correlated sequence of random numbers can reduce variance upto 70% or so I read. For an ordered tree like in an LMR , this may work if we have good move ordering based on tree size.
I don't know much about the method so if any one has an idea on the workability of the method.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Defect of a node?

Post by Michel »

What is wrong with calculating the weights based back propagated mean and variance instead of back propagating the weights directly (unless I misunderstood you) ?
Nothing. Back propagating the weights seemed more elegant. But apparently hard to do for a directed graph which is not a tree.
User avatar
hgm
Posts: 27814
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Defect of a node?

Post by hgm »

Michel wrote:This was based on a pure implementation of your method though. If I understood correctly you keep track in your computation of the number of accepted moves at each ply. I don't know how this affects the result.
Well, that makes a large difference, because it effectively changes the process into deciding which of the moves you pick, rather than whether you pick a move. So the variance of the single probe will become the intrinsic variance of the sub-tree sizes. While in the unnormalized case it would be the variance of the sum of N draws with 1/N acceptance probability, which is N times the variance of a single draw with distribution

P(0) = 1 - 1/N
P(i) = 1/N * P(x=i)

The latter is 1/N SUM_i { P(x=i)*i^2 } - (1/N SUM_i { P(x=i)*i } )^2
= 1/N ( SUM_i { P(x=i)*i^2 } - (SUM_i { P(x=i)*i } )^2} + (1 - 1/N) * (SUM_i { P(x=i)*i } )^2 )
= 1/N var(x) + (1/N - 1/N^2) E^2(x).

So we get var(x) + (1 - 1/N)*E^2(x), which for large N means the variance get dominated by E(x). So if the frontier nodes have 20-30 moves, i.e. SD = 5, E(x) = 25, the unnormalized result has a 5 times larger SD than the normalized result, for the ply leading to them.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Defect of a node?

Post by Michel »

Yes I had noticed that for the pure method the result you get is badly influenced by E(X).

So what do you do precisely?

Instead of multiplying the end count by 32 for each selective ply you multiply by (total moves/accepted moves) for every selective ply?
User avatar
hgm
Posts: 27814
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Defect of a node?

Post by hgm »

Exactly.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Defect of a node?

Post by Daniel Shawul »

I think it is easier to resolve this issue by running tests. Can you run a test one with your default acceptance probability of 1/32 ,and another with Peter's modification of selecting exactly one random move and multiply by the number of legal moves? The test should use a fixed time control selected in such a way that it is equivalent to the time it takes you to run 100 mil simulations. One test starting from depth=0 and another one from depth=5. Is that alright ?
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Defect of a node?

Post by Michel »

Ok I see. Yes this makes a big difference.

It seems a little bit tricky to analyze this recursively when there are several selective plies in a row (because of the non local character of e.g. "accepted moves").
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Defect of a node?

Post by Daniel Shawul »

Michel wrote:
What is wrong with calculating the weights based back propagated mean and variance instead of back propagating the weights directly (unless I misunderstood you) ?
Nothing. Back propagating the weights seemed more elegant. But apparently hard to do for a directed graph which is not a tree.
Yes it is very difficult since it has a square root term in there. I am relying on the central limit theorem when I back propagate variances. May be there is some thing that can be improved there so I am re-reading cg2006 paper of Remi.