Perft(13) betting pool

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

petero2
Posts: 684
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Perft(13) betting pool

Post by petero2 »

Daniel Shawul wrote:Can you also print out the means? For example, after 1 million games, starting Monte Carlo from the root, I get around 1.9806e18 with a uniform PRNG. Can you do an unbiased estimate weighting _all_ captures by 80%, 90%, 95%, 100%, 105%, 110%, so that we can see how the mean is affected?
OK, I got this:

Code: Select all

cwt   sMean        sStd
 80   1.9811e+18   3.3313e+15
 90   1.9810e+18   3.3576e+15
 95   1.9865e+18   3.3364e+15
100   1.9846e+18   3.3611e+15
105   1.9777e+18   3.3465e+15
110   1.9811e+18   3.3725e+15
Here "cwt" is the weight for captures, sMean is the sampled mean value after 1e6 random walks from the root position, and sStd is the sampled standard deviation for those walks.

This limited test doesn't show any significant correlation between cwt and sMean, given the size of sStd. A more ambitious experiment could run the above test a large number of times, then create one histogram of the sMean for each value of cwt.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

Sven Schüle wrote:
Daniel Shawul wrote:Sven,
Uri is saying that he did not mean to count shorter games when he said to count legal games. I see nowhere that he said shorter games should not be included, but now he is claiming that is what he had in mind.
Did you increment the legal games count when shorter games are encountered, or even the total games count? Given the slightly higher estimate, i.e. 1.997e18, I thought that could be the case.
My implementation counts shorter games as "illegal" games (i.e. "does not count them as legal"), and therefore also increments the total games counter for these. This is how I understood the initial description given by Uri, and this is also what I think is most natural. But it may be wrong, or not optimal, I don't know yet.
That is not just you; I understood it that way too. This is the second time Uri has written one thing and then claimed he meant something else after I explained it to him. Any game gets counted either as legal or illegal. How were we supposed to figure out that shorter games should never be considered? Only he knows.

Later I figured out that either those games are counted separately at each ply and multiplied with their own upper bounds, OR they should not be counted as games at all. The latter means you simply assume shorter games never happened, so both the legal and total counts remain the same. If you increment even one of them, the estimate becomes _biased_. So now he is saying he meant the latter.
I think there is some confusion about the term "counting" in this context. The proposal you made a couple of posts above seems to imply that shorter games should affect neither the count of legal games nor the total count of games. Could you please explain again why you expect this to be better than not doing so? You asked me to implement it, but I have not done so yet, and I would like to understand it first.
I don't know whether it will change the result much, but you could try the second method above, which assumes the game never occurred when it ends prematurely. If a game ends at ply 4, for example, it gets multiplied by an upper bound of U(1) * U(2) * ... * U(13), when in reality it should have been U(1)*U(2)*U(3). This can inflate the estimate greatly, as I showed in my previous example.
How does Peter's method deal with shorter games?

Should we even consider excluding game-terminating moves from the list of generated moves (which might become expensive) so that our random games are always full-length games?
Yes, just ignore any game that terminates before ply 13 is reached for perft(13). I don't even know how many of those we have anyway, but they could make the estimate biased.
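To put the inflation in rough numbers, here is a toy sketch (the per-ply upper bounds U(i) below are made up for illustration; the real values for perft(13) are different):

```java
public class ShortGameBias {
    // Hypothetical per-ply upper bounds U(i) on the number of legal moves.
    static final double[] U = {20, 20, 30, 30, 35, 35, 38, 38, 40, 40, 42, 42, 44};

    // Factor by which a game ending after 3 plies inflates its contribution
    // when it is multiplied with the full 13-ply product of bounds instead
    // of the truncated product U(1)*U(2)*U(3).
    static double inflation() {
        double full = 1;
        for (double u : U)
            full *= u;
        double truncated = U[0] * U[1] * U[2];
        return full / truncated;
    }

    public static void main(String[] args) {
        System.out.println("inflation factor: " + inflation());
    }
}
```

Even with these modest made-up bounds, the contribution of one premature game is inflated by roughly 14 orders of magnitude, which is why counting such games at all biases the estimate.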
One more note regarding the estimate of 1.997...e+18 I published a while ago after quickly implementing the method of Uri in my engine. There are at least two factors influencing the accuracy of that number:

1) the low number of sample games I was able to play so far,
2) the fact that the perft(1) upper bound values U(i) are only approximated values u(i) <= U(i) for all i > 7 (or i > 8, I don't remember).

That might explain why this estimate is different from some other estimates given here. The value is simply not accurate enough yet. My CPU time available for such tasks is limited, so please be patient ...

Sven
Those may be possible reasons. I am not sure whether the change I proposed will fix it, but I am sure the method was biased as we understood it.
Take your time, please. As you can see, I am busy getting bashed :)

cheers
Last edited by Daniel Shawul on Wed Jul 27, 2011 12:54 am, edited 1 time in total.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

petero2 wrote:
Daniel Shawul wrote:Can you also print out the means? For example, after 1 million games, starting Monte Carlo from the root, I get around 1.9806e18 with a uniform PRNG. Can you do an unbiased estimate weighting _all_ captures by 80%, 90%, 95%, 100%, 105%, 110%, so that we can see how the mean is affected?
OK, I got this:

Code: Select all

cwt   sMean        sStd
 80   1.9811e+18   3.3313e+15
 90   1.9810e+18   3.3576e+15
 95   1.9865e+18   3.3364e+15
100   1.9846e+18   3.3611e+15
105   1.9777e+18   3.3465e+15
110   1.9811e+18   3.3725e+15
Here "cwt" is the weight for captures, sMean is the sampled mean value after 1e6 random walks from the root position, and sStd is the sampled standard deviation for those walks.

This limited test doesn't show any significant correlation between cwt and sMean, given the size of sStd. A more ambitious experiment could run the above test a large number of times, then create one histogram of the sMean for each value of cwt.
Thanks.
Yes, you can't say anything about whether the mean or the variance has improved or degraded. More games would be needed to bring the variance down so that we can compare two weights. Either way, weight selection is not a trivial matter. I will try to implement it in my perft and help you out.

cheers
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Perft(13) betting pool

Post by Michel »

Hi Peter,

Thanks for putting the theory to the test.

But my impression was that one should reduce the weight of captures much more.

I will explain why I think this is so.

The optimal weight p_i for move i is proportional to sqrt(x_i^2+s_i^2),
where x_i is the size of the resulting subtree and s_i^2 is the variance
of the estimator measuring that subtree.

This is a mathematical fact. No assumptions are necessary for
deriving this formula.
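
For completeness, here is one way to derive that formula (a sketch in Michel's notation: the estimator Y picks child I with probability p_I and returns that child's subtree estimate divided by p_I):

```latex
% Y = \hat{x}_I / p_I, with E[\hat{x}_i] = x_i and Var(\hat{x}_i) = s_i^2.
E[Y] = \sum_i p_i \frac{x_i}{p_i} = \sum_i x_i,
\qquad
\operatorname{Var}(Y) = \sum_i \frac{x_i^2 + s_i^2}{p_i} - \Big(\sum_i x_i\Big)^2 .
% Minimize over p_i subject to \sum_i p_i = 1 (Lagrange multiplier \lambda):
\frac{\partial}{\partial p_i}
  \Big[\sum_j \frac{x_j^2 + s_j^2}{p_j} + \lambda \sum_j p_j\Big]
  = -\frac{x_i^2 + s_i^2}{p_i^2} + \lambda = 0
\quad\Longrightarrow\quad
p_i \propto \sqrt{x_i^2 + s_i^2}.
```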

Unfortunately we don't know x_i and s_i.
Now I sort of _assume_ that the size of s_i will be of the same order of
magnitude as x_i. This assumption might be wrong. I cannot really
justify it.

This means that p_i should simply be proportional to x_i
(the subtree).

Now if you perform a capture then you permanently reduce the EBF
and the tree size is EBF^r where r is the remaining plies. So the
weight should be 1//(EBF^r). Of course we don't know EBF either
but the important thing is that the weight appears to be a negative exponential.

A check is different. It reduces the EBF only for one ply. So in that
case the weight should be reduced by a constant.

Now a valid objection (which did not come up in this thread) is the following. Captures and checks are a minority among the moves, so it
is not clear that treating them in an optimal way will make a lot
of difference. This is in contrast to the LMR examples we have
been discussing, where doing non-uniform MC is absolutely beneficial.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Perft(13) betting pool

Post by Michel »

So the weight should be 1//(EBF^r).
Sigh. Of course this is utterly wrong. I forgot about the proportionality factor and, for good measure, threw in an inverse as well. The 15-minute limit for editing is really frustrating.

So the weight for a capture should be _proportional_ to EBF^r (the size of the subtree). This means that if u is the ratio between the EBFs for a quiet move and for the capture, then the weight of the capture should be reduced by a factor u^r.
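
As a numeric illustration of the corrected rule (the EBF values 30 and 28 below are made-up assumptions, not measured figures):

```java
public class CaptureWeight {
    // Hypothetical effective branching factors before and after a capture.
    static final double EBF_QUIET = 30.0;
    static final double EBF_CAPTURE = 28.0;

    // Weight of a capture relative to a quiet move with r plies remaining:
    // proportional to the ratio of the two subtree sizes, i.e. a reduction
    // by the factor u^r with u = EBF_QUIET / EBF_CAPTURE.
    static double captureWeight(int r) {
        return Math.pow(EBF_CAPTURE / EBF_QUIET, r);
    }

    public static void main(String[] args) {
        for (int r = 0; r <= 12; r += 4)
            System.out.printf("r=%2d  relative capture weight=%.4f%n",
                              r, captureWeight(r));
    }
}
```

So even a small EBF reduction compounds: near the root (large r) captures should be picked noticeably less often than quiet moves, while near the leaves the weights approach equality.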
petero2
Posts: 684
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Perft(13) betting pool

Post by petero2 »

Michel wrote:Hi Peter,

Thanks for putting the theory to the test.

But my impression was that one should reduce the weight of captures much more.

I will explain why I think this is so.

The optimal weight p_i for move i is proportional to sqrt(x_i^2+s_i^2),
where x_i is the size of the resulting subtree and s_i^2 is the variance
of the estimator measuring that subtree.

This is a mathematical fact. No assumptions are necessary for
deriving this formula.

Unfortunately we don't know x_i and s_i.
Now I sort of _assume_ that the size of s_i will be of the same order of
magnitude as x_i. This assumption might be wrong. I cannot really
justify it.

This means that p_i should simply be proportional to x_i
(the subtree).

Now if you perform a capture then you permanently reduce the EBF
and the tree size is EBF^r where r is the remaining plies. So the
weight should be 1//(EBF^r). Of course we don't know EBF either
but the important thing is that the weight appears to be a negative exponential.

A check is different. It reduces the EBF only for one ply. So in that
case the weight should be reduced by a constant.
I agree with the above reasoning. I actually tested one weight function that looked like this:

Code: Select all

    private final double getWeight(Move m, int depth) {
        double w = MoveGen.givesCheck(pos, m) ? 0.20 : 1;
        int cap = pos.squares[m.to];
        switch (cap) {
        case Piece.WPAWN: case Piece.BPAWN:
            return w * Math.pow(1.00, depth);
        case Piece.WKNIGHT: case Piece.BKNIGHT:
            return w * Math.pow(0.95, depth);
        case Piece.WBISHOP: case Piece.BBISHOP:
            return w * Math.pow(0.90, depth);
        case Piece.WROOK: case Piece.BROOK:
            return w * Math.pow(0.85, depth);
        case Piece.WQUEEN: case Piece.BQUEEN:
            return w * Math.pow(0.80, depth);
        default:
            return w;
        }
    }
I estimated the standard deviation at 2.22e16 for this weight function, so I gave up on that approach. I probably should have tried to tune the weights before giving up, though.
Michel wrote:Now a valid objection (which did not come up in this thread) is the following. Captures and checks are a minority among the moves, so it is not clear that treating them in an optimal way will make a lot of difference. This is in contrast to the LMR examples we have been discussing, where doing non-uniform MC is absolutely beneficial.
That may very well be the case. Another idea could be to base the weight on the square the piece moves to, perhaps as a function of the mobility the piece would have on that square on an empty chess board.
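The empty-board mobility idea could be sketched like this (a hypothetical helper, not code from any of the engines discussed here):

```java
public class EmptyBoardMobility {
    static final int[][] ROOK_DIRS   = {{1,0},{-1,0},{0,1},{0,-1}};
    static final int[][] BISHOP_DIRS = {{1,1},{1,-1},{-1,1},{-1,-1}};

    // Number of squares a sliding piece attacks from (file, rank),
    // 0-indexed, on an otherwise empty 8x8 board.
    static int mobility(int file, int rank, int[][] dirs) {
        int count = 0;
        for (int[] d : dirs)
            for (int f = file + d[0], r = rank + d[1];
                 f >= 0 && f < 8 && r >= 0 && r < 8; f += d[0], r += d[1])
                count++;
        return count;
    }

    public static void main(String[] args) {
        System.out.println("rook, any square: " + mobility(0, 0, ROOK_DIRS));   // 14
        System.out.println("bishop on a1:     " + mobility(0, 0, BISHOP_DIRS)); // 7
        System.out.println("bishop on d4:     " + mobility(3, 3, BISHOP_DIRS)); // 13
    }
}
```

A move to d4 would then get a larger weight than a move to a1, on the theory that central squares tend to lead to larger subtrees.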
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Perft(13) betting pool

Post by hgm »

Daniel Shawul wrote:
petero2 wrote:
Daniel Shawul wrote:Can you also print out the means? For example, after 1 million games, starting Monte Carlo from the root, I get around 1.9806e18 with a uniform PRNG. Can you do an unbiased estimate weighting _all_ captures by 80%, 90%, 95%, 100%, 105%, 110%, so that we can see how the mean is affected?
OK, I got this:

Code: Select all

cwt   sMean        sStd
 80   1.9811e+18   3.3313e+15
 90   1.9810e+18   3.3576e+15
 95   1.9865e+18   3.3364e+15
100   1.9846e+18   3.3611e+15
105   1.9777e+18   3.3465e+15
110   1.9811e+18   3.3725e+15
Here "cwt" is the weight for captures, sMean is the sampled mean value after 1e6 random walks from the root position, and sStd is the sampled standard deviation for those walks.

This limited test doesn't show any significant correlation between cwt and sMean, given the size of sStd. A more ambitious experiment could run the above test a large number of times, then create one histogram of the sMean for each value of cwt.
Thanks.
Yes, you can't say anything about whether the mean or the variance has improved or degraded. More games would be needed to bring the variance down so that we can compare two weights. Either way, weight selection is not a trivial matter. I will try to implement it in my perft and help you out.
Well, you kind of inflict this on yourself by doing this test for perft(13), which is as yet unknown. You should have done it for perft(12), so we could compare against an exact value.

Furthermore, I am no longer sure the term 'unbiased estimate' means the same thing to both of you. So I have to ask: does 'unbiased estimate' here mean:
1) that all w_i*p_i are 1?
2) that all w_i are constant, and only the p_i vary?
3) that all p_i are constant (equal to 1/N), and only the w_i vary?
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Perft(13) betting pool

Post by Michel »

Furthermore, I am no longer sure the term 'unbiased estimate' means the same thing to both of you. So I have to ask: does 'unbiased estimate' here mean:
1) that all w_i*p_i are 1?
2) that all w_i are constant, and only the p_i vary?
3) that all p_i are constant (equal to 1/N), and only the w_i vary?
Peter's program assumes w_i*p_i = 1, if I read correctly.
petero2
Posts: 684
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Perft(13) betting pool

Post by petero2 »

hgm wrote:
Daniel Shawul wrote:
petero2 wrote:
Daniel Shawul wrote:Can you also print out the means? For example, after 1 million games, starting Monte Carlo from the root, I get around 1.9806e18 with a uniform PRNG. Can you do an unbiased estimate weighting _all_ captures by 80%, 90%, 95%, 100%, 105%, 110%, so that we can see how the mean is affected?
OK, I got this:

Code: Select all

cwt   sMean        sStd
 80   1.9811e+18   3.3313e+15
 90   1.9810e+18   3.3576e+15
 95   1.9865e+18   3.3364e+15
100   1.9846e+18   3.3611e+15
105   1.9777e+18   3.3465e+15
110   1.9811e+18   3.3725e+15
Here "cwt" is the weight for captures, sMean is the sampled mean value after 1e6 random walks from the root position, and sStd is the sampled standard deviation for those walks.

This limited test doesn't show any significant correlation between cwt and sMean, given the size of sStd. A more ambitious experiment could run the above test a large number of times, then create one histogram of the sMean for each value of cwt.
Thanks.
Yes, you can't say anything about whether the mean or the variance has improved or degraded. More games would be needed to bring the variance down so that we can compare two weights. Either way, weight selection is not a trivial matter. I will try to implement it in my perft and help you out.
Well, you kind of inflict this on yourself by doing this test for perft(13), which is as yet unknown. You should have done it for perft(12), so we could compare against an exact value.

Furthermore, I am no longer sure the term 'unbiased estimate' means the same thing to both of you. So I have to ask: does 'unbiased estimate' here mean:
1) that all w_i*p_i are 1?
2) that all w_i are constant, and only the p_i vary?
3) that all p_i are constant (equal to 1/N), and only the w_i vary?
I use the common statistical definition that "unbiased estimate" means that the expected value of the estimator is equal to the parameter being estimated, as for example defined here: http://en.wikipedia.org/wiki/Bias_of_an_estimator

My program uses option 1), which has already been proven to give an unbiased estimator according to the above definition.

I realize I introduced possibly confusing terminology when I called my function getWeight, though. This function returns a number that is proportional to the selection probability, but the actual probability is only computed after getWeight has been called for all possible moves in a position, so that the proper normalization constant can be calculated. Maybe this quantity should be called "unscaled probability" instead?
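
To make the w_i*p_i = 1 bookkeeping concrete, here is a stripped-down sketch (the move generator is replaced by a plain array of unscaled weights; none of this is the actual engine code):

```java
import java.util.Random;

public class WeightedChoice {
    // Sum of the unscaled weights (the normalization constant S).
    static double total(double[] unscaled) {
        double s = 0;
        for (double w : unscaled)
            s += w;
        return s;
    }

    // Pick index i with probability p_i = unscaled[i] / S.
    static int pick(double[] unscaled, Random rnd) {
        double x = rnd.nextDouble() * total(unscaled);
        for (int i = 0; i < unscaled.length; i++) {
            x -= unscaled[i];
            if (x <= 0)
                return i;
        }
        return unscaled.length - 1;
    }

    // Multiplier w_i = 1 / p_i applied to the running estimate,
    // so that w_i * p_i = 1.
    static double multiplier(double[] unscaled, int i) {
        return total(unscaled) / unscaled[i];
    }

    public static void main(String[] args) {
        // Unbiasedness at a single node whose children are all leaves:
        // the sum over children of p_i * w_i equals the exact child count.
        double[] w = {1.0, 0.2, 0.95, 0.8};
        double expectation = 0;
        for (int i = 0; i < w.length; i++)
            expectation += (w[i] / total(w)) * multiplier(w, i);
        System.out.println("expected estimate: " + expectation); // = 4 up to rounding
    }
}
```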
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Perft(13) betting pool

Post by Michel »

I estimated the standard deviation to 2.22e16 for this weight function, so I gave up on that approach.
However, this is still better than uniform MC, isn't it? Is this for 10^6 games?

Anyway, if one really wants to tune weights, shouldn't one have error bars for the variance as well (to avoid drawing unsound conclusions)? Does anyone know how to give a sensible estimate for these?

I can see a complicated statistical computation for doing this but that must be overkill.
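
For what it's worth, there is a standard large-sample answer that avoids heavy computation: Var(s^2) is approximately (m4 - m2^2)/n, where m2 and m4 are the second and fourth central moments of the per-walk values. A sketch (assuming the per-walk estimates are available as an array):

```java
public class VarianceErrorBar {
    // Large-sample standard error of the sample variance s^2,
    // using Var(s^2) ~= (m4 - m2^2) / n.
    static double seOfVariance(double[] x) {
        int n = x.length;
        double mean = 0;
        for (double v : x)
            mean += v;
        mean /= n;
        double m2 = 0, m4 = 0;
        for (double v : x) {
            double d = v - mean;
            m2 += d * d;
            m4 += d * d * d * d;
        }
        m2 /= n;
        m4 /= n;
        return Math.sqrt((m4 - m2 * m2) / n);
    }

    public static void main(String[] args) {
        double[] sample = {1, 2, 3, 4, 5}; // stand-in for the per-walk estimates
        System.out.println("s^2 error bar: " + seOfVariance(sample));
    }
}
```

An error bar on sStd itself then follows by the delta method, se(s) ≈ se(s^2)/(2s), which should be enough to decide whether two weight settings really differ.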