Perft(13) betting pool

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

petero2
Posts: 684
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Summary of Monte Carlo results for perft(12) so far

Post by petero2 »

hgm wrote:The problem I have with adaptive methods is that there is no information available to adapt on, unless you allow on average more than one move per node to be accepted. And this would explode a deep perft tree very rapidly, using almost all nodes in the final plies, drawing away the nodes from the early plies, which seems to defeat the purpose.
This is an interesting observation. Near the root of the tree, there seems to be no point in adapting, because a full-width search seems to be just as good as adaptation. Near the leaves (ply 12) of the tree, it seems hard to adapt because there is too little information to adapt on.

However, I think there is a way around that. One can use something similar to the history heuristic. This will allow information gathered from one part of the search tree to be applied to a different part of the tree, even if the corresponding positions are not the same. Using exponentially decaying memory, an idea already used in CuckooChess, I have an implementation that beats "Peter1(12,3)" using the moveGen metric, but not using the CPU time metric. I will try to tune it more, but the basic method is like this (a rough sketch in code follows the list):

1. Do fw plies of full-width search.
2. For each leaf node, do a random walk.
3. For the first h plies of the random walk, use non-uniform weights computed from history information.
4. For the remaining part of the random walk, use uniform weights.
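To make steps 2-4 concrete, here is a rough sketch of one random walk. It is only an illustration of the idea, not the actual implementation; the move generator interface (Position, Move, gen_moves, make_move, unmake_move) and the (from,to)-indexed history table are placeholders.

Code: Select all

#include <stdlib.h>

#define HIST_DECAY 0.99      /* exponential decay of the history memory */
static double hist[64][64];  /* history weight per (from,to) square pair */

/* One random walk below a leaf of the full-width part of the tree.
 * depth  = remaining plies to the perft horizon
 * hplies = number of plies that still use history-based weights
 * Position, Move, gen_moves, make_move, unmake_move are placeholders. */
double random_walk(Position *pos, int depth, int hplies)
{
    if (depth == 0)
        return 1.0;

    Move moves[256];
    int n = gen_moves(pos, moves);
    if (n == 0)
        return 0.0;

    /* Selection weights: proportional to history for the first hplies
     * plies of the walk, uniform afterwards. */
    double w[256], sum = 0.0;
    for (int i = 0; i < n; i++) {
        w[i] = (hplies > 0) ? 1.0 + hist[moves[i].from][moves[i].to] : 1.0;
        sum += w[i];
    }

    /* Draw one move with probability p = w[pick] / sum. */
    double r = sum * ((double)rand() / ((double)RAND_MAX + 1.0));
    int pick = 0;
    while (pick < n - 1 && (r -= w[pick]) >= 0.0)
        pick++;
    double p = w[pick] / sum;

    make_move(pos, moves[pick]);
    /* Dividing by p keeps the estimator unbiased for any positive weights. */
    double est = random_walk(pos, depth - 1, hplies - 1) / p;
    unmake_move(pos, moves[pick]);

    /* Exponentially decaying history update (crude: a real implementation
     * would normalize the estimate by depth before mixing it in). */
    hist[moves[pick].from][moves[pick].to] =
        HIST_DECAY * hist[moves[pick].from][moves[pick].to]
        + (1.0 - HIST_DECAY) * est;

    return est;
}
The division by p is what keeps the estimator unbiased no matter how the weights are chosen, as long as every legal move gets a non-zero selection probability.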
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Summary of Monte Carlo results for perft(12) so far

Post by Michel »

petero2 wrote:One can use something similar to the history heuristic.
I have thought of that. It would mean attaching weights to the moves rather than to the positions. This is eerily similar to the Vegas method in Monte Carlo integration, where you approximate a distribution g(x_1,...,x_n) by a product of distributions g_1(x_1)...g_n(x_n). Storing the latter requires much less memory.

http://en.wikipedia.org/wiki/VEGAS_algorithm
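In the perft setting the analogy would be roughly this (g_k being a one-dimensional weight over the moves at ply k, or over move features such as from/to squares, and ignoring walks that hit a mate or stalemate before ply d):

Code: Select all

perft(d) = E_g[ 1 / g(m_1,...,m_d) ]   with   g(m_1,...,m_d) ≈ g_1(m_1) * ... * g_d(m_d)
so instead of storing one weight per position you only store d small one-dimensional tables.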
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: Summary of Monte Carlo results for perft(12) so far

Post by Michel »

I just move on with MC simulation.
I am saying you are not allowed to use the data in that node (the data which participated in the splitting decision) for perft computations. If you do, your estimator will be biased.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

Michel wrote:
I just move on with MC simulation.
I am saying you are not allowed to use the data in that node (the data which participated in the splitting decision) for perft computations. If you do, your estimator will be biased.
Well, that is what we observed. Apparently a second sequence of random numbers fixes it, unless I am not allowed to use that too... Bear in mind you do reuse that data for computing your multipliers, which is not to be overlooked.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

A few more records

Post by sje »

The perft(13) run has been going for just over 12 days and has produced 210 draft ten records. Here are the most recent ten of these:

Code: Select all

rnbqkbnr/1ppppppp/8/p7/1P6/2P5/P2PPPPP/RNBQKBNR b KQkq - 0 2 10 154068837107809
rnbqkbnr/1ppppppp/8/p7/2P5/7N/PP1PPPPP/RNBQKB1R b KQkq - 0 2 10 126452584569476
rnbqkbnr/1ppppppp/8/p7/8/1P1P4/P1P1PPPP/RNBQKBNR b KQkq - 0 2 10 197935747435700
rnbqkbnr/1ppppppp/8/p7/8/P2P4/1PP1PPPP/RNBQKBNR b KQkq - 0 2 10 163777451315240
rnbqkbnr/1ppppppp/8/p7/2P5/5N2/PP1PPPPP/RNBQKB1R b KQkq - 0 2 10 161971799868266
rnbqkbnr/1ppppppp/8/p7/P7/3P4/1PP1PPPP/RNBQKBNR b KQkq - 0 2 10 166336655168294
rnbqkbnr/1ppppppp/8/p7/8/N3P3/PPPP1PPP/R1BQKBNR b KQkq - 0 2 10 290223899535804
rnbqkbnr/1ppppppp/8/p7/1PP5/8/P2PPPPP/RNBQKBNR b KQkq - 0 2 10 185603264146369
rnbqkbnr/1ppppppp/8/p7/8/3P3N/PPP1PPPP/RNBQKB1R b KQkq - 0 2 10 181649457773953
rnbqkbnr/1ppppppp/8/p7/8/2N1P3/PPPP1PPP/R1BQKBNR b KQkq - 0 2 10 406608149434105
How well do your estimation algorithms perform on the above?

In about six more days I'll have the next batch of draft 11 records.

The entire run should finish in January 2012.
Michel
Posts: 2272
Joined: Mon Sep 29, 2008 1:50 am

Re: A few more records

Post by Michel »

[d]rnbqkbnr/1ppppppp/8/p7/1P6/2P5/P2PPPPP/RNBQKBNR b KQkq - 0 2 10[/d]

GNU Chess 5.07.172b http://emis.uhasselt.be/cgi-bin/gitweb. ... ;a=summary Peter1 method with fw=4, 501e6 nodes.

sdn=sd*sqrt(nodes) is the normalized standard deviation
mean=1.540774e+14 sd=8.298100e+09 [ 1.540564e+14 <= Perft(10) <= 1.540985e+14 (99% conf.) ] nodes=501262105 sdn=1.857852e+14 time=1931.75s
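(For reference: a two-sided 99% normal confidence interval is mean ± 2.58*sd; here 2.58 * 8.3e9 ≈ 2.1e10, consistent with the bounds shown above.)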
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: A few more records

Post by Sven »

Michel wrote:GNU Chess 5.07.172b http://emis.uhasselt.be/cgi-bin/gitweb. ... ;a=summary Peter1 method with fw=4, 501e6 nodes.

sdn=sd*sqrt(nodes) is the normalized standard deviation
mean=1.540774e+14 sd=8.298100e+09 [ 1.540564e+14 <= Perft(10) <= 1.540985e+14 (99% conf.) ] nodes=501262105 sdn=1.857852e+14 time=1931.75s
Hi Michel,

In the latest snapshot of GNU Chess I see this code in cmd.c (function cmd_perftmc) for the incremental update of mean and variance:

Code: Select all

n++;
delta=perft-mean;
mean=mean+delta/n;
M2+=delta*(perft-mean);
variance=M2/((n-1)*n);
While I fully agree with the mean calculation, I am not sure about the variance; could you please explain how that works? One point is that I don't know how you get around the division by 0 in the "variance=..." line for the first sample, i.e. when "n++" sets n to 1. But I am also asking because I get different (significantly higher) values for the same "Peter1" method in my program. I thought that the variance of a set X of N samples was defined by a formula like this (note the division by N-1, not N):

Code: Select all

variance(X, N) := sum(i=1..N) ((X(i) - mean(X, N))^2) / (N-1)
and therefore I use the following code for incremental update of mean and variance (translated to GNU Chess variable names):

Code: Select all

mean = (n * mean + perft) / (n + 1);
variance = (n == 0) ? 0.0 : ((n - 1) * variance + pow(perft - mean, 2)) / n;
++n;
I have been unable so far to detect an equivalence between your implementation and mine. So may I ask you what you think is right, and what is wrong?

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: A few more records

Post by Sven »

Sven Schüle wrote:
Michel wrote:GNU Chess 5.07.172b http://emis.uhasselt.be/cgi-bin/gitweb. ... ;a=summary Peter1 method with fw=4, 501e6 nodes.

sdn=sd*sqrt(nodes) is the normalized standard deviation
mean=1.540774e+14 sd=8.298100e+09 [ 1.540564e+14 <= Perft(10) <= 1.540985e+14 (99% conf.) ] nodes=501262105 sdn=1.857852e+14 time=1931.75s
Hi Michel,

In the latest snapshot of GNU Chess I see this code in cmd.c (function cmd_perftmc) for the incremental update of mean and variance:

Code: Select all

n++;
delta=perft-mean;
mean=mean+delta/n;
M2+=delta*(perft-mean);
variance=M2/((n-1)*n);
While I fully agree with the mean calculation, I am not sure about the variance; could you please explain how that works? One point is that I don't know how you get around the division by 0 in the "variance=..." line for the first sample, i.e. when "n++" sets n to 1. But I am also asking because I get different (significantly higher) values for the same "Peter1" method in my program. I thought that the variance of a set X of N samples was defined by a formula like this (note the division by N-1, not N):

Code: Select all

variance(X, N) := sum(i=1..N) ((X(i) - mean(X, N))^2) / (N-1)
and therefore I use the following code for incremental update of mean and variance (translated to GNU Chess variable names):

Code: Select all

mean = (n * mean + perft) / (n + 1);
variance = (n == 0) ? 0.0 : ((n - 1) * variance + pow(perft - mean, 2)) / n;
++n;
I have been unable so far to detect an equivalence between your implementation and mine. So may I ask you what you think is right, and what is wrong?

Sven
Hmm, it seems it is close to impossible to detect that equivalence, since my implementation is wrong :oops: I ignored the fact that variance(n) refers to mean(n), while variance(n+1) must refer entirely to mean(n+1). Therefore it does not work the way I did it.

I also think I see why your code does not crash with "division by 0": these are floating-point operations, so 0.0/0.0 does not trap but simply yields a NaN, and since variance is recomputed from M2 on every iteration, that first value is overwritten as soon as the second sample arrives.

I am still struggling to understand why your formula works ...

Sven
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: A few more records

Post by sje »

Michel wrote:[d]rnbqkbnr/1ppppppp/8/p7/1P6/2P5/P2PPPPP/RNBQKBNR b KQkq - 0 2 10[/d]

GNU Chess 5.07.172b http://emis.uhasselt.be/cgi-bin/gitweb. ... ;a=summary Peter1 method with fw=4, 501e6 nodes.

sdn=sd*sqrt(nodes) is the normalized standard deviation
mean=1.540774e+14 sd=8.298100e+09 [ 1.540564e+14 <= Perft(10) <= 1.540985e+14 (99% conf.) ] nodes=501262105 sdn=1.857852e+14 time=1931.75s
The average time for calculating a draft 10 record in the current run is about 5,013 seconds. It would be interesting to see the quality of estimations that can be computed in, say, one percent (roughly 50 seconds) of the actual full calculation time.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Summary of Monte Carlo results for perft(12) so far

Post by Daniel Shawul »

I think the reason for the sudden jump in performance of full-width + MC is that up to now, for depth <= 3, it was the same as _uniform_ allocation. But for depth = 5 and more it becomes proportioning by sub-tree node count up to the leaf nodes. It was sub-tree count proportioning all along, but we didn't notice because the sub-trees are uniform for depth <= 3. With that in mind it can be beaten, for instance if I do full-width expansion to depth = 5. Depth = 6 would require too much memory though, which is a disadvantage for any breadth-first searcher. Having said that, it can perform badly for a selective tree where the shape of the upper portion of the tree is not related to the sub-tree sizes.

Another idea, applicable to the perft(13) competition, would be to take the exact perft(12) results for 2 plies from the root and use those to determine the proportions for the Monte Carlo simulations. The depth-first search does a static allocation, so a static allocation based on the real sub-tree sizes may perform better.
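A rough sketch of that static allocation (only an illustration, not actual code; estimate_subtree() stands for any unbiased Monte Carlo perft estimator of one sub-tree):

Code: Select all

/* Placeholder prototype: any unbiased Monte Carlo perft estimate of
 * sub-tree i using the given number of random walks. */
double estimate_subtree(int i, long walks);

/* Split a fixed budget of random walks over the k sub-trees two plies
 * from the root, proportional to known exact counts c[i] from a
 * shallower perft, then sum the per-sub-tree estimates. */
double perft_estimate_static(const double *c, int k, long total_walks)
{
    double csum = 0.0;
    for (int i = 0; i < k; i++)
        csum += c[i];

    double total = 0.0;
    for (int i = 0; i < k; i++) {
        long walks = (long)(total_walks * c[i] / csum);  /* static allocation */
        if (walks < 1)
            walks = 1;                                   /* never starve a sub-tree */
        total += estimate_subtree(i, walks);
    }
    return total;
}
Since the allocation is fixed before any random walks are done, none of the sampled data participates in the splitting decision, which also sidesteps the bias issue discussed above.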