Perft(13) betting pool

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Ajedrecista
Posts: 1372
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.

Re: Perft(13) betting pool

Post by Ajedrecista » Mon Aug 22, 2011 5:27 pm

Hello again:
Ajedrecista wrote: Hi everybody:

Just out of curiosity, I tried Adam Hair's method of fitting odd BF's with polynomials:

http://talkchess.com/forum/viewtopic.ph ... 16&t=39678

In my case I fitted quartic functions (A·x⁴ + B·x³ + C·x² + D·x + E) because I split the problem into twenty different polynomials, one for each initial move. First I did the calculations in Excel, and the sum was around 1.98316e+18 (a little high from my point of view).

I suspected that Excel's rounding 'hurt' the estimates a little, so I fitted the twenty polynomials again with the help of Derive 6 to get more digits. I was right this time, but the final sum is still 'high' in comparison with other computational methods that I have read about here. Here are the numbers I got (hoping there are no errors in the math):

Code:

Move            Estimate

a3	    54,531,136,234,162,156
a4	    79,063,508,502,158,903
b3	    72,639,688,816,621,455
b4	    73,709,067,003,701,511
c3	    86,056,827,153,858,571
c4	    97,970,589,000,615,670
d3	   141,383,458,087,446,826
d4	   213,018,757,033,778,459
e3	   240,725,195,611,700,379
e4	   246,589,794,037,145,382
f3	    43,493,804,033,203,347
f4	    61,162,666,611,330,980
g3	    76,232,972,134,254,158
g4	    65,814,044,230,043,052
h3	    54,049,205,191,901,014
h4	    80,829,651,636,234,054
Na3	   63,795,543,395,998,863
Nc3	   84,531,400,071,549,975
Nf3	   82,344,973,694,923,533
Nh3	   64,898,200,828,736,569
-------------------------------
SUM = 1,982,840,483,309,364,857
Although the number is a little high IMO, I also think that the relative percentages of this estimate (for each move) are fairly good, in view of the trend that each move follows as n grows (I calculated it in two Excel files in July). I wonder what the final sum would be when splitting by initial move, using for example the Peter1 method with fw = 4 (which seems to be a star method), and comparing the final sum and the counts for each move (also the relative percentages). I encourage everybody to run this, if possible (I do not know whether the Peter1 method allows splitting by initial move; I hope it does, because all the methods here seem very complete). Good luck.

Regards from Spain.

Ajedrecista.
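[Editor's note: for readers who want to reproduce the quartic-fit idea quoted above, here is a minimal sketch in Python. Ajedrecista used Excel and Derive 6 and does not state which sample points he fitted, so the BF values below are placeholders, not real perft data. The sketch fits A·x⁴ + B·x³ + C·x² + D·x + E by least squares in exact rational arithmetic, avoiding the spreadsheet rounding he mentions.]

```python
from fractions import Fraction

def polyfit4(xs, ys):
    """Least-squares quartic fit via the normal equations, done in exact
    rational arithmetic so no spreadsheet-style rounding creeps in.
    Returns coefficients [E, D, C, B, A] (constant term first)."""
    n = 5  # a quartic has 5 coefficients
    A = [[Fraction(sum(x ** (i + j) for x in xs)) for j in range(n)] for i in range(n)]
    b = [Fraction(sum(y * x ** i for x, y in zip(xs, ys))) for i in range(n)]
    for col in range(n):  # Gaussian elimination with partial pivoting
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            A[r] = [arc - f * acc for arc, acc in zip(A[r], A[col])]
            b[r] -= f * b[col]
    coeffs = [Fraction(0)] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = b[r] - sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = s / A[r][r]
    return coeffs

def poly_eval(coeffs, x):
    return sum(c * x ** i for i, c in enumerate(coeffs))

# Placeholder branching-factor samples for one move at odd n
# (NOT real perft data -- substitute measured BF values here).
xs = [3, 5, 7, 9, 11]
bf = [Fraction(59, 10), Fraction(14), Fraction(19), Fraction(23), Fraction(26)]
coeffs = polyfit4(xs, bf)
bf13 = poly_eval(coeffs, 13)  # extrapolated BF(move, 13)
# Estimate: Perft(move, 13) ~ Perft(move, 12) * bf13, summed over the 20 moves.
```

Note that with exactly five sample points, the least-squares quartic is the unique interpolating polynomial, which is one plausible reason the quartic and Lagrange results later in the thread agree so closely.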
This time I again tried splitting perft values across the initial twenty moves, but now using Lagrange polynomials:

http://en.wikipedia.org/wiki/Lagrange_interpolation

I had studied Lagrange polynomials before, but I had not thought of using them. I take branching factors as BF(move_i, n) = [Perft(move_i, n)]/[Perft(move_i, n-1)] for n = 3, 5, 7, 9, 11. Then, for n = 13 I have:

[BF(move_i, n = 13)]* = BF(move_i, n = 3) - 5·BF(move_i, n = 5) + 10·BF(move_i, n = 7) - 10·BF(move_i, n = 9) + 5·BF(move_i, n = 11)

where move_i ranges over all twenty possible first moves. And finally:

[Perft(move_i, n = 13)]* = [Perft(move_i, n = 12)] · [BF(move_i, n = 13)]*
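[Editor's note: as a sanity check on the formula above, for the equally spaced points n = 3, 5, 7, 9, 11 the Lagrange extrapolation weights at n = 13 are exactly the coefficients 1, -5, 10, -10, 5 used in the post. A small sketch that derives them:]

```python
from fractions import Fraction

def lagrange_weights(xs, x):
    """Weight of each sample y_i in the Lagrange interpolant evaluated at x:
    L(x) = sum_i w_i * y_i, with w_i = prod over j != i of (x - x_j)/(x_i - x_j)."""
    ws = []
    for i, xi in enumerate(xs):
        w = Fraction(1)
        for j, xj in enumerate(xs):
            if j != i:
                w *= Fraction(x - xj, xi - xj)
        ws.append(w)
    return ws

# BF samples at n = 3, 5, 7, 9, 11, extrapolated to n = 13:
ws = lagrange_weights([3, 5, 7, 9, 11], 13)
print([int(w) for w in ws])  # prints [1, -5, 10, -10, 5]
```

So [BF(move_i, 13)]* = 1·BF(3) - 5·BF(5) + 10·BF(7) - 10·BF(9) + 5·BF(11), matching the formula in the post.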

I quote myself because the results are incredibly similar to the ones I got before using twenty different quartic functions, one for estimating each BF. Note that here a single equation estimates all the BF's. Here are the numbers, hoping there are no typos:

Code:

Move            Estimate

a3	    54,531,136,245,433,857
a4	    79,063,508,508,444,871
b3	    72,639,688,830,427,074
b4	    73,709,067,010,541,664
c3	    86,056,827,162,078,542
c4	    97,970,589,005,764,498
d3	   141,383,458,092,703,225
d4	   213,018,757,047,938,236
e3	   240,725,195,616,730,295
e4	   246,589,794,029,993,095
f3	    43,493,804,037,847,797
f4	    61,162,666,623,173,938
g3	    76,232,972,139,647,889
g4	    65,814,044,240,929,760
h3	    54,049,205,206,297,538
h4	    80,829,651,642,814,500
Na3	   63,795,543,402,742,775
Nc3	   84,531,400,076,854,934
Nf3	   82,344,973,704,170,665
Nh3	   64,898,200,838,916,087
-------------------------------
SUM = 1,982,840,483,463,451,240
This means that the absolute difference between the whole Perft(13) estimates from the quartic functions and from Lagrange interpolation (when splitting into the first twenty moves and estimating BF's) is only 154,086,383 ~ 1.54e+8, which is ridiculously small! I did not expect such a small absolute error between them, so if somebody knows why this happens, an explanation would be much appreciated.

The estimates for each move_i are slightly greater when using Lagrange interpolation, except for 1. e4, the only move whose estimate is slightly greater when using the quartic functions. I do not know why. I still prefer my original estimate of (1.9804681 ± 0.0002024)e+18 from my own original (and imaginative) method (if it can be called that), and these two last estimates of mine should be understood as attempts, not serious estimates. Remember that none of my estimates were computational ones, searching the game tree, pruning moves or anything else. I 'play' with numbers, trends... nothing scientific, of course!

I have seen that the topic is a little dead these days... maybe people are on holiday? Or are there no new ideas? Maybe all that should be said has been said? It is a pity, because the topic was so good and instructive. Anyway, thanks to all the people contributing here. I hope this thread does not stay dead.

Regards from Spain.

Ajedrecista.

sje
Posts: 4675
Joined: Mon Mar 13, 2006 6:43 pm

Re: Perft(13) betting pool

Post by sje » Mon Aug 22, 2011 10:12 pm

The first draft 12 results should appear in mid October. The first of these will likely be for the opening moves 1 a3 and 1 Na3.

smrf
Posts: 484
Joined: Mon Mar 13, 2006 10:08 am
Location: Klein-Gerau, Germany

Re: Perft(13) & (14)

Post by smrf » Mon Aug 22, 2011 11:27 pm

Hi Sven,
Sven Schüle wrote: ... actually it seems that the perft(13) estimate in your table is "almost independent" of that number "-2587,643803". I nearly always get a result of approximately 1,98E+18 regardless of whether I change that "-2587,..." number to -1000, +1000000000, or -33333. The differences seem to affect only the less significant digits beyond "1,98". I can't explain why; maybe you have an explanation?
Well, here you are inside an inverse interpolation. Thus, the bigger an expression is (in absolute value), the smaller its impact on the result will be.

Regards, Reinhard.

Michel
Posts: 1966
Joined: Sun Sep 28, 2008 11:50 pm

Re: Perft(13) betting pool

Post by Michel » Tue Aug 23, 2011 7:33 am

I have seen that the topic is a little dead these days... maybe people are on holiday? Or are there no new ideas?
Well I am not sure if Daniel would agree but it seems that for chess perft
trees the longer you make the cycles the better you can make the
sd*sqrt(moveGen) metric.

This is already seen by the fact that fw=6 is better than fw=5 which itself is
better than fw=4.

You can interpolate somewhat smoothly between different fw's by doing
more than one Monte Carlo search at the tips of the fw tree.

So it seems that optimizing the sd*sqrt(moveGen) metric is not well defined.
It depends on how long you want to make the cycles (and also on how
much memory you can use). So in fact we are comparing apples and
oranges (as Daniel was saying in the beginning).

For me Peter1 remains the best compromise between simplicity of
implementation and results.

I am sure though that for artificial trees the adaptive
methods that have been discussed will have a decisive advantage
(Peter was saying for example that Peter1 performs very badly on heavily
LMR'ed trees).
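[Editor's note: for readers new to the thread, the "pure MC" baseline behind this discussion can be sketched on a synthetic tree. A real implementation would use a chess move generator; the branching rule below is an illustrative stand-in. The idea: walk one uniformly random move per ply, and the product of the move counts along the path is an unbiased estimate of the leaf count.]

```python
import random

DEPTH = 6  # leaf level of the synthetic tree

def num_children(path):
    """Deterministic synthetic branching rule (stand-in for a move generator)."""
    if len(path) == DEPTH:
        return 0
    return 3 + (sum(path) % 4)  # 3..6 children per node

def perft(path=()):
    """Exact leaf count by full enumeration (the ground truth)."""
    m = num_children(path)
    if m == 0:
        return 1
    return sum(perft(path + (i,)) for i in range(m))

def mc_perft(samples, rng):
    """Pure Monte Carlo perft with one move per ply: average, over random
    root-to-leaf walks, of the product of the move counts along the walk."""
    total = 0
    for _ in range(samples):
        path, weight = (), 1
        m = num_children(path)
        while m:
            weight *= m
            path += (rng.randrange(m),)
            m = num_children(path)
        total += weight
    return total / samples

exact = perft()
estimate = mc_perft(20000, random.Random(2011))
```

The same skeleton extends to the fw-tree variants discussed above: enumerate the first fw plies fully (storing them in memory) and start the random walk only at the frontier nodes.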

Daniel Shawul
Posts: 3504
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul » Tue Aug 23, 2011 2:54 pm

Michel wrote:
I have seen that the topic is a little dead these days... maybe people are on holiday? Or are there no new ideas?
Well I am not sure if Daniel would agree but it seems that for chess perft
trees the longer you make the cycles the better you can make the
sd*sqrt(moveGen) metric.

This is already seen by the fact that fw=6 is better than fw=5 which itself is
better than fw=4.

You can interpolate somewhat smoothly between different fw's by doing
more than one Monte Carlo search at the tips of the fw tree.

So it seems that optimizing the sd*sqrt(moveGen) metric is not well defined.
It depends on how long you want to make the cycles (and also on how
much memory you can use). So in fact we are comparing apples and
oranges (as Daniel was saying in the beginning).
Well, I agree to some extent. My second addition, i.e. an MC layer with more than one move, becomes closer to full width when more moves are considered. But doing one more full-width ply is not a solution, because it takes really long to generate enough samples. Imagine how long a full-width depth-7 search takes compared to a 6-ply full width + 2-move MC perft. I would definitely choose the latter.
For me Peter1 remains the best compromise between simplicity of
implementation and results.
Well, obviously Peter1 loses in an apples-to-apples comparison if I just store the first 5 plies in memory and guide the search. My second improvement, the 2-move MC, may be debatable, but it is still a win when you don't have time to do the next full-width ply.
I am sure though that for artificial trees the adaptive
methods that have been discussed will have a decisive advantage
(Peter was saying for example that Peter1 performs very badly on heavily
LMR'ed trees).
I don't remember Peter saying that, but I have repeatedly said that LMR and other _real_ (not artificial) trees are a problem for the Peter1 method. It was just a lucky combination that resulted in better performance (almost uniform tree + inherent tree guiding with sub-tree node counts), which even escaped us until I did the Peter vs HG Monte Carlo tests. But even if we take out the tree part (supposedly the "hard" part), Peter1 still gets beaten by the additional selective layer, which btw HG's method failed to achieve even though it does something similar in essence. Also, except for the tree growth, full-width + selective_ply + MC are all in the one function I used to have for the MC only. So mine is really simple too! If you are really into simplicity, at least for the current competition, a pure MC with one move is a good candidate. The tree is very uniform, so it matches the MC distribution. But if that was really the goal, then we shouldn't have started those discussions on improvements. Beauty is in the eye of the beholder :)

Ajedrecista
Posts: 1372
Joined: Wed Jul 13, 2011 7:04 pm
Location: Madrid, Spain.

Re: Perft(13) betting pool

Post by Ajedrecista » Tue Aug 23, 2011 3:45 pm

Hello:

To Steven: thanks for sharing your estimate of when the first draft 12 results will appear. In view of some draft 11 results, I guess the first four draft 12 results will be a3, b3, Na3 and Nc3 (though not necessarily in that order). The whole chess community greatly appreciates your effort in running the Perft(13) count. It would be nice if Paul Byrne also ran it for comparison (according to him, once the count starts, it should finish in a month or so):

http://talkchess.com/forum/viewtopic.ph ... 53&t=39678

To Michel: I suppose that increasing moveGen gives a better result (less error). I also think that, regarding fw, 'the more the merrier' (although I imagine increasing fw has a cost in time, RAM, ... that few people can afford). From what I understood reading this topic, reducing sdn = sd·sqrt(moveGen) is preferable because, given an amount of moveGen, less sdn means less sd (which is good), less sd means less variance, and so the 'measure' of efficiency you noted (variance · time) would be improved. But it seems difficult to optimize, and here I can only say: 'good luck'.

Peter1 still looks like a star method in view of everybody's comments, so congrats to Peter for bringing such a good idea. Sorry to read that Peter1 performs badly on heavily LMR'ed trees: I only know that LMR means 'Late Move Reductions' and that they can be useful; despite this issue, Peter1 is still a great method. I have no clue what artificial trees might be. I also do not know what you are referring to when you say adaptive methods. Maybe methods that do not choose moves uniformly? If I am right (near zero probability), I agree with you; just take a look at the BF's: Perft(13)/Perft(12) will be slightly over 31.5, but not all the initial moves have this BF value. I calculated the trend of the percentages of each initial move, and they evolve quite differently. According to my estimates, BF(e4) will be around 34 while BF(f3) will be near 28 (the rest of the BF's should be between 28 and 34). So choosing moves with probability = 1/32 works quite well, but it could be improved a little (though I guess that is complex).
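[Editor's note: the non-uniform idea in the paragraph above is standard importance sampling: pick move i with probability p_i, e.g. proportional to its estimated share of the tree, then divide whatever estimate you obtain below it by p_i. The numbers below are illustrative, not real perft data. A nice property: if the p_i were exactly proportional to the true subtree counts, every sample would return the same value (zero variance).]

```python
import random

# Illustrative first-move weights (NOT real perft percentages).
est_share = {"e4": 34.0, "d4": 30.0, "f3": 28.0, "a3": 29.0}
total_share = sum(est_share.values())
probs = {m: s / total_share for m, s in est_share.items()}

def sample_move(rng):
    """Draw one move according to probs; return (move, its probability)."""
    r = rng.random()
    acc = 0.0
    for move, p in probs.items():
        acc += p
        if r < acc:
            return move, p
    return move, p  # float-rounding guard: fall back to the last move

def is_estimate(subtree_count, samples, rng):
    """Importance-sampling estimate of sum(subtree_count.values()):
    each sample contributes count(move) / prob(move)."""
    total = 0.0
    for _ in range(samples):
        move, p = sample_move(rng)
        total += subtree_count[move] / p
    return total / samples

# Toy 'true' per-move counts, deliberately proportional to est_share, so
# the estimator has (essentially) zero variance and returns the exact sum.
counts = {m: 10 * s for m, s in est_share.items()}
print(round(is_estimate(counts, 1000, random.Random(7)), 6))  # prints 1210.0
```

When the shares are only approximately right (the realistic case), the estimator stays unbiased and the variance shrinks as the p_i get closer to proportional to the true counts.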

As you see, I have almost no clue about what you wrote, but I try to be as useful as possible (i.e. 0.01%), here with the BF's. Thanks for replying.

To everybody: I forgot to say that in my quartic functions n = 2x + 5. But I have to be careful, because different values of x give different estimates (although not very different ones). It is not the same picking n = 2x + 5, or n = 2x + 1 (for example), or x = n, or x = ln(n), or n = ln(10x + 7)... fitting with polynomials works fine, but one must keep in mind this 'scale issue' (to give it a name) of the independent variable. I simply have to choose reasonable values for x. And what are reasonable values? The ones I wrote before should be valid; there are countless options.

Lagrange interpolation seems easier to me. Maybe the results were so close to each other because the Lagrange polynomial I got has degree 4, the same as the quartic functions.

Regards from Spain.

Ajedrecista.

sje
Posts: 4675
Joined: Mon Mar 13, 2006 6:43 pm

A need for verification

Post by sje » Tue Aug 23, 2011 10:04 pm

With calculations that last from weeks to months, there is a need for verification of the results. I'm fairly confident that my program is correct as it successfully produced perft(12) twice: once via distributed computing and once on a single machine in a single run.

However, there just might be some obscure, hidden coding error; this has happened before. More likely, in my opinion, is the possibility of machine error; I am using consumer-level hardware with memory that does not have error-correcting codes. To guard against this possibility, I purposely do not overclock any of the hardware. Further, instead of using a consumer-class CPU cooling system, I installed an extra heavy-duty big-ass fan and heatsink, and this seems to help.

In any case, I have posted various intermediate results so that others may check my work. If Paul Byrne produces perft(13) before me, then that's okay as I'll continue my calculation to verify his work. And if I can come up with a result first, I hope that he'll continue his work to check mine.
