Perft(13) betting pool

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

petero2
Posts: 685
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Re: Perft(13) betting pool

Post by petero2 »

Michel wrote:
I will try the idea to automatically estimate the x_i and the sigma_i during runtime instead.
That would be very interesting (although it is not very clear to me
how to do this).
OK, here is what I did:

I have a hash table with 4 million entries. Given a chess position and remaining depth, a corresponding hash table entry is computed using the zobrist key and the depth value.

In each entry, I store information needed to keep a running count of the mean value and the standard deviation for the perft value of the subtree corresponding to the position and depth. See http://en.wikipedia.org/wiki/Algorithms ... _algorithm for details on how this is implemented. (I also store the hash key in each entry for collision detection.)

To compute sampling probabilities in a given position, I first generate all legal moves, then probe the hash table for all child positions.

1) For child positions that are in the hash table, and contain at least 10 samples, I use the probability weight sqrt(x_i^2+sigma_i^2).

2) For child positions that are not in the hash table, or contain fewer than 10 samples, I use a probability weight that is calculated as an average of probability weights from 1). If there are no weights from 1), set all probability weights to 1.

Compute real probabilities by scaling the probability weights so that their sum is 1.

Select a random move based on the computed probabilities.

Make a recursive call to estimate the size of the subtree corresponding to the selected move, then multiply this number with the inverse of the selection probability to get a tree size estimate for this position.

Update the corresponding hash table entry with the estimated tree size.

If there is a hash index collision, I keep the entry that has the largest estimated subtree size.

If there is a hash key collision (ie two different position+depth pairs have the same hash key), this will not be detected, which will lead to sub-optimal sampling probabilities. However, this shouldn't be a big problem, because the estimate will be unbiased regardless of how the probability weights are chosen.
Michel wrote:Even if it is expensive in CPU time it will still give an indication if a substantial variance reduction is achievable.
I ran a test where I do 1000 random walks from the root position, then print the average value, then repeat in an infinite loop. When using uniform weights, this average-of-1000 has a standard deviation of about 1.05e17. With the algorithm I described above, after about 28 million random walks, the standard deviation for an average-of-1000 is about 4.4e16.

This means that about (1.05/0.44)^2=5.7 times fewer samples are needed for a given desired accuracy. However, the method uses a lot of hash probes, so it is still unclear if it is an improvement in terms of CPU cycles, compared to the uniform weight method.

The code looks like this:

Code: Select all

    private static final class MeanVarEst {
        private int n;
        private double mean;
        private double M2;
        long hashKey;

        MeanVarEst() {
            reset(0);
        }
        final void reset(long hashKey) {
            this.hashKey = hashKey;
            n = 0;
            mean = 0;
            M2 = 0;
        }
        final void addSample(double x) {
            if &#40;n < Integer.MAX_VALUE&#41; &#123;
                n++;
                double delta = x - mean;
                mean += delta / n;
                M2 += delta * &#40;x - mean&#41;;
            &#125;
        &#125;
        final int getNSamples&#40;) &#123;
            return n;
        &#125;
        final double getMean&#40;) &#123;
            return mean;
        &#125;
        final double getVariance&#40;) &#123;
            return &#40;n > 1&#41; ? M2 / &#40;n - 1&#41; &#58; 0;
        &#125;
    &#125;

    final private static int HSIZE = 1 << 22;
    private MeanVarEst&#91;&#93; ht;

    private final long getHashKey&#40;int depth&#41; &#123;
        long ret = pos.zobristHash&#40;);
        return ret + depth;
    &#125;

    private final double getDynWeight&#40;Move m, int depth&#41; &#123;
        UndoInfo ui = new UndoInfo&#40;);
        pos.makeMove&#40;m, ui&#41;;
        long hKey = getHashKey&#40;depth-1&#41;;
        pos.unMakeMove&#40;m, ui&#41;;
        int hIdx = &#40;int&#41;&#40;hKey & &#40;HSIZE-1&#41;);
        MeanVarEst mv = ht&#91;hIdx&#93;;
        if &#40;mv.hashKey != hKey&#41;
            return -1;

        if &#40;doPrint && &#40;depth == 13&#41;)
            System.out.printf&#40;"d&#58;%2d m&#58;%s avg&#58;%g std&#58;%g n&#58;%d\n", depth, TextIO.moveToUCIString&#40;m&#41;,
                    mv.getMean&#40;), Math.sqrt&#40;mv.getVariance&#40;)), mv.getNSamples&#40;));

        if &#40;mv.getNSamples&#40;) < 10&#41;
            return -1;
        double av = mv.getMean&#40;);
        return Math.sqrt&#40;av * av + mv.getVariance&#40;));
    &#125;

    private final void updateHashTable&#40;double nodes, int depth&#41; &#123;
        long hKey = getHashKey&#40;depth&#41;;
        int hIdx = &#40;int&#41;&#40;hKey & &#40;HSIZE-1&#41;);
        MeanVarEst mv = ht&#91;hIdx&#93;;
        if &#40;mv.hashKey != hKey&#41; &#123;
            if (&#40;mv.hashKey != 0&#41; && mv.getMean&#40;) > nodes&#41;
                return; // Better entry already stored in this hash slot
            mv.reset&#40;hKey&#41;;
        &#125;
        mv.addSample&#40;nodes&#41;;
    &#125;

    private boolean doPrint = false;

    private final void doPerfTDynWeight&#40;) throws ChessParseError &#123;
        ht = new MeanVarEst&#91;HSIZE&#93;;
        for &#40;int i = 0; i < ht.length; i++)
            ht&#91;i&#93; = new MeanVarEst&#40;);

        pos = TextIO.readFEN&#40;TextIO.startPosFEN&#41;;
        while &#40;true&#41; &#123;
            double nodes = 0;
            int N = 1000;
//            doPrint = true;
            for &#40;int i = 0; i < N; i++) &#123;
                nodes += perfTDynWeight&#40;13, 0&#41;;
                doPrint = false;
            &#125;
            System.out.printf&#40;"%.0f\n", nodes / N&#41;;
        &#125;
    &#125;

    private final double perfTDynWeight&#40;int depth, int fullDepth&#41; &#123;
        double nodes = 0;
        MoveGen.MoveList moves = moveGen.pseudoLegalMoves&#40;pos&#41;;
        MoveGen.removeIllegal&#40;pos, moves&#41;;
        if (&#40;depth == 1&#41; || &#40;moves.size == 0&#41;) &#123;
            int ret = moves.size;
            moveGen.returnMoveList&#40;moves&#41;;
            return ret;
        &#125;
        UndoInfo ui = new UndoInfo&#40;);
        if &#40;fullDepth <= 0&#41; &#123;
            int N = moves.size;
            double weights&#91;&#93; = new double&#91;N&#93;;
            int avgNum = 0;
            double avgSum = 0;
            double maxWeight = 0;
            for &#40;int i = 0; i < N; i++) &#123;
                double w = getDynWeight&#40;moves.m&#91;i&#93;, depth&#41;;
                weights&#91;i&#93; = w;
                if &#40;w > 0&#41; &#123;
                    avgSum += w;
                    avgNum++;
                    maxWeight = Math.max&#40;maxWeight, w&#41;;
                &#125;
            &#125;
            if &#40;avgNum == 0&#41; &#123; // No dynamic information, use equal weights
                for &#40;int i = 0; i < N; i++)
                    weights&#91;i&#93; = 1;
            &#125; else &#123; // At least some dynamic information, use average weights for moves with no dynamic information
                double avg = avgSum / avgNum;
                for &#40;int i = 0; i < N; i++)
                    if &#40;weights&#91;i&#93; < 0&#41;
                        weights&#91;i&#93; = avg;
                // Avoid too large imbalance
                double minWeight = maxWeight / 50;
                for &#40;int i = 0; i < N; i++)
                    weights&#91;i&#93; = Math.max&#40;weights&#91;i&#93;, minWeight&#41;;
            &#125;

            double wTotal = 0;
            double wAccum&#91;&#93; = new double&#91;N&#93;;
            for &#40;int i = 0; i < N; i++) &#123;
                wTotal += weights&#91;i&#93;;
                wAccum&#91;i&#93; = wTotal;
            &#125;
            double r = rnd.nextDouble&#40;) * wTotal;
            int idx = Arrays.binarySearch&#40;wAccum, r&#41;;
            if &#40;idx < 0&#41; idx = -&#40;idx+1&#41;;

            pos.makeMove&#40;moves.m&#91;idx&#93;, ui&#41;;
            double tmp = perfTDynWeight&#40;depth - 1, fullDepth - 1&#41;;
            pos.unMakeMove&#40;moves.m&#91;idx&#93;, ui&#41;;
            nodes = tmp * wTotal / weights&#91;idx&#93;;
            updateHashTable&#40;nodes, depth&#41;;
        &#125; else &#123;
            for &#40;int mi = 0; mi < moves.size; mi++) &#123;
                Move m = moves.m&#91;mi&#93;;
                pos.makeMove&#40;m, ui&#41;;
                nodes += perfTDynWeight&#40;depth - 1, fullDepth - 1&#41;;
                pos.unMakeMove&#40;m, ui&#41;;
            &#125;
        &#125;
        moveGen.returnMoveList&#40;moves&#41;;
        return nodes;
    &#125;
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

Sorry, Daniel, you are mixing up everything. You are too fast. Your way of argumentation is not fully logical. You refer to the proof for Peter's method which is nowhere doubted, but is completely irrelevant here. We are at the point where you wanted to prove that Uri's estimation method were biased when including nShortLegalGames within nTotalGames. I do not see that proof yet. You are not addressing the points I have made before, e.g. about "unknown probability of prematurely game-terminating legal moves (mate etc.)". You continue to talk about "illegal short games biasing the result" even though exactly the ratio of illegal vs. total games (from which the real result based on the ratio of legal vs. total games) forms the essential part of Uri's method.

What should I do next to make you understand what I wrote, and how Uri's method works?

I am off, too, until tomorrow. Have a nice evening!

Sven
I tried to address them but I am getting confused by the minute..
2) A proof that Uri's method with all clarifications given so far were biased should not refer to formula like this which are not part of Uri's method.
Here is another inconsistent statement. Uri says his method is un-biased but you say it is biased. I am not so sure how the counts are done now We count and increment games differently so maybe Uri can explain.
anyway see you tomorrow.
Last edited by Daniel Shawul on Wed Jul 27, 2011 9:10 pm, edited 1 time in total.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Perft(13) betting pool

Post by Sven »

Daniel Shawul wrote:Uri's tree complicates the calculation
Using the example tree from your drawing and applying Uri's method, the probability to get a legal game of full length (3 plies) when playing an "infinite" number of random games is

Code: Select all

  1/10 * 1/3 *  2/10
+ 1/10 * 1/3 *  3/10

+ 1/10 * 1/3 *  3/10
+ 1/10 * 1/3 *  3/10
+ 1/10 + 1/3 * 10/10

+ 8/10 * 0

= &#40;2 + 3 + 3 + 3 + 10&#41; / 300
= 21 / 300
which leads exactly to the perft(3) value at root for this example if you multiply 21 / 300 by U(1)*U(2)*U(3). So the estimate is not biased in this case :-)

Sven
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

But you just said it is biased :)
Uri Blass
Posts: 10269
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Perft(13) betting pool

Post by Uri Blass »

Daniel Shawul wrote:
Sven Schüle wrote:
Uri Blass wrote:
Daniel Shawul wrote:Here is the formula to count all leaf nodes including those shorter ones

Code: Select all

    Sum = &#40;legal_games&#91;1&#93; / Total_games&#91;1&#93;) * U&#40;1&#41;
         + &#40;legal_games&#91;2&#93; / Total_games&#91;2&#93;) * U&#40;1&#41; * U&#40;2&#41;
         ...
         + &#40;legal_games&#91;13&#93; / Total_games&#91;13&#93;) * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; ... * U&#40;13&#41;
To count leaf nodes at ply 13 only Uri uses (Note we have one counter for both)

Code: Select all

    Sum = &#40;legal_games / total_games&#41; * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; .... * U&#40;13&#41;
Now if you increment the legal_games when you have a short win at ply 2, it means we have

Code: Select all

    Sum  = (&#40;legal_games + 1&#41; / &#40;total_games + 1&#41;) * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; .... * U&#40;13&#41;
which is wrong!!
And if you count it as illegal which means you increment the total_games count only

Code: Select all

    Sum  = (&#40;legal_games&#41; / &#40;total_games + 1&#41;) * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; .... * U&#40;13&#41;
which is again wrong!!
They are both wrong because they are being multiplied by the incorrect upper bound when
it should have been the upper bound perft value for that ply where the game terminated.

The correct _unbiased_ estimator will be to consider that the shorter game never happened i.e don't increment any of the counters

Code: Select all

    Sum  = (&#40;legal_games&#41; / &#40;total_games&#41;) * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; .... * U&#40;13&#41;
So he meant he meant this one. Otherwise his estimate will be _biased_

Let me make sure we are on the same page first before we continue.
I meant the last formula but I practically start by a random sequnece
in my method and not by a random game(in the meaning of a game in chess. I call every sequence a game in the total number of games so maybe the definition is confusing)

I have a random sequence 1<=X(1)<=U(1),...1<=X(n)<=U(n) when n=13
Every sequence increase the number of Total games.
I try to see if I can translate all the numbers to legal moves to increase the number of legal games.

If X(1),X(2),X(3),X(4) are translated to f3 e5 g4 Qh4 mate then the full sequence is counted in the total games but not in the legal games of 13 plies because the fact that I have X(5) means that I have a number that I cannot translate to a legal move so practically I increment the total games when I have a short win based on part of the sequence but I do not increment the legal games.
@Uri: I use your post to reply but the contents of my reply is directed mainly towards Daniel. Thanks for your clarification!
-----

@Daniel: this should clean up your slight misunderstanding ;-)
Yeah right ;) But seriously I don't know if you are serious or not based on the questions you ask me below. He basically said we misunderstood him Or admitted to his method being biased (if he increments any of the counters when there is a shorter game) OR he didn't meant to count shorter ones at all.
Apart from that, since I must assume that your reply was a reply to my previous post (I can't check it even in threaded view due to reaching maximum indentation :-( ), what about my other points?

Back to the "short games" issue and to your proposed formula:
- How do you calculate Total_games for i < 13? Is it the number of (legal or illegal) games that ended after i moves?
- And if so, is this also the case for Total_games[13]?
- So is it right that essentially you want to include the estimated number of shorter games within the estimate for perft(13), even though the correct value for perft(13) will not include shorter paths?

Your formula is not obvious to me so I have to insist on pursuing the details.

Furthermore, since we now know that the term "total_games" also includes all shorter games in Uri's method, I do not get the point why you think that (legal_games / total_games) * U(1) * U(2) * ... * U(N) would be wrong somehow. The U(i) values, being maximum possible BFs, do cover all kinds of games, including also shorter ones, don't they? So either we include shorter games on both sides: in total_games and in U(i), or we exclude them on both sides, e.g. by removing all game-terminating moves from the move lists as I proposed above. Your proposal seems to be a mixture IMO.

Sven

I am even confused by his recent reply.

I meant the last formula but I practically start by a random sequence
in my method and not by a random game(in the meaning of a game in chess. I call every sequence a game in the total number of games so maybe the definition is confusing)


So he means this

Code: Select all

    Sum  = (&#40;legal_games&#41; / &#40;total_games&#41;) * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; .... * U&#40;13&#41;
This doesn't increment any of them so it is _unbiased_.
Later he says this
If X(1),X(2),X(3),X(4) are translated to f3 e5 g4 Qh4 mate then the full sequence is counted in the total games but not in the legal games of 13 plies because the fact that I have X(5) means that I have a number that I cannot translate to a legal move so practically I increment the total games when I have a short win based on part of the sequence but I do not increment the legal games.
So this means the total games got incremented which implies the formula

Code: Select all

    Sum  = (&#40;legal_games&#41; / &#40;total_games + 1&#41;) * U&#40;1&#41; * U&#40;2&#41; * U&#40;3&#41; .... * U&#40;13&#41;
So it becomes _biased_.

You said you also increment the total games so his method is biased.
So in your next post please select the formula you use and print it out so we avoid any possible confusions.
It seems that you misunderstand me.
I hope to be succesful in explaining my method more clearly and I will try to use more significant names to my variables in this explanation.

There are 2 variables
1)total_games(maybe it is not a good name and I should write total_candidates_to_be_games_of_13_plies so I am going to use this name) 2)legal_games(I am going to use legal_games_of_13_plies to be clear).

1)total_candidates_to_be_games_of_13_plies get incremented for every sequence that I try.
2)legal_games_of_13_plies get incremented only for sequences that I translate to a legal game of 13 plies(so if I cannot translate the last numbers in the sequence to moves
it means that legal_games_of_13_plies does not get incremented)

The formula is simply
Sum = ((legal_games_of_13_plies) / (total_candidates_to_be_games_of_13_plies)) * U(1) * U(2) * U(3) .... * U(13)

I do not see how you get total_candidates_to_be_games_of_13_plies+1 in your formula.

A sequence that starts with numbers that the first 4 numbers are translated to f3 e5 g4 Qh4 mate is a candidate to be translated to a game of 13 plies before making the translation so it increase total_candidates_to_be_games_of_13_plies.
This sequence is not translated to a legal game of 13 plies so it does not increase the legal_games_of_13_plies variable.

Note that my estimate is unbiased.
I will not care about making a formal proof for it in order to try to convince you(it is not very important because people already suggested an unbiased estimate with a smaller variance) but if you think that it biased then you can try to show an example for it when the value of perft is known.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Perft(13) betting pool

Post by Sven »

Daniel Shawul wrote:But you just said it is biased :)
I did not say so. You misread again! I wrote, as you quoted:
2) A proof that Uri's method with all clarifications given so far were biased should not refer to formula like this which are not part of Uri's method.
and hereby referred to your declared intention to provide that proof, saying that your proof that Uri's method were biased as you claimed, should not refer to a formula that is not part of his method.

Do you understand "A proof that method X were Y should ..."? And that this is by 100% different from "I state that method X is Y"? I hope so :-)

Sven
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

Sven Schüle wrote:
Daniel Shawul wrote:But you just said it is biased :)
I did not say so. You misread again! I wrote, as you quoted:
2) A proof that Uri's method with all clarifications given so far were biased should not refer to formula like this which are not part of Uri's method.
and hereby referred to your declared intention to provide that proof, saying that your proof that Uri's method were biased as you claimed, should not refer to a formula that is not part of his method.

Do you understand "A proof that method X were Y should ..."? And that this is by 100% different from "I state that method X is Y"? I hope so :-)

Sven
Hmm Didn't you say good night ? :)
Hard to understand what you are saying here or there. Anyway the method turned out to be unbiased if all games are counted. I checked only the case where the shorter games are counted as legal which makes sense anyway for the first time reader. That will bias it and add to that the 1.997e18 , I offered it as a possible cause by giving an example which can inflate it by about 200%.
Anyway you also questioned the proof before suddenly accepting it as correct when it happens you got it right luckily :)
Anyway sweet dreams now. Bye
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

It seems that you misunderstand me.
I hope to be succesful in explaining my method more clearly and I will try to use more significant names to my variables in this explanation.

There are 2 variables
1)total_games(maybe it is not a good name and I should write total_candidates_to_be_games_of_13_plies so I am going to use this name) 2)legal_games(I am going to use legal_games_of_13_plies to be clear).

1)total_candidates_to_be_games_of_13_plies get incremented for every sequence that I try.
2)legal_games_of_13_plies get incremented only for sequences that I translate to a legal game of 13 plies(so if I cannot translate the last numbers in the sequence to moves
it means that legal_games_of_13_plies does not get incremented)

The formula is simply
Sum = ((legal_games_of_13_plies) / (total_candidates_to_be_games_of_13_plies)) * U(1) * U(2) * U(3) .... * U(13)

I do not see how you get total_candidates_to_be_games_of_13_plies+1 in your formula.

A sequence that starts with numbers that the first 4 numbers are translated to f3 e5 g4 Qh4 mate is a candidate to be translated to a game of 13 plies before making the translation so it increase total_candidates_to_be_games_of_13_plies.
This sequence is not translated to a legal game of 13 plies so it does not increase the legal_games_of_13_plies variable.

Note that my estimate is unbiased.
I will not care about making a formal proof for it in order to try to convince you(it is not very important because people already suggested an unbiased estimate with a smaller variance) but if you think that it biased then you can try to show an example for it when the value of perft is known.
Ok the way you counted it is unbiased and I admit I misunderstood you.
My intention from the beginning was about the weird looking report 1.997e18. I tried to proove equivalence with Peter's method when I saw that it is exactly the same with a much larger tree. I knew that if you count in shorter games, it would inflate the estimate. So I offered it as a possible cause. Btw your original post doesn't say anything about shorter games so I honestly thought they were included.

I also assumed they would affect it the same if we increment the total games too _without checking in with the example_, but it turned out otherwise. So it is unbiased after all if shorter games are not counted.

P.S: My way of counting is different. Total_games is a counter which gets incremented after every game is played. Not the way you do it now.
So you do this for all shorter games
(legal_games / (total_games + 1)) * U(1) * U(2) ... U(13)
And this for all longer games
((legal_games + 1) / (total_games + 1)) * U(1) * U(2) ... U(13)
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Perft(13) betting pool

Post by Sven »

Daniel Shawul wrote:Hmm Didn't you say good night ? :)
I did but sitting in the train I could'nt resist to open the laptop again :-)
Daniel Shawul wrote:Anyway you also questioned the proof before suddenly accepting it as correct when it happens you got it right luckily :)
You are incredible! I questioned what you wrote, which was not the proof of "biasness" that you had intended to provide, so I did not change my mind. What did you read?

Sven
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Perft(13) betting pool

Post by Daniel Shawul »

Daniel Shawul wrote:
Daniel Shawul wrote:The reason why I wanted you to start from the root is because within the tree we are using _uniform_ sampling. One way to decrease the variance could be using upper confidence bound by saving the first 1-3 plies in memory. I mentioned the following formula a while back that is used for UCT, I am sure it can be modified to give much better results than any heuristic since this one learns it dynamically. But it requires complicated programming..

Code: Select all

X = Xmean + Constant * sqrt&#40; log&#40;parent_nodes&#41; / child_nodes&#41;)
Select the node with maximum X value. The second term which is the variance is meant for a mean score between 0 < X < 1 but I am sure it can be modified for our case. Anyway even using simple proportioning based on mean nodes count at the tree, this _dynamic_ adjustement should be better than any heuristic.
Here the idea for this kickass algorithm. You can say it is UCT for perft but I haven't actually implemented it so it can be all for nothing if I made some mistake in planning somewhere

Code: Select all

a&#41; Start from the root and do 10 Monte carlo simulations for each move
b&#41; Pick the best node to expand using UCB formula I provided above
c&#41; Expand the selected node which will be one of the biggest nodes with the right formula.
d&#41; Do this recursively for the children of that big node
So this algorithm explores the bigger nodes more often, while the smaller nodes will have their estimates from small number of simulations. No need for heuristic, no need for fixed depth expansion unlike the one we are doing i.e ply 3-6 fixed with uniform sampling.

If I have to write out my imagination, this algorithm will be so good as to be the best. But I could be wrong somewhere. Opinions ?
I have started working on this algorithm and here is preliminary result after only 100 visists of the root. It gives about 1.89e18

Code: Select all

&#91;st = 5557ms, mt = 29250ms , moves_left 10&#93;
Time &#58; 15ms Tree &#58; nodes 21 depth 0/0 pps 6666 visits 100
h2h3   1.228388e+017       1
h2h4   6.458556e+016       1
g2g3   3.674510e+016       1
g2g4   2.489331e+016       1
f2f3   3.163775e+016       1
f2f4   3.176670e+016       1
e2e3   2.354346e+017       2
e2e4   4.127740e+017      13
d2d3   1.893080e+017       3
d2d4   2.360869e+017      66
c2c3   4.930080e+016       1
c2c4   3.534796e+016       1
b2b3   1.144197e+017       1
b2b4   4.057798e+016       1
a2a3   8.509372e+016       1
a2a4   2.687664e+016       1
g1h3   6.901096e+016       1
g1f3   2.601490e+016       1
b1c3   4.535822e+016       1
b1a3   1.133245e+016       1
Perft 1.889404e+018
move d2d4
As you can see it selects either e2e4 or d2d4 often. Right now I have the selection criteria as the maximum perft node. I want to tune the selection criteria using UCB formula at the root first then I will let it expand heavier nodes..
Edit: Here is the result in 1 million simulations but it is bad right now about 1.96e18. The UCB forumula constant needs to be well tuned because the current one is not meant for perft counts.

Code: Select all

&#91;st = 5557ms, mt = 29250ms , moves_left 10&#93;
Time &#58; 47312ms Tree &#58; nodes 21 depth 0/0 pps 21136 visits 1000000
h2h3   4.806908e+016     697
h2h4   8.768004e+016    1070
g2g3   7.579254e+016     931
g2g4   6.568783e+016     833
f2f3   3.990278e+016     644
f2f4   5.959760e+016     782
e2e3   2.408709e+017  203535
e2e4   2.466839e+017  765874
d2d3   1.380035e+017    2213
d2d4   2.072176e+017   14017
c2c3   8.044813e+016     983
c2c4   9.402108e+016    1157
b2b3   6.875520e+016     861
b2b4   7.344205e+016     908
a2a3   5.014535e+016     711
a2a4   7.500670e+016     924
g1h3   6.644031e+016     840
g1f3   8.532869e+016    1041
b1c3   9.067604e+016    1110
b1a3   6.961431e+016     869
Perft 1.963384e+018
move e2e4