Neural network evaluation for a Monte Carlo Tree Search engine

Discussion of chess software programming and technical issues.

Moderator: Ras

clayt
Posts: 29
Joined: Thu Jun 09, 2022 5:09 am
Full name: Clayton Ramsey

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by clayt »

Another approach to try for move selection: perform a Bayesian analysis of the expected outcome subject to a confidence bound. We can go about this by assuming a prior distribution on the outcome of rollouts in the system. With some data, you could construct a good prior, but for now we can assume it's uniform.

We can set an arbitrary confidence p (most likely 0.95 or 0.99 in our case). Then, for each move, we perform Bayesian inference to find the largest bound L such that P(rollout success probability >= L | observed rollout outcomes) >= p. We then pick the move with the highest such lower bound.

Full warning, I have no experimental data to show that this is an improvement over the usual method. However, it might be interesting to try and see if you get any good results.
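To make the idea concrete, here's a minimal sketch (untested against any engine; the function name `posterior_lower_bound` is mine). With a uniform Beta(1,1) prior, w wins in n rollouts give a Beta(1+w, 1+n-w) posterior, and the bound is the (1-p) quantile of that posterior, found here by numerically integrating the density on a grid:

```cpp
#include <cmath>
#include <vector>

// Lower bound L such that P(success rate >= L | data) >= p, under a
// uniform Beta(1,1) prior. Posterior is Beta(1+wins, 1+losses); we find
// the (1-p) quantile by numerically integrating the density on a grid.
double posterior_lower_bound(int wins, int losses, double p, int steps = 100000) {
    const double a = 1.0 + wins, b = 1.0 + losses;
    std::vector<double> logd(steps);
    double max_log = -1e300;
    for (int i = 0; i < steps; ++i) {
        double x = (i + 0.5) / steps;  // midpoint of grid cell i
        logd[i] = (a - 1) * std::log(x) + (b - 1) * std::log(1 - x);
        if (logd[i] > max_log) max_log = logd[i];
    }
    // Exponentiate relative to the max log-density to avoid underflow.
    double total = 0.0;
    for (int i = 0; i < steps; ++i) {
        logd[i] = std::exp(logd[i] - max_log);
        total += logd[i];
    }
    // Walk the CDF until it crosses (1 - p); that x is the quantile.
    double cum = 0.0;
    for (int i = 0; i < steps; ++i) {
        cum += logd[i] / total;
        if (cum >= 1.0 - p) return (i + 0.5) / steps;
    }
    return 1.0;
}
```

To choose a move you would compute this bound for each child from its win/loss counts and play the child with the largest bound.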
User avatar
RedBedHed
Posts: 84
Joined: Wed Aug 04, 2021 12:42 am
Full name: Ellie Moore

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by RedBedHed »

clayt wrote: Fri Jul 29, 2022 9:50 pm Another approach to try for move selection: perform a Bayesian analysis of the expected outcome subject to a confidence bound. We can go about this by assuming a prior distribution on the outcome of rollouts in the system. With some data, you could construct a good prior, but for now we can assume it's uniform.

We can set an arbitrary confidence p (most likely 0.95 or 0.99 in our case). Then, for each move, we perform Bayesian inference to find the largest bound L such that P(rollout success probability >= L | observed rollout outcomes) >= p. We then pick the move with the highest such lower bound.

Full warning, I have no experimental data to show that this is an improvement over the usual method. However, it might be interesting to try and see if you get any good results.
Very interesting! I haven't taken any formal classes on this kind of analysis, but I am looking into it :)
>> Move Generator: Charon
>> Engine: Homura
void life() { playCapitalism(); return; live(); }
User avatar
RedBedHed
Posts: 84
Joined: Wed Aug 04, 2021 12:42 am
Full name: Ellie Moore

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by RedBedHed »

Update:

With a new garbage collection strategy-- one that doesn't involve searching the full tree every time the memory is full-- the search now allocates and evaluates about 1,500,000 nodes per second.
>> Move Generator: Charon
>> Engine: Homura
void life() { playCapitalism(); return; live(); }
OWL
Posts: 8
Joined: Sat Dec 26, 2020 10:20 am
Full name: Daan

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by OWL »

Very nice to hear you are looking into MCTS. How are you storing the game tree? In my implementation I store the nodes in a (large) array: Node GameTree[node_limit].
Each node then stores the index of its parent, the index of its first child, and how many children it has; the children are stored contiguously so I can easily loop over them.
A downside is that there is either a strict limit on the number of nodes or unused memory.
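For concreteness, a layout like the one described might look roughly like this (the field names and widths are my own guesses, not actual code from any engine):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One node of the array-backed game tree. Children of a node occupy the
// contiguous run [first_child, first_child + num_children).
struct Node {
    uint32_t parent;       // index of parent in the array
    uint32_t first_child;  // index of first child (unused if unexpanded)
    uint16_t num_children;
    uint16_t move;         // move that led to this node; encoding is up to the engine
    uint32_t visits;
    float    total_score;  // sum of rollout results, for UCT
};

std::vector<Node> game_tree;  // reserve(node_limit) up front
```

Looping over a node's children is then a plain index walk: `for (uint32_t c = n.first_child; c < n.first_child + n.num_children; ++c) ...`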
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by dangi12012 »

OWL wrote: Sun Jul 31, 2022 10:12 pm Very nice to hear you are looking into MCTS. How are you storing the game tree? In my implementation I store the nodes in a (large) array: Node GameTree[node_limit].
Then each node stores the index of its parent and the index of the first child node and how many children, these children are then stored contiguously so I can easily loop over all child nodes.
A downside is that there is a strict limit on the amount of nodes or unused memory.
What do you do once your memory limit is reached?
- Stop searching?
- Prune the tree? (then you would have to defragment your contiguous memory)
- Stop growing the tree?

How did you solve multithreading?
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
OWL
Posts: 8
Joined: Sat Dec 26, 2020 10:20 am
Full name: Daan

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by OWL »

When the memory limit is reached I stop the search. But I think one could also stop expanding and keep doing simulations.

Multithreading with leaf and root parallelization is still possible. For tree parallelization I have tackled expansion and rollouts, but I have not found anything better than using locks for selection and backpropagation.
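To sketch that locking pattern (with a toy stand-in for the real tree -- the "selection" here is just a random index where a real engine would do a UCT descent, and `Stats`, `worker`, etc. are invented names): selection/expansion and backpropagation mutate the tree, so they hold a mutex, while the rollout itself runs unlocked.

```cpp
#include <cassert>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

// Toy stand-in for the tree: just per-"node" visit/score counters,
// enough to show which phases need the lock.
struct Stats { long visits = 0; double score = 0.0; };

std::mutex tree_mutex;
std::vector<Stats> nodes(8);

double rollout(std::mt19937& rng) {
    // Stand-in for a playout: lock-free work producing a random result.
    std::uniform_real_distribution<double> d(0.0, 1.0);
    return d(rng);
}

void worker(int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    for (int i = 0; i < iterations; ++i) {
        size_t leaf;
        {   // selection + expansion mutate the tree: hold the lock
            std::lock_guard<std::mutex> guard(tree_mutex);
            leaf = rng() % nodes.size();  // placeholder for a UCT descent
            ++nodes[leaf].visits;
        }
        double result = rollout(rng);     // slow part runs unlocked
        {   // backpropagation mutates shared counters: lock again
            std::lock_guard<std::mutex> guard(tree_mutex);
            nodes[leaf].score += result;
        }
    }
}
```

Spawning e.g. four `std::thread(worker, n, seed)` and joining them then gives a (coarse-grained) tree-parallel search.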
User avatar
AAce3
Posts: 80
Joined: Fri Jul 29, 2022 1:30 am
Full name: Aaron Li

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by AAce3 »

I believe that Leela also has a parameter (called "Futile Search Aversion") which skips visiting nodes that can't become the best move in the time allotted. You could consider doing that as well. :D

As far as I know, NPS is not a huge deal for MCTS engines. There's a big tradeoff between node quality and node quantity. I believe Komodo's MCTS implementation only gets 30 nps per core (less than 3000 nps on TCEC hardware!) because it uses alpha-beta searches to expand. But it's one of the most competitive MCTS-based engines. If you want to do traditional rollouts, I'd consider doing multiple per expansion.
OWL wrote: Sun Jul 31, 2022 10:12 pm In my implementation I store the nodes in a (large) array: Node GameTree[node_limit].
Then each node stores the index of its parent and the index of the first child node and how many children, these children are then stored contiguously so I can easily loop over all child nodes.
A downside is that there is a strict limit on the amount of nodes or unused memory.
Perhaps restricting the memory to a fixed bound is a good thing - most engines are required to limit hash.
OWL wrote: Mon Aug 01, 2022 10:43 am When the memory limit is reached I stop the search. But I think one could also stop expanding and keep doing simulations.

Multithreading with leaf and root parallelization is still possible. For tree parallelization I have tackled expansion and rollouts, but I have not found anything better than using locks for selection and backpropagation.
I don't believe there's any way to safely traverse an MCTS tree from multiple threads without locks. Even Leela uses mutexes. Most of the time is spent in rollouts and, in Leela's case, computing policy priors -- plenty of idle time in which to release the lock and hand the data to another thread :)
User avatar
RedBedHed
Posts: 84
Joined: Wed Aug 04, 2021 12:42 am
Full name: Ellie Moore

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by RedBedHed »

AAce3 wrote: Mon Aug 01, 2022 5:54 pm I believe that Leela also has a parameter (called "Futile Search Aversion") which skips visiting nodes that can't become the best move in the time allotted. You could consider doing that as well. :D

As far as I know NPS is not a huge deal for MCTS engines. There's a big tradeoff between node quality and node quantity. I believe Komodo's MCTS implementation only gets 30 nps per core (less than 3000 nps on TCEC hardware!) because it uses alpha beta searches to expand. But, it's one of the most competitive MCTS based engines. If you want to do traditional rollouts, I'd consider doing multiple per expansion.
OWL wrote: Sun Jul 31, 2022 10:12 pm In my implementation I store the nodes in a (large) array: Node GameTree[node_limit].
Then each node stores the index of its parent and the index of the first child node and how many children, these children are then stored contiguously so I can easily loop over all child nodes.
A downside is that there is a strict limit on the amount of nodes or unused memory.
Perhaps restricting the memory to a fixed bound is a good thing - most engines are required to limit hash.
OWL wrote: Mon Aug 01, 2022 10:43 am When the memory limit is reached I stop the search. But I think one could also stop expanding and keep doing simulations.

Multithreading with leaf and root parallelization is still possible. For tree parallelization I have tackled expansion and rollouts, but I have not found anything better than using locks for selection and backpropagation.
I don't believe there's any way to safely traverse an MCTS tree multithreaded without locks. Even Leela uses Mutexes. Most of the time is spent in rollouts and, in Leela's case, determining policy priors. Lots of idle time for you to unlock the data and give it to another thread :)
Even with many rollouts per expansion, I found that the traditional rollout-based approach was much too time-consuming to search deep into the tree and too unreliable at a shallow depth. You also have to cut off the move cycles and figure out some way to evaluate them (my first instinct was to evaluate them as a draw, but I don't think that accurately reflects how promising the tree below the leaf is). A default policy of rolling out random lines of play makes sense in Go, as the only way to statically evaluate the board is to train a neural net. But I don't think it is desirable in a game like Chess, where an engine can evaluate the board reasonably well at any given time.

MCAB seems like a fascinating algorithm. I've already decided on my approach for my honors thesis, but I'd love to try MCAB someday when I have time.
>> Move Generator: Charon
>> Engine: Homura
void life() { playCapitalism(); return; live(); }
User avatar
AAce3
Posts: 80
Joined: Fri Jul 29, 2022 1:30 am
Full name: Aaron Li

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by AAce3 »

RedBedHed wrote: Tue Aug 02, 2022 3:51 am
AAce3 wrote: Mon Aug 01, 2022 5:54 pm I believe that Leela also has a parameter (called "Futile Search Aversion") which skips visiting nodes that can't become the best move in the time allotted. You could consider doing that as well. :D

As far as I know NPS is not a huge deal for MCTS engines. There's a big tradeoff between node quality and node quantity. I believe Komodo's MCTS implementation only gets 30 nps per core (less than 3000 nps on TCEC hardware!) because it uses alpha beta searches to expand. But, it's one of the most competitive MCTS based engines. If you want to do traditional rollouts, I'd consider doing multiple per expansion.
OWL wrote: Sun Jul 31, 2022 10:12 pm In my implementation I store the nodes in a (large) array: Node GameTree[node_limit].
Then each node stores the index of its parent and the index of the first child node and how many children, these children are then stored contiguously so I can easily loop over all child nodes.
A downside is that there is a strict limit on the amount of nodes or unused memory.
Perhaps restricting the memory to a fixed bound is a good thing - most engines are required to limit hash.
OWL wrote: Mon Aug 01, 2022 10:43 am When the memory limit is reached I stop the search. But I think one could also stop expanding and keep doing simulations.

Multithreading with leaf and root parallelization is still possible. For tree parallelization I have tackled expansion and rollouts, but I have not found anything better than using locks for selection and backpropagation.
I don't believe there's any way to safely traverse an MCTS tree multithreaded without locks. Even Leela uses Mutexes. Most of the time is spent in rollouts and, in Leela's case, determining policy priors. Lots of idle time for you to unlock the data and give it to another thread :)
Even with many rollouts per expansion, I found that the traditional rollout-based approach was much too time-consuming to search deep into the tree and too unreliable at a shallow depth. You also have to cut off the move cycles and figure out some way to evaluate them (my first instinct was to evaluate them as a draw, but I don't think that accurately reflects how promising the tree below the leaf is). A default policy of rolling out random lines of play makes sense in Go, as the only way to statically evaluate the board is to train a neural net. But I don't think it is desirable in a game like Chess, where an engine can evaluate the board reasonably well at any given time.

MCAB seems like a fascinating algorithm. I've already decided on my approach for my honors thesis, but I'd love to try MCAB someday when I have time.
Yeah, I definitely don't think that rollouts are the way to go for chess. Even if you were able to do a lot of them they're still much too noisy, and much too slow.

Out of curiosity, what is the approach that you plan on taking? And, of course, best of luck with your honors thesis! :D
User avatar
AAce3
Posts: 80
Joined: Fri Jul 29, 2022 1:30 am
Full name: Aaron Li

Re: Neural network evaluation for a Monte Carlo Tree Search engine

Post by AAce3 »

OWL wrote: Sun Jul 31, 2022 10:12 pm Very nice to hear you are looking into MCTS. How are you storing the game tree? In my implementation I store the nodes in a (large) array: Node GameTree[node_limit].
Then each node stores the index of its parent and the index of the first child node and how many children, these children are then stored contiguously so I can easily loop over all child nodes.
A downside is that there is a strict limit on the amount of nodes or unused memory.
I haven't tested it, so do take this with a grain of salt - but instinctively, storing children contiguously doesn't seem like a good idea: you need to find large-enough "gaps" in your array, which might result in wasted memory and slow allocations.

Another idea would be to have each node in the array store the index of its right "sibling," as well as its leftmost child - almost like a linked list. That way you can follow the indices to find all of the children of a node. Most of the time the nodes would be allocated fairly close together, I'd imagine, so there isn't much extra overhead while searching, but it allows for faster allocations and fewer unused slots in the table.

I'd imagine this approach would also work for edges. Again, I haven't tested this, so do take this with a grain of salt.
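The sibling-pointer idea might look something like this (equally untested, and the names `add_child`, `NONE`, etc. are invented for the sketch). New children are pushed onto the front of the sibling chain, so a node can be appended anywhere in the array with no need for a contiguous gap:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr uint32_t NONE = 0xFFFFFFFF;  // sentinel for "no node"

// First-child / right-sibling layout: a node's children form a singly
// linked chain through next_sibling indices.
struct Node {
    uint32_t first_child  = NONE;  // leftmost child
    uint32_t next_sibling = NONE;  // sibling to the right
    uint32_t visits       = 0;
};

std::vector<Node> tree;

// Append a new child of `parent` at the end of the array and link it
// into the front of the parent's sibling chain.
uint32_t add_child(uint32_t parent) {
    uint32_t idx = static_cast<uint32_t>(tree.size());
    tree.push_back(Node{});
    tree[idx].next_sibling = tree[parent].first_child;
    tree[parent].first_child = idx;
    return idx;
}

// Iterate children by chasing sibling links.
int count_children(uint32_t parent) {
    int n = 0;
    for (uint32_t c = tree[parent].first_child; c != NONE; c = tree[c].next_sibling)
        ++n;
    return n;
}
```

The tradeoff versus the contiguous-run layout is one extra index per node and a pointer-chase per child, in exchange for constant-time allocation and no gaps to manage.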