mcts question

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob

Forum rules
This textbox is used to restore diagrams posted with the [d] tag before the upgrade.
Post Reply
jackd
Posts: 25
Joined: Mon Dec 10, 2018 1:45 pm
Full name: jack d.

mcts question

Post by jackd » Sat Jan 05, 2019 3:13 am

I recently tried to add mcts to my program. My eval sucks, so the alpha-beta searches I do during expansion suck, so I don't know if my mcts algorithm works or not. Anyway here are two subtleties I need to deal with:

A: After tree policy, I am at a leaf node (for my tree in memory) and I want to expand. I figured a good idea would be expanding all children in parallel, but when you expand more than one child visit counts become an inaccurate way to judge the exploration of a given node because leaves have different number of moves. My solution is to keep track of subtree size rather than visits.

B: Normalizing cp. isn't very straightforward. My idea is to just divide by 1000.

matthewlai
Posts: 789
Joined: Sun Aug 03, 2014 2:48 am
Location: London, UK
Contact:

Re: mcts question

Post by matthewlai » Sat Jan 05, 2019 7:30 am

jackd wrote:
Sat Jan 05, 2019 3:13 am
A: After tree policy, I am at a leaf node (for my tree in memory) and I want to expand. I figured a good idea would be expanding all children in parallel, but when you expand more than one child visit counts become an inaccurate way to judge the exploration of a given node because leaves have different number of moves. My solution is to keep track of subtree size rather than visits.
In classical MCTS, all children are expanded (allocated) when the leaf is first visited. The parent (former leaf) node still has a visit count of 1 (the new nodes have visit count 0). Note that expanding does not "visit" the children. They start with visit count 0.
B: Normalizing cp. isn't very straightforward. My idea is to just divide by 1000.
Classic way is to divide by (eg.) 1000 and apply tanh (hyperbolic tangent), so very high and very low values are "squished".
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.

jackd
Posts: 25
Joined: Mon Dec 10, 2018 1:45 pm
Full name: jack d.

Re: mcts question

Post by jackd » Sat Jan 05, 2019 11:47 am

matthewlai wrote:
Sat Jan 05, 2019 7:30 am
jackd wrote:
Sat Jan 05, 2019 3:13 am
A: After tree policy, I am at a leaf node (for my tree in memory) and I want to expand. I figured a good idea would be expanding all children in parallel, but when you expand more than one child visit counts become an inaccurate way to judge the exploration of a given node because leaves have different number of moves. My solution is to keep track of subtree size rather than visits.
In classical MCTS, all children are expanded (allocated) when the leaf is first visited. The parent (former leaf) node still has a visit count of 1 (the new nodes have visit count 0). Note that expanding does not "visit" the children. They start with visit count 0.
B: Normalizing cp. isn't very straightforward. My idea is to just divide by 1000.
Classic way is to divide by (eg.) 1000 and apply tanh (hyperbolic tangent), so very high and very low values are "squished".
That clarifies a lot. Thanks.

Post Reply