MCTS: A leaf node is selected for evaluation. Then this result gets backtracked towards the tree root. All nodes can be updated atomically without locks.
Now when there are multiple threads working on the tree, and they are picking nodes around the same time - they will all end up at the same node, since nothing else is updated yet. 
This seems like a complete waste of resources - ideally MCTS should scale to 100s of threads without any inter-threat communication but still picking different nodes. 
If two sibling nodes are close enough should randomness decide which path to take towards the MCTS leaf?
Any devs here that thought about this already?
			
			
									
						
							Lockfree parallel MCTS - how?
Moderator: Ras
- 
				dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Lockfree parallel MCTS - how?
Worlds-fastest-Bitboard-Chess-Movegenerator 
Daniel Inführ - Software Developer
			
						Daniel Inführ - Software Developer
- 
				smatovic
- Posts: 3358
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Lockfree parallel MCTS - how?
If you use MCTS-UCT with UCB1 this will normalize after a certain amount of playouts, and you usually do not need locks for this, dunno about MCTS-PUCT in Lc0/A0.
--
Srdja
			
			
									
						
										
						--
Srdja
- 
				smatovic
- Posts: 3358
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Lockfree parallel MCTS - how?
https://www.chessprogramming.org/Monte- ... ree_SearchMCTS consists of four strategic phases, repeated as long as there is time left [13] :
In the Selection phase the tree is traversed from the root node until it selects a leaf node that is not added to the tree yet
The Expansion strategy adds the leaf node to the tree
The Simulation strategy plays moves in self-play until the end of the game. The result is either 1, 0 ,-1
The Backpropagation strategy propagates the results through the tree
If you mean the expansion phase of a leaf node, to add the node to the game tree in memory, then I guess you really need locks for this. In my parallel BestFirst implementation I simply restart the selection phase if a thread encounters a lock for expansion.
In one of my MCTS-UCT implementations I did limit the expansion phase, can not recall what the papers say about this, let's say you perform a 1000 MC playouts on a leaf, and then the node gets expanded and added to the tree, or alike, to limit the memory usage.
--
Srdja
- 
				dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Lockfree parallel MCTS - how?
I mean MCTS with many threads when a node is getting expanded. If other threads select a node via ucb1 around the same time they will all end up at the same node since for each thread no tree value changed in the meantime.smatovic wrote: ↑Sat Oct 08, 2022 9:46 am If you mean the expansion phase of a leaf node, to add the node to the game tree in memory, then I guess you really need locks for this. In my parallel BestFirst implementation I simply restart the selection phase if a thread encounters a lock for expansion.
If the implementation is like yours - 1 thread will expand and 63 threads will reach the same node then (via ucb1) and restart until this is done?
Worlds-fastest-Bitboard-Chess-Movegenerator 
Daniel Inführ - Software Developer
			
						Daniel Inführ - Software Developer
- 
				smatovic
- Posts: 3358
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Lockfree parallel MCTS - how?
You can propagate in this case the lock flag back in the tree, so other threads do not select this path, but if you increase the visit counters for UCB1 the path to select will also change over time.dangi12012 wrote: ↑Sat Oct 08, 2022 12:16 pmI mean MCTS with many threads when a node is getting expanded. If other threads select a node via ucb1 around the same time they will all end up at the same node since for each thread no tree value changed in the meantime.smatovic wrote: ↑Sat Oct 08, 2022 9:46 am If you mean the expansion phase of a leaf node, to add the node to the game tree in memory, then I guess you really need locks for this. In my parallel BestFirst implementation I simply restart the selection phase if a thread encounters a lock for expansion.
Yes, they may reach the same node and then restart the selection phase. There are several ways to enhance UCB1 with more parameters, as you suggested, randomness is one of those.dangi12012 wrote: ↑Sat Oct 08, 2022 12:16 pm If the implementation is like yours - 1 thread will expand and 63 threads will reach the same node then (via ucb1) and restart until this is done?
--
Srdja
- 
				dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Lockfree parallel MCTS - how?
Ah I see - then I see a solution now. 
Each leaf node has a std::atomic<bool> which gets set on entering the expansion or eval phase. This is not a lock but more like a "do not enter" sign.
When a thread encounters a leaf and the bool is false - it will just try the next sibling or backtrack to the parent recursively to get the next node according to ucb1.
This way each thread can do meaningful work and is not reset to root, neither are nodes visited unnecessarily nor is ucb1 calculated too often.
			
			
									
						
							Each leaf node has a std::atomic<bool> which gets set on entering the expansion or eval phase. This is not a lock but more like a "do not enter" sign.
When a thread encounters a leaf and the bool is false - it will just try the next sibling or backtrack to the parent recursively to get the next node according to ucb1.
This way each thread can do meaningful work and is not reset to root, neither are nodes visited unnecessarily nor is ucb1 calculated too often.
Worlds-fastest-Bitboard-Chess-Movegenerator 
Daniel Inführ - Software Developer
			
						Daniel Inführ - Software Developer
- 
				smatovic
- Posts: 3358
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Lockfree parallel MCTS - how?
If you propagate the lock flag (IIRC I used a int with thread id+1) back in the tree you can update the score tree with the locked node excluded, I did this with BestFirst with game tree in memory, dunno if this is feasible with UCB1, to select simply the next sibling might be quicker, but from the score tree point of view the wrong decision, if the node was expaned and added to the tree, you can free the lock and reupdate the score tree again.
--
Srdja
			
			
									
						
										
						--
Srdja
- 
				smatovic
- Posts: 3358
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Lockfree parallel MCTS - how?
Okay, maybe to summarize what I did with BestFirst and MCTS-UCT with game tree in memory in different implementations in case of an locked expansion node:
- you can modify UCB1 with more parameters, randomness, to vary selection
- you can just increment UCB1 visit counters and restart selection phase if a thread encounters a lock for expansion
- you can back-propagate the lock through the tree so other threads select a different path
- you can update the score tree back in the tree with locked node excluded so other threads select a different path
I am not aware of any lock-less method in case of an expansion node, and idk how Lc0/A0 handles this with MCTS-PUCT.
--
Srdja
			
			
									
						
										
						- you can modify UCB1 with more parameters, randomness, to vary selection
- you can just increment UCB1 visit counters and restart selection phase if a thread encounters a lock for expansion
- you can back-propagate the lock through the tree so other threads select a different path
- you can update the score tree back in the tree with locked node excluded so other threads select a different path
I am not aware of any lock-less method in case of an expansion node, and idk how Lc0/A0 handles this with MCTS-PUCT.
--
Srdja
- 
				AAce3  
- Posts: 80
- Joined: Fri Jul 29, 2022 1:30 am
- Full name: Aaron Li
Re: Lockfree parallel MCTS - how?
I don't know if my interpretation of the code is correct, however, I believe that Lc0 uses locks. The main thing is that the neural network evaluations take so much time that most of the time, when expanding a node, you're going to be sending stuff to GPU, waiting for it to return something, rather than traversing the tree, which takes a short time comparatively.
Feel free to correct me if I'm wrong though! I don't use C++ so my guess may be completely wrong.
- 
				Daniel Shawul
- Posts: 4186
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Lockfree parallel MCTS - how?
Scorpio mcts is lockless. I use atomic operations and try_lock in places where you usually need locks.
This is important especially because i use upto 640 threads for mcts search, while leela typically launches 2 or 4 threads.
			
			
									
						
										
						This is important especially because i use upto 640 threads for mcts search, while leela typically launches 2 or 4 threads.