Treasure Trove

Discussion of chess software programming and technical issues.

Moderator: Ras

dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Treasure Trove

Post by dangi12012 »

Sometimes I wonder what the future of chess programming will look like. In my opinion, what is used today is the status quo, which will not be the same in the future (obviously), but it is hard to see beyond the common wisdom of alpha-beta with its long list of improvements + NNUE: https://stackoverflow.com/questions/165 ... 4#16642804
Above is the status quo; below is speculation.

In my testing, the most commonly used pure C++ AB algorithm (no chess logic, single thread, eval is a random number) can do 300 MNps. With templates and some hardcore C++ knowledge that can be improved to 1200 MNps. That would be the upper limit of performance: any line of code added makes it slower, and no amount of extra threads really helps that much.
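
To make the benchmark concrete, here is roughly the kind of logic-free AB loop I am talking about: fixed branching factor, random leaf evaluation, nothing else. The constants, names, and depth are placeholder assumptions for illustration, not the exact code I measured.

```cpp
// Minimal "empty" alpha-beta benchmark: no chess logic, fixed branching
// factor, random leaf evaluation. All constants here are illustrative.
#include <cstdint>
#include <cstdio>
#include <random>

static uint64_t g_nodes = 0;
static std::mt19937 g_rng(42);
static std::uniform_int_distribution<int> g_eval(-1000, 1000);

constexpr int BRANCHING = 32;   // assumed average branching factor

// Fail-hard negamax alpha-beta over a uniform tree.
int alphabeta(int depth, int alpha, int beta) {
    ++g_nodes;
    if (depth == 0)
        return g_eval(g_rng);            // "evaluation" is a random number
    for (int move = 0; move < BRANCHING; ++move) {
        int score = -alphabeta(depth - 1, -beta, -alpha);
        if (score >= beta) return beta;  // cutoff
        if (score > alpha) alpha = score;
    }
    return alpha;
}

int main() {
    alphabeta(5, -32000, 32000);
    std::printf("nodes searched: %llu\n", (unsigned long long)g_nodes);
}
```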

Here is the repo that contains many game agnostic algos for learning: https://github.com/psaikko/mcts-reversi

My testing for MCTS yields no such hard limit. Algorithmic improvements have made my MCTS walker go beyond 8000 MNps on a single thread. (Not chess positions, but random positions of a game with a branching factor of 0-64. UCT1 can be done extremely efficiently.)
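
For reference, the UCT1 selection step is nothing more than the standard UCB1 formula applied per child. A minimal sketch follows; the statistics layout and the exploration constant are my own illustrative choices, not engine code.

```cpp
// UCB1 child selection as used in plain MCTS-UCT.
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

struct Stats {
    double wins   = 0.0;  // accumulated reward, from the parent's point of view
    int    visits = 0;
};

// UCB1: argmax_i  w_i / n_i + c * sqrt(ln(N) / n_i), with N = parent visits.
std::size_t select_child(const std::vector<Stats>& children, int parentVisits,
                         double c = 1.41421356) {
    std::size_t best = 0;
    double bestValue = -std::numeric_limits<double>::infinity();
    for (std::size_t i = 0; i < children.size(); ++i) {
        if (children[i].visits == 0) return i;  // always try unvisited children first
        double exploit = children[i].wins / children[i].visits;
        double explore = c * std::sqrt(std::log((double)parentVisits) / children[i].visits);
        if (exploit + explore > bestValue) {
            bestValue = exploit + explore;
            best = i;
        }
    }
    return best;
}
```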

I think the answer might be https://www.chessprogramming.org/MC%CE%B1%CE%B2
MCTS suffers from not seeing forced lines. But it might be good to have alpha-beta at the leaf nodes, or a lightweight engine that can do noisy playouts at something like 1500 Elo. If a noisy 1500 Elo player can win 1000/1000 times against itself from a position, it is *very probably* a good position.
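
To make the leaf idea concrete on a toy game (so the sketch stays self-contained): instead of a purely random playout, a freshly expanded leaf is scored by a shallow alpha-beta probe and the score is mapped to a playout-like value. The toy game, the probe depth, and the score mapping are all illustrative assumptions.

```cpp
// Sketch of "alpha-beta at the leaf nodes" of MCTS, on a toy game with a
// fixed branching factor and random terminal values. Illustrative only.
#include <algorithm>
#include <random>

static std::mt19937 rng(123);
static std::uniform_int_distribution<int> leafScore(-100, 100);

constexpr int BRANCHING   = 8;
constexpr int PROBE_DEPTH = 3;

// Shallow negamax probe on the toy game.
int probe(int depth, int alpha, int beta) {
    if (depth == 0) return leafScore(rng);
    for (int m = 0; m < BRANCHING; ++m) {
        alpha = std::max(alpha, -probe(depth - 1, -beta, -alpha));
        if (alpha >= beta) break;  // cutoff
    }
    return alpha;
}

// Value backed up when an MCTS leaf is expanded: map the probe score
// into [0, 1] so it can be treated like a playout result.
double evaluate_leaf() {
    int s = probe(PROBE_DEPTH, -1000, 1000);
    return 0.5 + 0.5 * std::clamp(s, -100, 100) / 100.0;
}
```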

All that Zobrist hashing in AB is just the inverse of what you are really doing: evaluating nodes of the tree. Historically, memory constraints forced the search to be depth-first, but I think that is no longer the case; mid-term, the best-first algorithms might win.

I am aware of Lc0 etc. The point I am asking is:
Where do you think the tree-search algorithms are heading? Will it be MCTS with a policy network, or will AB with NNUE extensions stay competitive, and why?
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Treasure Trove

Post by Daniel Shawul »

I think a hybrid is best.
You can use a big NN with MCTS and use an AB search to weed out tactical mistakes.
I actually do that in my engine and get about +30 elo from using an AB helper.
Moreover, the algorithm uses both the CPU and GPU fully, which is not the case if you used only an MCTS search engine.
I've tried several combinations of MCTS+AB, but in the end the simplest method is the one that worked best for me, because it does not significantly hinder the performance of either the AB or the MCTS engine.
smatovic
Posts: 3233
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Treasure Trove

Post by smatovic »

https://eta-chess.app26.de/post/i-see-/
I see three ways that neural networks for chess may take in the future...

1. With more processing power available, the network size will grow, and on one side of the extreme we will have really big nets which drop the search-algorithm part and perform only a depth-1 search for evaluation.

2. With lower-latency neural-network accelerators, we will see, on the other side of the extreme, engines with multiple smaller neural networks which perform deeper AlphaBeta searches.

3. Something in between 1. and 2.

If I had the resources, I would go for a ply-1 NN Behemoth:

https://talkchess.com/forum3/viewtopic.php?f=2&t=74915


--
Srdja
smatovic
Posts: 3233
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Treasure Trove

Post by smatovic »

dangi12012 wrote: Mon Feb 28, 2022 11:45 pm [...]
I think it might be https://www.chessprogramming.org/MC%CE%B1%CE%B2
MCTS suffers from not seeing forced lines. But it might be good to have alpha-beta at the leaf nodes, or a lightweight engine that can do noisy playouts at something like 1500 Elo. If a noisy 1500 Elo player can win 1000/1000 times against itself from a position, it is *very probably* a good position.
[...]
as a side note:

Marco Costalba proposed MCTS for GPU in 2008:

http://www.talkchess.com/forum3/viewtopic.php?t=22732

AFAIK no one has implemented and posted results for a massively parallel MCTS (+UCT) chess-playing engine on a GPU.

There might be three different approaches for this:

- vanilla MCTS-UCT (brute-force, go for speed)
- MCTS-AB (enhance playout phase)
- MCTS-PUCT (enhance selection phase; a rough selection sketch follows below)
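
For reference, the PUCT selection rule of the third approach looks roughly like the sketch below; the edge layout, the source of the prior, and the cpuct constant are illustrative assumptions of mine, not taken from any particular engine.

```cpp
// PUCT selection: like UCT, but the exploration term is weighted by a
// policy prior P(s,a). Edge layout and constants are illustrative.
#include <cmath>
#include <vector>

struct Edge {
    double prior    = 0.0;  // P(s,a), e.g. from a policy network
    double valueSum = 0.0;  // sum of backed-up values for this edge
    int    visits   = 0;
};

// PUCT: argmax_a  Q(s,a) + cpuct * P(s,a) * sqrt(N(s)) / (1 + N(s,a))
int select_puct(const std::vector<Edge>& edges, int parentVisits, double cpuct = 1.5) {
    int    best  = 0;
    double bestV = -1e18;
    double sqrtN = std::sqrt((double)parentVisits);
    for (int i = 0; i < (int)edges.size(); ++i) {
        const Edge& e = edges[i];
        double q = e.visits ? e.valueSum / e.visits : 0.0;  // mean backed-up value
        double u = cpuct * e.prior * sqrtN / (1.0 + e.visits);
        if (q + u > bestV) { bestV = q + u; best = i; }
    }
    return best;
}
```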

--
Srdja
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Treasure Trove

Post by dangi12012 »

smatovic wrote: Thu Mar 03, 2022 6:58 am
dangi12012 wrote: Mon Feb 28, 2022 11:45 pm [...]
I think it might be https://www.chessprogramming.org/MC%CE%B1%CE%B2
MCTS suffers from not seeing forced lines. But it might be good to have alpha-beta at the leaf nodes, or a lightweight engine that can do noisy playouts at something like 1500 Elo. If a noisy 1500 Elo player can win 1000/1000 times against itself from a position, it is *very probably* a good position.
[...]
as a side note:

Marco Costalba proposed MCTS for GPU in 2008:

http://www.talkchess.com/forum3/viewtopic.php?t=22732

AFAIK no one has implemented and posted results for a massively parallel MCTS (+UCT) chess-playing engine on a GPU.

There might be three different approaches for this:

- vanilla MCTS-UCT (brute-force, go for speed)
- MCTS-AB (enhance playout phase)
- MCTS-PUCT (enhance selection phase)

--
Srdja
That was my intuition as well. But the remaining "todo" is what to do at the leaf nodes, because pure random playouts are susceptible to missing forced lines.
The leaf nodes could use a noisy playout algorithm that plays out at a high Elo (2000+),
or a good NN that can see a few plies deeper without surprises (also 2000+ Elo).

I think it is very obvious that the GPU will play a much bigger role in computer chess in the future. A GPU has an order of magnitude better performance than x64 + DDR5, with faster memory, AND with Vulkan it is just as versatile as C++. It is also more scalable, since you can have many GPUs per system, and Vulkan runs in the web while C++ does not natively.

The only thing that I can say for certain is that AB is definitely not suited to massive parallelisation.
IMO MCTS + neural networks on the GPU will probably emerge as the winner in the foreseeable future.

It is already the case that the comparisons are not fair. The metric should be Elo per joule or Elo per 1000 dollars of hardware,
since I am seeing 32-core Threadrippers vs. an RTX 2060 on some comparison sites, which is not fair (SF vs. Lc0).

IMO the fair comparison is either money vs. money or energy vs. energy. But the problem of comparing fairly across different hardware remains.

My point is that AB seems to be algorithmically forced to be as little parallel as possible, and cannot remember previous results (only via the TT).
MCTS, on the other hand, just builds a better and better tree, and when a move is made it simply discards the subtrees that are no longer relevant, so parallelisation inside a system, and even among a cluster of systems, is easily done when memory-mapped IO is available (which it is).
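
Tree reuse between moves is simple to sketch: when a move is actually played, the corresponding child subtree becomes the new search root and everything else is freed. The node and move representation below are placeholder assumptions, not real engine code.

```cpp
// Sketch of MCTS tree reuse when a move is played on the board.
#include <memory>
#include <utility>
#include <vector>

struct Node {
    int    move   = -1;   // move that led to this node
    double wins   = 0.0;
    int    visits = 0;
    std::vector<std::unique_ptr<Node>> children;
};

// Returns the subtree for playedMove as the new root; the rest of the old
// tree (all sibling subtrees) is destroyed when the old root goes away.
std::unique_ptr<Node> advance_root(std::unique_ptr<Node> root, int playedMove) {
    for (auto& child : root->children)
        if (child->move == playedMove)
            return std::move(child);          // keep the accumulated statistics
    return std::make_unique<Node>();          // move was never expanded: start fresh
}
```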
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
smatovic
Posts: 3233
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Treasure Trove

Post by smatovic »

Hmm, I guess at some point you have to start to experiment and produce some real world data.

If we assume tactical forced lines in chess, how much brute force is needed to resolve these, if it can be done at all?

What kind of playout with what kind of eval is needed to enhance plain MCTS-UCT? NPS-evaluation trade-off.

What kind of MCTS-PUCT can pay off? Again NPS-evaluation trade-off.

You wrote before that the real horsepower of GPUs lies in the Tensor Cores, so you may want to use them for a CNN evaluation; then the question is which net size gives the best results in which search framework, etc.

--
Srdja
User avatar
j.t.
Posts: 263
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Treasure Trove

Post by j.t. »

dangi12012 wrote: Thu Mar 03, 2022 5:22 pm My point is that AB seems to be algorithmically forced to be as little parallel as possible, and cannot remember previous results (only via the TT).
MCTS, on the other hand, just builds a better and better tree, and when a move is made it simply discards the subtrees that are no longer relevant, so parallelisation inside a system, and even among a cluster of systems, is easily done when memory-mapped IO is available (which it is).
I am not 100% convinced that MCTS solves the issue of parallelising game-tree search. In parallel alpha-beta the issue is that, beyond some number of available threads, the new threads can only be used to explore lines that wouldn't need searching anyway because of the structure of the tree. Couldn't this also be a limiting factor for any kind of tree search in chess, i.e. also for MCTS?
Daniel Shawul
Posts: 4186
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Treasure Trove

Post by Daniel Shawul »

j.t. wrote: Sun Mar 06, 2022 1:52 pm
dangi12012 wrote: Thu Mar 03, 2022 5:22 pm My point is that AB seems to be algorithmically forced to be as little parallel as possible, and cannot remember previous results (only via the TT).
MCTS, on the other hand, just builds a better and better tree, and when a move is made it simply discards the subtrees that are no longer relevant, so parallelisation inside a system, and even among a cluster of systems, is easily done when memory-mapped IO is available (which it is).
I am not 100% convinced that MCTS solves the issue of parallelising game-tree search. In parallel alpha-beta the issue is that, beyond some number of available threads, the new threads can only be used to explore lines that wouldn't need searching anyway because of the structure of the tree. Couldn't this also be a limiting factor for any kind of tree search in chess, i.e. also for MCTS?
That is correct. In any kind of parallel search you are going to encounter search overhead, even with MCTS; a parallel search is never strictly equivalent to the sequential alpha-beta search. There are ABDADA and Lazy SMP, which are more parallelizable versions of alpha-beta. In the extreme case, one can even use the rollouts version of AB, which is identical to MCTS, i.e. one thread searching exactly one leaf node at a time.

The scalability of MCTS suffers with many threads and a shared tree, which relies on the virtual-loss trick; you will suffer from too many collisions in this case. With the cluster version of MCTS using different random-number seeds, there is simply a lot of search overhead.
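
For those unfamiliar with it, the virtual-loss trick amounts to roughly the following sketch; the node layout and the constant are illustrative only, not code from any engine.

```cpp
// Virtual loss for a shared-tree parallel MCTS: a descending thread
// temporarily books extra, loss-like visits at a node so that other threads
// are steered toward different lines until the real result is backed up.
#include <atomic>

struct Node {
    std::atomic<int> visits{0};
    std::atomic<int> wins{0};        // playout wins from this node's perspective
};

constexpr int VIRTUAL_LOSS = 3;      // assumed tuning constant

// Called on the way down: extra visits with no wins lower this node's
// UCT score for other threads.
void apply_virtual_loss(Node& n) {
    n.visits.fetch_add(VIRTUAL_LOSS, std::memory_order_relaxed);
}

// Called on the way back up: replace the virtual visits with one real
// visit plus the real playout result (1 = win, 0 = loss/draw here).
void backup(Node& n, int win) {
    n.visits.fetch_add(1 - VIRTUAL_LOSS, std::memory_order_relaxed);
    n.wins.fetch_add(win, std::memory_order_relaxed);
}
```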
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Treasure Trove

Post by dangi12012 »

Daniel Shawul wrote: Sun Mar 06, 2022 4:30 pm The scalability of MCTS suffers with many threads and a shared tree, which relies on the virtual-loss trick; you will suffer from too many collisions in this case. With the cluster version of MCTS using different random-number seeds, there is simply a lot of search overhead.
Both systems suffer, but the algorithmic difference remains. AB seems to be capped at 1200 MNps, and more cores do not help on current HW. (They do help fill the hash table, but the algorithm is really forced to be as efficient as possible on the leftmost node of a single thread.)

MCTS has no such limitation: the tree is built until you run out of memory, which is the bigger problem, since you remember every node you visit, and with a naive implementation you are blind to forced lines.

I can think of hybrid systems where you pick the best of both worlds.
It will remain an unsolvable problem, but my gut feeling is this:

Every technology or algorithm follows an S-curve, where the advances top out until a competitor crosses over. AB is very far along its curve, with a list of around 20 improvements that help; roughly every fifth thread in this forum is about finding one more thing that pushes AB a step further.
MCTS has not crossed over AB yet, but many of its fundamental problems are of a different nature, and the rate of improvement there may outpace AB for chess within this decade.

For machine learning, the common wisdom also was that it might not help, or only a little, until someone took the time to implement it. (It was not even the chess people who did the bulk of the work, since that was research for Go.) Machine learning has been a well-understood tool for the last 15 years or so, but was only implemented for chess in the last few.

But that is the question of this thread: what algorithm lies beyond the common wisdom and status quo known today? It might be some obscure algorithm or fact that seems insignificant now, but that would be transformative if implemented for chess.

So please speculate if you have any ideas about what could improve computer chess strength.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: Treasure Trove

Post by Sopel »

This is about as useful as trying to guess the future based on how fast a `for(;;) i+=1;` can go. Your numbers are unrealistic and quickly lose meaning when combined with proper search and eval.
dangi12012 wrote: No one wants to touch anything you have posted. That proves you now have a negative reputation, since everyone already knows you are a forum troll.

Maybe you copied your Stockfish commits from someone else too?
I will look into that.