Alphabeta on top of Engine cluster

dangi12012 · Post by **dangi12012** » Sun Dec 19, 2021 9:02 pm

Oh wow even with the perfect pseudocode!

That was the perfect link and multiple people before said "It cannot be done because...."
And pseudocode would be much to difficult. "Its impossible"

I am laughing now - as usual the doubters were totally and categorically wrong.
http://www.tckerrigan.com/Chess/Paralle ... bdada.html

connor_mcmonigle · Post by **connor_mcmonigle** » Mon Dec 20, 2021 3:49 am

dangi12012 wrote: ↑Sun Dec 19, 2021 9:02 pm Oh wow even with the perfect pseudocode!

That was the perfect link and multiple people before said "It cannot be done because...."
And pseudocode would be much to difficult. "Its impossible"

I am laughing now - as usual the doubters were totally and categorically wrong.
http://www.tckerrigan.com/Chess/Paralle ... bdada.html

As others have said, minimax with alpha-beta pruning is inherently sequential (this is trivial to see if you understand how alpha beta pruning works (hint: any move can yield a beta-cutoff)). However, this doesn't mean it's impossible to parallelize search. At one (stupid) extreme you could just perform a minimax search sans alpha-beta pruning which would be simple to parallelize, though obviously yield awful results. If you're willing to occasionally waste time searching useless branches in parallel, then of course you can parallelize the search pretty easily using something like ABDADA, DTS or YBWC, but there's no way to get around this wastefulness so you'll necessarily see diminishing returns with greater concurrency.

dangi12012 · Post by **dangi12012** » Mon Dec 20, 2021 1:17 pm

connor_mcmonigle wrote: ↑Mon Dec 20, 2021 3:49 am
dangi12012 wrote: ↑Sun Dec 19, 2021 9:02 pm Oh wow even with the perfect pseudocode!

That was the perfect link and multiple people before said "It cannot be done because...."
And pseudocode would be much to difficult. "Its impossible"

I am laughing now - as usual the doubters were totally and categorically wrong.
http://www.tckerrigan.com/Chess/Paralle ... bdada.html
As others have said, minimax with alpha-beta pruning is inherently sequential (this is trivial to see if you understand how alpha beta pruning works (hint: any move can yield a beta-cutoff)). However, this doesn't mean it's impossible to parallelize search. At one (stupid) extreme you could just perform a minimax search sans alpha-beta pruning which would be simple to parallelize, though obviously yield awful results. If you're willing to occasionally waste time searching useless branches in parallel, then of course you can parallelize the search pretty easily using something like ABDADA, DTS or YBWC, but there's no way to get around this wastefulness so you'll necessarily see diminishing returns with greater concurrency.

Yes everyone agrees on that including me. I am happy with the provided information.
The thing why parallelism is garuanteed to work is that with imperfect move ordering you are garuanteed to have another thread find a better solution. Because move 4 is best and not move 0 for a position down the tree - but you find this faster if you do a parallel AB search.

I see now why a policy networks are state of the art - since these generate the herustics for ordering in a "probability of best move" order.

connor_mcmonigle · Post by **connor_mcmonigle** » Mon Dec 20, 2021 5:55 pm

dangi12012 wrote: ↑Mon Dec 20, 2021 1:17 pm ...

Yes everyone agrees on that including me. I am happy with the provided information.
The thing why parallelism is garuanteed to work is that with imperfect move ordering you are garuanteed to have another thread find a better solution. Because move 4 is best and not move 0 for a position down the tree - but you find this faster if you do a parallel AB search.

I see now why a policy networks are state of the art - since these generate the herustics for ordering in a "probability of best move" order.

Actually, no top minimax+alpha-beta pruning engines rely on policy networks, though there is definitely potential there imho. Existing move ordering heuristics work remarkably well - better than you're expecting most likely. In fact, many top engines have average branching factors less than 2! It should be evident that it is, consequently, quite rare that move 4 is actually the best move (according to the engine's search) in a modern chess engine. This might be surprising as, even with alpha-beta pruning and optimal move ordering, when proving a fail low, all moves still have to be searched. However, modern engines employ numerous heuristics such as late move reductions, late move pruning, futility pruning, SEE pruning etc. which insure the ABF remains low even when considering fail lows.

Therefore, LazySMP (SHT) almost always yields better Elo scaling in modern engines as compared to ABDADA/YWBC.

dangi12012 · Post by **dangi12012** » Mon Dec 20, 2021 9:23 pm

connor_mcmonigle wrote: ↑Mon Dec 20, 2021 5:55 pm
dangi12012 wrote: ↑Mon Dec 20, 2021 1:17 pm ...

Yes everyone agrees on that including me. I am happy with the provided information.
The thing why parallelism is garuanteed to work is that with imperfect move ordering you are garuanteed to have another thread find a better solution. Because move 4 is best and not move 0 for a position down the tree - but you find this faster if you do a parallel AB search.

I see now why a policy networks are state of the art - since these generate the herustics for ordering in a "probability of best move" order.
Actually, no top minimax+alpha-beta pruning engines rely on policy networks, though there is definitely potential there imho. Existing move ordering heuristics work remarkably well - better than you're expecting most likely. In fact, many top engines have average branching factors less than 2! It should be evident that it is, consequently, quite rare that move 4 is actually the best move (according to the engine's search) in a modern chess engine. This might be surprising as, even with alpha-beta pruning and optimal move ordering, when proving a fail low, all moves still have to be searched. However, modern engines employ numerous heuristics such as late move reductions, late move pruning, futility pruning, SEE pruning etc. which insure the ABF remains low even when considering fail lows.

Therefore, LazySMP (SHT) almost always yields better Elo scaling in modern engines as compared to ABDADA/YWBC.

Insightful comment! Isnt LazySMP heavily reliant on a shared hashtable? So the only way to do that would be to either allow a uci command to point uci engines to an existing memory mapped memory region (that could still be over the network since the OS abstracts these details away).
So LazySMP wouldnt be possible with a "black box" approach or only after modifying sourcecode?

I was looking for a distributed black box algorithm that can work with finished compiled engines and builds the movetree with evaluation itself (in parallel). So that contributing engines may even be hosted in the cloud.

connor_mcmonigle · Post by **connor_mcmonigle** » Mon Dec 20, 2021 9:54 pm

dangi12012 wrote: ↑Mon Dec 20, 2021 9:23 pm ...

Insightful comment! Isnt LazySMP heavily reliant on a shared hashtable? So the only way to do that would be to either allow a uci command to point uci engines to an existing memory mapped memory region (that could still be over the network since the OS abstracts these details away).
So LazySMP wouldnt be possible with a "black box" approach or only after modifying sourcecode?

I was looking for a distributed black box algorithm that can work with finished compiled engines and builds the movetree with evaluation itself (in parallel). So that contributing engines may even be hosted in the cloud.

Correct, LazySMP doesn't really work in the context of your original premise. There's a (now outdated afaik) cluster fork of Stockfish which used LazySMP for parallelism, but it obviously made numerous modifications to the source. As it isn't possible to parallelize a minimax with alpha-beta pruning search, better results could probably be obtained with something like a MCTS/UCT search which enables batching of calls to the evaluation function (where the "evaluation function" could be the result of a UCI engine performing a depth N search on the position). This has already been done successfully here: https://www.chessdb.cn/queryc_en/.

R. Tomasi · Post by **R. Tomasi** » Tue Dec 21, 2021 12:12 pm

I somehow think that such a hashtable that is shared over the network will kill any advantages of lazy SMP because of network latency. My understanding is that lazy SMP works so well because you get TT hits from positions that have been explored by other threads (not only for eval, but mor importantly for move ordering). My gut feeling (yes, pure speculation) is that at least the move ordering via TT has so high latency that it voids any benefits of lazy SMP.

Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster

Re: Alphabeta on top of Engine cluster