Oh wow even with the perfect pseudocode!
That was the perfect link and multiple people before said "It cannot be done because...."
And pseudocode would be much to difficult. "Its impossible"
I am laughing now - as usual the doubters were totally and categorically wrong.
http://www.tckerrigan.com/Chess/Paralle ... bdada.html
Alphabeta on top of Engine cluster
Moderator: Ras
-
dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Alphabeta on top of Engine cluster
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Daniel Inführ - Software Developer
-
connor_mcmonigle
- Posts: 544
- Joined: Sun Sep 06, 2020 4:40 am
- Full name: Connor McMonigle
Re: Alphabeta on top of Engine cluster
As others have said, minimax with alpha-beta pruning is inherently sequential (this is trivial to see if you understand how alpha beta pruning works (hint: any move can yield a beta-cutoff)). However, this doesn't mean it's impossible to parallelize search. At one (stupid) extreme you could just perform a minimax search sans alpha-beta pruning which would be simple to parallelize, though obviously yield awful results. If you're willing to occasionally waste time searching useless branches in parallel, then of course you can parallelize the search pretty easily using something like ABDADA, DTS or YBWC, but there's no way to get around this wastefulness so you'll necessarily see diminishing returns with greater concurrency.dangi12012 wrote: ↑Sun Dec 19, 2021 9:02 pm Oh wow even with the perfect pseudocode!
That was the perfect link and multiple people before said "It cannot be done because...."
And pseudocode would be much to difficult. "Its impossible"
I am laughing now - as usual the doubters were totally and categorically wrong.
http://www.tckerrigan.com/Chess/Paralle ... bdada.html
-
dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Alphabeta on top of Engine cluster
Yes everyone agrees on that including me. I am happy with the provided information.connor_mcmonigle wrote: ↑Mon Dec 20, 2021 3:49 amAs others have said, minimax with alpha-beta pruning is inherently sequential (this is trivial to see if you understand how alpha beta pruning works (hint: any move can yield a beta-cutoff)). However, this doesn't mean it's impossible to parallelize search. At one (stupid) extreme you could just perform a minimax search sans alpha-beta pruning which would be simple to parallelize, though obviously yield awful results. If you're willing to occasionally waste time searching useless branches in parallel, then of course you can parallelize the search pretty easily using something like ABDADA, DTS or YBWC, but there's no way to get around this wastefulness so you'll necessarily see diminishing returns with greater concurrency.dangi12012 wrote: ↑Sun Dec 19, 2021 9:02 pm Oh wow even with the perfect pseudocode!
That was the perfect link and multiple people before said "It cannot be done because...."
And pseudocode would be much to difficult. "Its impossible"
I am laughing now - as usual the doubters were totally and categorically wrong.
http://www.tckerrigan.com/Chess/Paralle ... bdada.html
The thing why parallelism is garuanteed to work is that with imperfect move ordering you are garuanteed to have another thread find a better solution. Because move 4 is best and not move 0 for a position down the tree - but you find this faster if you do a parallel AB search.
I see now why a policy networks are state of the art - since these generate the herustics for ordering in a "probability of best move" order.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Daniel Inführ - Software Developer
-
connor_mcmonigle
- Posts: 544
- Joined: Sun Sep 06, 2020 4:40 am
- Full name: Connor McMonigle
Re: Alphabeta on top of Engine cluster
Actually, no top minimax+alpha-beta pruning engines rely on policy networks, though there is definitely potential there imho. Existing move ordering heuristics work remarkably well - better than you're expecting most likely. In fact, many top engines have average branching factors less than 2! It should be evident that it is, consequently, quite rare that move 4 is actually the best move (according to the engine's search) in a modern chess engine. This might be surprising as, even with alpha-beta pruning and optimal move ordering, when proving a fail low, all moves still have to be searched. However, modern engines employ numerous heuristics such as late move reductions, late move pruning, futility pruning, SEE pruning etc. which insure the ABF remains low even when considering fail lows.dangi12012 wrote: ↑Mon Dec 20, 2021 1:17 pm ...
Yes everyone agrees on that including me. I am happy with the provided information.
The thing why parallelism is garuanteed to work is that with imperfect move ordering you are garuanteed to have another thread find a better solution. Because move 4 is best and not move 0 for a position down the tree - but you find this faster if you do a parallel AB search.
I see now why a policy networks are state of the art - since these generate the herustics for ordering in a "probability of best move" order.
Therefore, LazySMP (SHT) almost always yields better Elo scaling in modern engines as compared to ABDADA/YWBC.
-
dangi12012
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Alphabeta on top of Engine cluster
Insightful comment! Isnt LazySMP heavily reliant on a shared hashtable? So the only way to do that would be to either allow a uci command to point uci engines to an existing memory mapped memory region (that could still be over the network since the OS abstracts these details away).connor_mcmonigle wrote: ↑Mon Dec 20, 2021 5:55 pmActually, no top minimax+alpha-beta pruning engines rely on policy networks, though there is definitely potential there imho. Existing move ordering heuristics work remarkably well - better than you're expecting most likely. In fact, many top engines have average branching factors less than 2! It should be evident that it is, consequently, quite rare that move 4 is actually the best move (according to the engine's search) in a modern chess engine. This might be surprising as, even with alpha-beta pruning and optimal move ordering, when proving a fail low, all moves still have to be searched. However, modern engines employ numerous heuristics such as late move reductions, late move pruning, futility pruning, SEE pruning etc. which insure the ABF remains low even when considering fail lows.dangi12012 wrote: ↑Mon Dec 20, 2021 1:17 pm ...
Yes everyone agrees on that including me. I am happy with the provided information.
The thing why parallelism is garuanteed to work is that with imperfect move ordering you are garuanteed to have another thread find a better solution. Because move 4 is best and not move 0 for a position down the tree - but you find this faster if you do a parallel AB search.
I see now why a policy networks are state of the art - since these generate the herustics for ordering in a "probability of best move" order.
Therefore, LazySMP (SHT) almost always yields better Elo scaling in modern engines as compared to ABDADA/YWBC.
So LazySMP wouldnt be possible with a "black box" approach or only after modifying sourcecode?
I was looking for a distributed black box algorithm that can work with finished compiled engines and builds the movetree with evaluation itself (in parallel). So that contributing engines may even be hosted in the cloud.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Daniel Inführ - Software Developer
-
connor_mcmonigle
- Posts: 544
- Joined: Sun Sep 06, 2020 4:40 am
- Full name: Connor McMonigle
Re: Alphabeta on top of Engine cluster
Correct, LazySMP doesn't really work in the context of your original premise. There's a (now outdated afaik) cluster fork of Stockfish which used LazySMP for parallelism, but it obviously made numerous modifications to the source. As it isn't possible to parallelize a minimax with alpha-beta pruning search, better results could probably be obtained with something like a MCTS/UCT search which enables batching of calls to the evaluation function (where the "evaluation function" could be the result of a UCI engine performing a depth N search on the position). This has already been done successfully here: https://www.chessdb.cn/queryc_en/.dangi12012 wrote: ↑Mon Dec 20, 2021 9:23 pm ...
Insightful comment! Isnt LazySMP heavily reliant on a shared hashtable? So the only way to do that would be to either allow a uci command to point uci engines to an existing memory mapped memory region (that could still be over the network since the OS abstracts these details away).
So LazySMP wouldnt be possible with a "black box" approach or only after modifying sourcecode?
I was looking for a distributed black box algorithm that can work with finished compiled engines and builds the movetree with evaluation itself (in parallel). So that contributing engines may even be hosted in the cloud.
-
R. Tomasi
- Posts: 307
- Joined: Wed Sep 01, 2021 4:08 pm
- Location: Germany
- Full name: Roland Tomasi
Re: Alphabeta on top of Engine cluster
I somehow think that such a hashtable that is shared over the network will kill any advantages of lazy SMP because of network latency. My understanding is that lazy SMP works so well because you get TT hits from positions that have been explored by other threads (not only for eval, but mor importantly for move ordering). My gut feeling (yes, pure speculation) is that at least the move ordering via TT has so high latency that it voids any benefits of lazy SMP.