Parallelization questions, ABDADA or DTS?
Posted: Fri Mar 23, 2012 4:45 pm
Hello,
I'm new to this forum.
I'm working on a chess engine that currently supports the UCI and XBoard/WinBoard protocols, and I now want to implement a parallel search.
I've searched this forum and the wiki many times already, but I couldn't find a definitive answer to my question.
My engine currently has two separate, working search implementations (each with IID, QS, extensions, reductions, LMR, and so on): one is recursive and the other is iterative with a manually managed stack.
It also already has a thread-safe transposition table: on 64-bit x86 it uses 128-bit atomic operations based on cmpxchg16b, and on other targets it uses a per-slot multiple-reader/single-writer lock.
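To illustrate the fallback path, here is a minimal sketch of a per-slot reader/writer lock. It is not the code from my engine: the names `Slot` and `TranspositionTable`, the two 64-bit payload fields, and the always-replace policy are assumptions made only for this example, and the lock is a simple spin lock built on `std::atomic`.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical slot layout: one lock word plus a 128-bit payload.
struct Slot {
    std::atomic<int> state{0};  // 0 = free, >0 = number of readers, -1 = one writer
    bool used = false;
    std::uint64_t key = 0;
    std::uint64_t data = 0;
};

class TranspositionTable {
public:
    explicit TranspositionTable(std::size_t slots) : slots_(slots) {}

    // Single writer: wait until nobody holds the slot, then overwrite it.
    void store(std::uint64_t key, std::uint64_t data) {
        Slot& s = slots_[key % slots_.size()];
        int expected = 0;
        while (!s.state.compare_exchange_weak(expected, -1, std::memory_order_acquire))
            expected = 0;  // a failed CAS overwrites `expected`, so reset it
        s.used = true;
        s.key = key;
        s.data = data;
        s.state.store(0, std::memory_order_release);
    }

    // Multiple readers: bump the reader count unless a writer holds the slot.
    bool probe(std::uint64_t key, std::uint64_t& data) {
        Slot& s = slots_[key % slots_.size()];
        int cur = s.state.load(std::memory_order_relaxed);
        for (;;) {
            if (cur < 0) {  // writer active, spin
                cur = s.state.load(std::memory_order_relaxed);
                continue;
            }
            if (s.state.compare_exchange_weak(cur, cur + 1, std::memory_order_acquire))
                break;
        }
        const bool hit = s.used && s.key == key;
        if (hit) data = s.data;
        s.state.fetch_sub(1, std::memory_order_release);
        return hit;
    }

private:
    std::vector<Slot> slots_;
};
```

A lock word per slot keeps contention local to one entry. The cmpxchg16b path avoids the lock entirely by swapping the whole 128-bit entry in one atomic operation.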
The board representation is bitboard-based and uses the kindergarten bitboards idea for semi-staged move generation.
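For readers who don't know the kindergarten approach, the rank-attack lookup works roughly as sketched below. This is an illustration rather than my engine's code: the function names and table layout are made up for the example, squares are assumed to be numbered a1 = 0 to h8 = 63, and only rank attacks are shown (files and diagonals use the same multiply-and-shift trick with different masks).

```cpp
#include <array>
#include <cstdint>

using Bitboard = std::uint64_t;
using FillUpTable = std::array<std::array<Bitboard, 64>, 8>;

constexpr Bitboard FILE_A = 0x0101010101010101ULL;
constexpr Bitboard FILE_B = 0x0202020202020202ULL;

// table[file][inner]: attack set of a slider on `file` of a single rank,
// for the 6 inner occupancy bits (files b..g), copied to all eight ranks.
FillUpTable buildFillUpAttacks() {
    FillUpTable table{};
    for (int file = 0; file < 8; ++file) {
        for (int inner = 0; inner < 64; ++inner) {
            const unsigned occ = static_cast<unsigned>(inner) << 1;
            unsigned attacks = 0;
            for (int f = file + 1; f < 8; ++f) {   // walk towards the h-file
                attacks |= 1u << f;
                if (occ & (1u << f)) break;        // the blocker square is included
            }
            for (int f = file - 1; f >= 0; --f) {  // walk towards the a-file
                attacks |= 1u << f;
                if (occ & (1u << f)) break;
            }
            table[file][inner] = attacks * FILE_A;
        }
    }
    return table;
}

// Multiplying the masked rank by the b-file moves its inner six
// occupancy bits into the top six bits of the product.
Bitboard rankAttacks(const FillUpTable& table, int square, Bitboard occupied) {
    const int file = square & 7;
    const Bitboard rankMask = 0xFFULL << (square & 56);
    const Bitboard index = ((occupied & rankMask) * FILE_B) >> 58;
    return table[file][index] & rankMask;
}
```

The table has only 8 x 64 entries because the two edge files never change the attack set, so they are left out of the index.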
The data structures are already split into global and per-thread structures (each thread has its own copy of almost the complete current board and search state, except the TT, the iterative state-machine stack, etc.).
My question now is: which parallel algorithm should I implement, ABDADA or DTS? (As far as I have understood from the CPW wiki and this forum, DTS is superior to YBWC.)
ABDADA seems too simple to me to scale as well as DTS. Am I right? On the other hand, ABDADA appears to be almost fully asynchronous, without any explicit synchronization points or split points, while DTS is not.
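To make clear what I mean by "no explicit split points", here is a rough sketch of the ABDADA idea as I understand it. Every thread would run the same search from the root, and coordination happens only through a per-node counter in the shared table. Everything here is a toy assumption for illustration: the game tree is synthetic (`BRANCHING`, `leafValue`), the table is a map behind one global mutex, and deferred moves are kept in an explicit list instead of re-walking all successors in the second pass.

```cpp
#include <algorithm>
#include <cstdint>
#include <mutex>
#include <unordered_map>
#include <vector>

constexpr int INF = 1000000;
constexpr int ON_EVALUATION = INF + 1;  // sentinel: another thread is searching this node
constexpr int BRANCHING = 3;            // toy tree: child i of node n is n * 3 + i + 1

enum Bound { NONE, EXACT, LOWER, UPPER };
struct Entry { int nproc = 0; int depth = -1; int value = 0; Bound bound = NONE; };
struct Table {                          // one global lock keeps the sketch short
    std::mutex lock;
    std::unordered_map<std::uint64_t, Entry> map;
};

int leafValue(std::uint64_t id) {       // deterministic pseudo-random score in [-100, 100]
    id ^= id >> 33; id *= 0xff51afd7ed558ccdULL; id ^= id >> 33;
    return static_cast<int>(id % 201) - 100;
}

// Plain sequential alpha-beta, kept only as a reference for the result.
int negamax(std::uint64_t id, int depth, int alpha, int beta) {
    if (depth == 0) return leafValue(id);
    int best = -INF;
    for (int i = 0; i < BRANCHING && alpha < beta; ++i) {
        best = std::max(best, -negamax(id * BRANCHING + i + 1, depth - 1, -beta, -alpha));
        alpha = std::max(alpha, best);
    }
    return best;
}

// Returns true if the caller must return `score` at once (table cutoff or node busy);
// otherwise registers the caller as one more searcher of this node.
bool probe(Table& t, std::uint64_t id, int depth, int alpha, int beta,
           bool exclusive, int& score) {
    std::lock_guard<std::mutex> guard(t.lock);
    Entry& e = t.map[id];
    if (e.bound != NONE && e.depth >= depth &&
        (e.bound == EXACT || (e.bound == LOWER && e.value >= beta) ||
         (e.bound == UPPER && e.value <= alpha))) {
        score = e.value;
        return true;
    }
    if (exclusive && e.nproc > 0) { score = ON_EVALUATION; return true; }
    ++e.nproc;
    return false;
}

void store(Table& t, std::uint64_t id, int depth, int alpha, int beta, int best) {
    std::lock_guard<std::mutex> guard(t.lock);
    Entry& e = t.map[id];
    --e.nproc;                          // this searcher has left the node
    e.depth = depth;
    e.value = best;
    e.bound = best >= beta ? LOWER : (best <= alpha ? UPPER : EXACT);
}

int abdada(Table& t, std::uint64_t id, int depth, int alpha, int beta, bool exclusive) {
    if (depth == 0) return leafValue(id);
    int score = 0;
    if (probe(t, id, depth, alpha, beta, exclusive, score)) return score;
    int best = -INF, a = alpha;
    std::vector<int> deferred;          // moves another thread was busy with in pass 0
    for (int pass = 0; pass < 2 && a < beta; ++pass) {
        const int count = pass == 0 ? BRANCHING : static_cast<int>(deferred.size());
        for (int k = 0; k < count && a < beta; ++k) {
            const int move = pass == 0 ? k : deferred[k];
            const bool excl = pass == 0 && k > 0;  // the eldest brother is never deferred
            const int v = abdada(t, id * BRANCHING + move + 1, depth - 1, -beta, -a, excl);
            if (v == ON_EVALUATION) { deferred.push_back(move); continue; }
            best = std::max(best, -v);
            a = std::max(a, best);
        }
    }
    store(t, id, depth, alpha, beta, best);
    return best;
}
```

In a real engine each worker thread would call `abdada` on the root position with the same shared table. The threads spread over different subtrees only because busy nodes are deferred, which is why there is no explicit split-point bookkeeping.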
My code is already prepared for parallelization, so I only have to choose between the two algorithms before I can implement the main parallel search code. For that, I would like to know which of the two algorithms would be the better choice for good, future-proof scalability.
After that work, I'll extend the evaluation function, which is still very weak and mostly material + PST only, to a reasonably stronger level.
I hope to get some good answers here.
Regards,
Benjamin 'BeRo' Rosseaux
PS: Sorry for my possibly bad English. I read it more than I write it.