zullil wrote: Perhaps you are right. Maybe if I only consider non-bitboard engines then my move generator might be just slow, instead of very slow. Haven't checked any perft times for non-bitboard engines, so I have no sense of what's "normal" for them.
Bitboard is usually a lot slower for perft. Especially with the perft flavor that makes/unmakes the final ply, rather than just counting generated moves.
Last time I checked, frc-perft was the fastest perft around.
Indeed, that is the one I meant when pointing out how bitboards could be made competitive. It works by skipping move generation in the horizon nodes. With bitboards you can do that, because move generation is a two-step process: first you generate bitboards with to-squares (from which you can easily select sub-groups of moves, like captures or checks), and then you extract the moves to make a move list.
Frcperft has compiler switches to choose between PERFT_BULK mode (which is what qperft does: generating the move list and then taking its size) and PERFT_FAST mode. The latter (which seems to be the default) skips generating the move list and applies POPCNT directly to the bitboards. There is nothing analogous in mailbox. But by using this trick it ceases to be a measurement of the speed of move generation.
On my machine it is still slower than qperft, of course.
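A rough sketch of the difference between the two counting styles, for a single knight. All names and the move encoding here are made up for illustration and are not taken from frcperft or qperft; `__builtin_ctzll` and `__builtin_popcountll` are GCC/Clang builtins, so this assumes one of those compilers.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;   /* bit 0 = a1, bit 63 = h8 (assumed layout) */

/* Step 1: bitboard of to-squares for a knight on `sq`, own pieces excluded. */
static Bitboard knight_targets(int sq, Bitboard own)
{
    const Bitboard notA  = 0xfefefefefefefefeULL;  /* everything but file a   */
    const Bitboard notAB = 0xfcfcfcfcfcfcfcfcULL;  /* everything but files ab */
    const Bitboard notH  = 0x7f7f7f7f7f7f7f7fULL;  /* everything but file h   */
    const Bitboard notGH = 0x3f3f3f3f3f3f3f3fULL;  /* everything but files gh */
    Bitboard b = 1ULL << sq;
    Bitboard t = ((b << 17) & notA)  | ((b << 15) & notH)
               | ((b << 10) & notAB) | ((b <<  6) & notGH)
               | ((b >> 17) & notH)  | ((b >> 15) & notA)
               | ((b >> 10) & notGH) | ((b >>  6) & notAB);
    return t & ~own;
}

/* Bulk style: step 2 is performed, i.e. every move is extracted into a
   list, and the list size is the count. */
static int count_bulk(int from, Bitboard targets, uint16_t *list)
{
    int n = 0;
    while (targets) {
        int to = __builtin_ctzll(targets);           /* lowest set bit      */
        assert(n < 8);                               /* a knight has <= 8   */
        list[n++] = (uint16_t)(from | (to << 6));    /* hypothetical encoding */
        targets &= targets - 1;                      /* clear that bit      */
    }
    return n;
}

/* Popcount style: step 2 is skipped, the to-squares are just counted. */
static int count_fast(Bitboard targets)
{
    return __builtin_popcountll(targets);
}
```

Both functions return the same number for the same to-square bitboard (2 for the b1 knight in the opening position, with own pieces on 0xFFFF), but only count_bulk pays for the extraction loop. A mailbox generator only discovers the moves while walking the board, so it has no intermediate set to count.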
hgm wrote: Indeed, that is the one I meant when pointing out how bitboards could be made competitive. It works by skipping move generation in the horizon nodes. With bitboards you can do that, because move generation is a two-step process: first you generate bitboards with to-squares (from which you can easily select sub-groups of moves, like captures or checks), and then you extract the moves to make a move list.
Frcperft has compiler switches to choose between PERFT_BULK mode (which is what qperft does: generating the move list and then taking its size) and PERFT_FAST mode. The latter (which seems to be the default) skips generating the move list and applies POPCNT directly to the bitboards. There is nothing analogous in mailbox. But by using this trick it ceases to be a measurement of the speed of move generation.
On my machine it is still slower than qperft, of course.
OK, thanks again. I'll recompile with mode=BULK just to see the difference in speed.
Even counting all moves at the last ply without popcnt seems faster than qperft on my computer:
That is indeed faster than I would have expected, i.e. extracting the moves takes surprisingly little time. Perhaps this is due to the BUILTIN extraction mode, which does something I don't quite understand (it calls some library function). The standard method would be what is called DEBRUIN here.
Perhaps I should try the incrementally updated mobility method one of these days. The problem is that this is much more suitable for pseudo-legal move generation, as you would do in engines, than for fully legal move generation, as needed in perft.
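For reference, a sketch of the two usual ways to find the lowest set bit when extracting moves. That these correspond to the BUILTIN and DEBRUIN switches is my assumption from the names, not something checked against the frcperft source. The function names are made up; `__builtin_ctzll` is a GCC/Clang builtin, and the multiplier is the commonly quoted 64-bit De Bruijn constant.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Bitboard;

static const Bitboard DEBRUIJN64 = 0x03f79d71b4cb0a89ULL;
static int debruijn_index[64];

/* Fill the lookup table once: every single-bit board, multiplied by the
   De Bruijn constant, leaves a distinct 6-bit pattern in the top bits. */
static void init_debruijn(void)
{
    for (int i = 0; i < 64; i++)
        debruijn_index[((1ULL << i) * DEBRUIJN64) >> 58] = i;
}

/* Portable method: isolate the lowest bit, multiply, look up. */
static int lsb_debruijn(Bitboard bb)
{
    assert(bb != 0);                       /* undefined for an empty board */
    Bitboard lowest = bb & (0 - bb);       /* keep only the lowest set bit */
    return debruijn_index[(lowest * DEBRUIJN64) >> 58];
}

/* Compiler builtin: on x86-64, GCC and Clang usually turn this into a
   single bit-scan instruction. */
static int lsb_builtin(Bitboard bb)
{
    assert(bb != 0);
    return __builtin_ctzll(bb);
}
```

After calling init_debruijn() once, both functions give the same square index for any non-empty bitboard. If the builtin really does become one instruction, that could explain why the extraction costs so little.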