Bit reversal on a 64bit word

Discussion of chess software programming and technical issues.

Moderator: Ras

Cardoso
Posts: 363
Joined: Thu Mar 16, 2006 7:39 pm
Location: Portugal
Full name: Alvaro Cardoso

Bit reversal on a 64bit word

Post by Cardoso »

Hi,
sorry for asking the basics, but after some search on the forum I didn't find a way to reverse the bits on a 64bit bitboard.
I mean bit 0 from the source goes to bit 63 on the destination bitboard, bit 7 on the source goes to bit 56 on the destination and so on.
There are several bit flipping on Chess Programming Wiki but none seems to be what I want.
Thx.
User avatar
hgm
Posts: 28353
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Bit reversal on a 64bit word

Post by hgm »

Bit reversing can be done through an O(log N) algorithm. This swaps the upper and lower halves in the first step, the 1st with the 2nd and the 3rd with the 4th quarter in the second step, etc. Something like

b = b << 32 | b >> 32;
b = (b & 0xFFFF0000FFFF0000ull) >> 16 | (b & 0x0000FFFF0000FFFFull) << 16;
b = (b & 0xFF00FF00FF00FF00ull) >> 8 | (b & 0x00FF00FF00FF00FFull) << 8;

The first 3 steps of this algorithm reverses the order of the bytes, though. And there exists a single x64 instruction that can do that. I guess you can also economize on shifts by not shifting both parts in opposit direction, but rotating (for which there also is an instruction) one by twice the amount, and the other not at all. Then you can do a single rotation at the end to properly align the result.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Bit reversal on a 64bit word

Post by dangi12012 »

Bit reversing can be done in O(1) 2x per clock (and core) with this Intel Instruction:

Code: Select all

_mm_gf2p8mul_epi8
https://builders.intel.com/docs/network ... -guide.pdf
--------------------------------------------------------------------

Bit reversing can be done in O(1) 32x per clock (and core) if you target CUDA architectures (thousands of cores):

Code: Select all

__brevll 
https://docs.nvidia.com/cuda/cuda-math- ... cf536974cd
--------------------------------------------------------------------

Bit reversing can be done in O(N log N) when targeting no special architecture:
https://godbolt.org/z/5dnfK5nc8

If anyone kows faster code than the one in compiler explorer - let me know.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer