PEXT Bitboards

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: PEXT Bitboards

Post by lucasart »

To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: PEXT Bitboards

Post by Rein Halbersma »

lucasart wrote:To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
For move generation, I agree,: magic should be faster than software PEXT. But you can also use PDEP / PEXT when computing endgame database indices when there are multiple pieces of one kind, see Ronald's posts. Still a niche use case in chess, but the default for draughts endgames.
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: PEXT Bitboards

Post by Gerd Isenberg »

Rein Halbersma wrote:
lucasart wrote:To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
For move generation, I agree,: magic should be faster than software PEXT. But you can also use PDEP / PEXT when computing endgame database indices when there are multiple pieces of one kind, see Ronald's posts. Still a niche use case in chess, but the default for draughts endgames.
Also for didactic purpose - very nice!
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: PEXT Bitboards

Post by jwes »

lucasart wrote:To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
Of course, the main reason for such a routine would be to make your program compatible with older processors.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: PEXT Bitboards

Post by syzygy »

Rein Halbersma wrote:
lucasart wrote:To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
For move generation, I agree,: magic should be faster than software PEXT. But you can also use PDEP / PEXT when computing endgame database indices when there are multiple pieces of one kind, see Ronald's posts. Still a niche use case in chess, but the default for draughts endgames.
My tablebase generator uses pdep/pext only for move/attack generation. It would be interesting to find other uses.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: PEXT Bitboards

Post by Rein Halbersma »

Rein Halbersma wrote: For machines without BMI2 but with SSE 4.2, it should be faster to only loop over the bits bits to be extracted, and for each such bit count the number of mask bits behind it (performance is untested, but should be correct at least, hopefully)

Code: Select all

uint64 _pext_u64(uint64 val, uint64 mask) 
{
  uint64 res = 0; 
  for (uint64 src = val & mask; src; src &= src - 1)
  {
    res |= bit[popcnt(((src & -src) - 1) & mask)];
  }
  return res; 
}
The corresponding PDEP emulation using popcnt is left as an exercise for the reader :D (I am not near a machine to test it for the moment, and it's slightly more involved than the PEXT emulation, will post something next week)
Here's my untested pencil and paper PDEP attempt that loops over the 1-bits to be deposited, computing the number of skipped 0-bits in val (using the SSE 4.2 ctzll intrinsic), and doing the same number of skips over the 1-bits in mask

Code: Select all

uint64 _pdep_u64(uint64 val, uint64 mask) 
{
  uint64 res = 0; 
  for (int skipped = 0; val && mask; val &= val - 1, mask &= mask - 1)
  {
    for (int skip = ctzll(val) - skipped, skipped += skip; skip > 0; --skip) 
    {
       mask &= mask - 1;
    }
    res |= mask & -mask;
  }
  return res; 
}
Note that these are general routines, and both routines can be slightly simplified in the practical case where val is a subset of mask (for PEXT, src simply equals val in that case, saving that one initialization) or if the highest bit of val is not larger than that of mask (in the case of PDEP, the && mask in the outer loop condition can be omitted). For the cases of move generation and endgame database indexing these conditions always hold.
Last edited by Rein Halbersma on Mon Jul 17, 2017 10:25 pm, edited 1 time in total.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: PEXT Bitboards

Post by Rein Halbersma »

Your indexing formula does not take care of interfering pieces. I know this has other benefits because your index factors per piece type can vary independently, and that the larger index range is almost entirely compressed away in the end result. But in draughts, computing the 4 vs 4 piece databases gives just too much overhead during generation. You can then index eg the black pieces in the usual way, but use PEXT(wp, ~bp) to index the white pieces in the bitboard of squares not occupied by black pieces. Similarly using PDEP when de-indexing (placing pieces on the board).
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: PEXT Bitboards

Post by syzygy »

Rein Halbersma wrote:Your indexing formula does not take care of interfering pieces.
The indexing function used during generation indeed does not take care of interfering pieces or of mutiple pieces of the same kind. The indexing function for the compressed format does both, at least in most cases.

The pdep/pext instructions might indeed be useful for converting a not-so-dumb index into a bitboard of occupied squares during generation.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: PEXT Bitboards

Post by lucasart »

jwes wrote:
lucasart wrote:To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
Of course, the main reason for such a routine would be to make your program compatible with older processors.
You didn't understand my point.

PEXT is used in chess engines as a replacement for magic mask/multiply/shift operation. What it does is map the (masked) occupancy into an index of N bits, where N=popcnt(mask).

But software PEXT is useless, unless it's as fast (or faster) than magics. On PEXT processors, of course, you should use hardware PEXT (the fastest, faster than magics). But on PEXTless processors, you still should use magics, because they are faster than software PEXT.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: PEXT Bitboards

Post by jwes »

lucasart wrote:
jwes wrote:
lucasart wrote:To make software PEXT useful for chess, it needs to be on par with magic multiplication. And I really don't think the above will be.

So, the choice is still between hardware PEXT and Magic.
Of course, the main reason for such a routine would be to make your program compatible with older processors.
You didn't understand my point.

PEXT is used in chess engines as a replacement for magic mask/multiply/shift operation. What it does is map the (masked) occupancy into an index of N bits, where N=popcnt(mask).

But software PEXT is useless, unless it's as fast (or faster) than magics. On PEXT processors, of course, you should use hardware PEXT (the fastest, faster than magics). But on PEXTless processors, you still should use magics, because they are faster than software PEXT.
It's a choice of how to spend your time. I would rather make my program stronger on newer processers while doing the minimum necessary to make it run on older processors. I went from rotated bitmaps to PEXT and have no desire to implement magic bitmaps.