Yet another bitboard attack generator

bob · Post by **bob** » Fri Oct 30, 2009 12:59 am

michiguel wrote:
bob wrote:I simply use inline asm. MSB is bsr, LSB is bsf.
That's what I figured. I generally try to avoid inline asm because my engine is not fast at all, and the only thing I do is to compromise portability. But you made me think about the possibility to have an MSB that is iterative. It may be fine because this bitboards will not be very populated.
Code: Select all
uint64_t
msb &#40;uint64_t x&#41;
&#123; 
    uint64_t y;
    do &#123;
        y = x;
        x &= x - 1;
    &#125; while &#40;x&#41;;
    return y;
&#125;
Miguel

That is what I do in Crafty, in fact. I use the simple loop for PopCnt() since only the new i7 has the hardware popcnt instruction. I also have the same kinds of C routines for non-X86 architectures (generic c code to find msb and lsb). However, most current-day machines do have a bsr/bsf type instruction and many have a built-in compiler intrinsic to get to them without inline. I'd use those but that puts me at the mercy of dealing with a dozen different compilers since some use borland, gcc, MSVC (every version back to when dirt was invented) in addition to the usual suspects from various vendors like IBM, SGI/MIPS, etc.

Engin · Post by **Engin** » Fri Oct 30, 2009 11:42 pm

Code: Select all

        h->moves&#91;id&#93; = (&#40;ray7p&#91;fs&#93; & belowInc&#91;FirstBit&#40;ray7p&#91;fs&#93; & aPieces&#41;&#93;)
                     |  &#40;ray9p&#91;fs&#93; & belowInc&#91;FirstBit&#40;ray9p&#91;fs&#93; & aPieces&#41;&#93;)
                     |  &#40;ray7m&#91;fs&#93; & aboveInc&#91;LastBit&#40;ray7m&#91;fs&#93; & aPieces&#41;&#93;)
                     |  &#40;ray9m&#91;fs&#93; & aboveInc&#91;LastBit&#40;ray9m&#91;fs&#93; & aPieces&#41;&#93;));

and how is faster if we are pre-calculate and store it in a database, then we to do only lookup ?

bob · Post by **bob** » Sat Oct 31, 2009 2:18 am

Engin wrote:

Code: Select all

        h->moves&#91;id&#93; = (&#40;ray7p&#91;fs&#93; & belowInc&#91;FirstBit&#40;ray7p&#91;fs&#93; & aPieces&#41;&#93;)
                     |  &#40;ray9p&#91;fs&#93; & belowInc&#91;FirstBit&#40;ray9p&#91;fs&#93; & aPieces&#41;&#93;)
                     |  &#40;ray7m&#91;fs&#93; & aboveInc&#91;LastBit&#40;ray7m&#91;fs&#93; & aPieces&#41;&#93;)
                     |  &#40;ray9m&#91;fs&#93; & aboveInc&#91;LastBit&#40;ray9m&#91;fs&#93; & aPieces&#41;&#93;));

and how is faster if we are pre-calculate and store it in a database, then we to do only lookup ?

How do you "just do a lookup" since the legal moves depend on the contents of squares on diagonals/ranks/files for sliders???

Engin · Post by **Engin** » Sat Oct 31, 2009 2:40 am

well for the PopCnt bits i pre calculated and do this fast with lookups

but i dont know how i can also only lookup for the sliders, without using rotated or other shifts

maybe i have an idea for pre-calculate every direction attacks for the bishop and rook, there needs 8 dirs, plus7dir, minus7dir.... this attacks we can precalculated with possible occupied bits and store in a table for lookup??

bob · Post by **bob** » Thu Jan 26, 2012 2:00 am

michiguel wrote:Gaviota uses the following functions to find the squares attacked by a given slider in a given board with blocking pieces (occ). I dropped my own implementation of rotated bitboards because the code was akward and carrying extra bitboards all over the place was a nuisance. This implementation is not faster or better, but at least it is not slower in my engine. It is certainly more compact, readable and clearer. It only needs a little supporting table. Most importantly (at least for me), It allows me to do something different than everybody else. It is more fun.

I checked CPW to see whether there is anything similar. Apparently, this would be a modification of the Dumb7fill method. The difference is that I do attacks in all directions in parallel, including the queen (with a little trick).

The method assume that we have a table
uint64_t reach [MAXPIECES] [MAXSQUARES];
where each bitboard there represents the attacks of a given slider from a given square in an empty board. I generate the table at start up with a 0x88 procedure , but it is small enough to hardwire it into the program.
You can replace this with
uint64_t reach_bishop [64];
uint64_t reach_queen [64];
uint64_t reach_rook [64];
That is about ~1.5 k

The code below is optimized for (my) readability, so I am sure that it could be improved. At least in my engine, unrolling the loops slow things down.

Miguel
PS: I hope that the code is self-explanatory.
Code: Select all
/* not_A is a bitboard with all bits turned on except on column A */
/* not_H is a bitboard with all bits turned on except on column H */

/* grow in all directions */
static uint64_t
grow &#40;uint64_t x&#41;
&#123;
	x |= &#40;x << 1&#41; & not_A; 
	x |= &#40;x >> 1&#41; & not_H;	
	x |=  x >> 8;
	x |=  x << 8;
	return x;
&#125;

/* grow laterally like a rook */
static uint64_t
growlat &#40;uint64_t x&#41;
&#123;
	uint64_t y = x;
	x |= &#40;x << 1&#41; & not_A;
	x |= &#40;x >> 1&#41; & not_H;	
	y |=  y >> 8;
	y |=  y << 8;
	return x | y;	
&#125;


extern uint64_t
B_attacks &#40;SQUARE from, uint64_t occ&#41;
&#123;
	uint64_t g, corral, prev, r, ini;
	r = reach&#91;BISHOP&#93;&#91;from&#93;;
	corral = r & ~occ;
	ini = U64&#40;1&#41; << from;

	g = ini;
	do  &#123;
		prev = g;
		g = grow &#40;prev&#41; & corral;
	&#125; while &#40;prev != g&#41;;
	g |= ini;
	
	return &#40;grow&#40;g&#41; & r&#41; & ~ini;
&#125;

extern uint64_t
R_attacks &#40;SQUARE from, uint64_t occ&#41;
&#123;
	uint64_t g, corral, prev, r, ini;
	r = reach&#91;ROOK&#93;&#91;from&#93;;
	corral = r & ~occ;
	ini = U64&#40;1&#41; << from;

	g = ini;
	do  &#123;
		prev = g;
		g = grow &#40;prev&#41; & corral;
	&#125; while &#40;prev != g&#41;;
	g |= ini;
	
	return &#40;grow&#40;g&#41; & r&#41; & ~ini;
&#125;

extern uint64_t
Q_attacks &#40;SQUARE from, uint64_t occ&#41;
&#123;
	uint64_t g, corral, prev, r, ini, h;

	r = reach&#91;QUEEN&#93;&#91;from&#93;;
	ini = U64&#40;1&#41; << from;
	
	g = grow &#40;ini&#41; & ~ini;
	h = growlat &#40;g & occ&#41;;
	h &= ~g;
	
	corral = r & ~occ;
	corral &= ~h;

	g &= ~occ;

	do  &#123;
		prev = g;
		g = grow &#40;prev&#41; & corral;
	&#125; while &#40;prev != g&#41;;
	g |= ini;
	return &#40;grow&#40;g&#41; & r&#41; & ~ini & ~h ;
&#125;

There's another very simple version I have used. I was once asked to do a version of a chess program for the Nintendo DS. Very restricted platform so no bit rotated arrays. I did this kind of thing:

# define AttacksBishop(square, occ) \
(plus7dir[square] ^ plus7dir[LSB(plus7dir[square] & (occ))] | \
plus9dir[square] ^ plus9dir[LSB(plus9dir[square] & (occ))] | \
minus7dir[square] ^ minus7dir[MSB(minus7dir[square] & (occ))] | \
minus9dir[square] ^ minus9dir[MSB(minus9dir[square] & (occ))])

The 4 arrays plus7dir, etc are pretty obvious. Just 1 bits starting in the right direction and ending at the edge of the board. You then need to identify the blocker with "LSB(plus7dir[square] & occ) (note that you use LSB for two and MSB for the other two. Then you use that same "direction" again, but move down to the blocking square and turn off all bits beyond it.

With the above, for rooks, queens and bishops, you need 8 64 element vectors, for the 8 possible directions. I think it was maybe 10% slower than rotated when I tested it, but don't hold me to that. You may well be doing something similar. Nice thing about the macro is that the conversion to rotated doesn't change any other code, just the 3 macros for bishops, rooks and queens... That's why magic conversion took me all of 30 minutes, total...

Part of that was creating a magic index for my mobility scores, in addition to just looking up an attack vector...

bob · Post by **bob** » Thu Jan 26, 2012 2:02 am

Engin wrote:well for the PopCnt bits i pre calculated and do this fast with lookups

but i dont know how i can also only lookup for the sliders, without using rotated or other shifts

maybe i have an idea for pre-calculate every direction attacks for the bishop and rook, there needs 8 dirs, plus7dir, minus7dir.... this attacks we can precalculated with possible occupied bits and store in a table for lookup??

You are approaching magic or rotated with that idea. It is a good one, but the arrays get pretty large. You can make them smaller than you think, because the two end bits are never required when you think about it, so that rather than a 256-element set you only need 64 elements since the slider attacks the last square in either direction whether it is occupied or not...

Yet another bitboard attack generator

Re: Yet another bitboard attack generator

Re: Yet another bitboard attack generator

Re: Yet another bitboard attack generator

Re: Yet another bitboard attack generator

Re: Yet another bitboard attack generator

Re: Yet another bitboard attack generator