New simple and fast bitboard move generator
Moderators: hgm, Rebel, chrisw
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: New simple and fast bitboard move generator
Also it would seem, OIMBFOS, that better parallel execution at the end may be achived by ORing two of the direction indexs together and then the other two and finally those two results.
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
-
- Posts: 2250
- Joined: Wed Mar 08, 2006 8:47 pm
- Location: Hattingen, Germany
Re: New simple and fast bitboard move generator
Yes, very good - that would work!Michael Sherwin wrote:Would this fix the problem?
The corner squares do not matter as far as occupancy goes, so just make sure the corner bits are set.
Also make sure the appropriate corner bit is set for the direction bits.
Now there will always be at least one set bit to find!
And it takes less instructions!!
Code: Select all
typedef struct { u16 idx[64]; u64 bitsN; u64 bitsE; u64 bitsS; u64 bitsW; } square; u64 rookAttacks(u64 occ, square *sq) { unsigned long ln, le, ls, lw; occ |= fourCorners; _BitScanForward64(&ln, occ & sq->bitsN); _BitScanForward64(&le, occ & sq->bitsE); _BitScanReverse64(&ls, occ & sq->bitsS); _BitScanReverse64(&lw, occ & sq->bitsW); u32 idx = sq->idx[ln] | sq->idx[le] | sq->idx[ls] | sq->idx[lw]; return rookLookup[idx];
Re: New simple and fast bitboard move generator
Hmm, don't think so.Gerd Isenberg wrote:Yes, very good - that would work!Michael Sherwin wrote:Would this fix the problem?
The corner squares do not matter as far as occupancy goes, so just make sure the corner bits are set.
Also make sure the appropriate corner bit is set for the direction bits.
Now there will always be at least one set bit to find!
And it takes less instructions!!
Code: Select all
typedef struct { u16 idx[64]; u64 bitsN; u64 bitsE; u64 bitsS; u64 bitsW; } square; u64 rookAttacks(u64 occ, square *sq) { unsigned long ln, le, ls, lw; occ |= fourCorners; _BitScanForward64(&ln, occ & sq->bitsN); _BitScanForward64(&le, occ & sq->bitsE); _BitScanReverse64(&ls, occ & sq->bitsS); _BitScanReverse64(&lw, occ & sq->bitsW); u32 idx = sq->idx[ln] | sq->idx[le] | sq->idx[ls] | sq->idx[lw]; return rookLookup[idx];
For east and west, one should use the A and H file. For N & S, 1st and 2nd rank.
It still would go wrong if the rook is on the border squares ? ie N attacks for B8 ?
Cheers,
Tony
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: New simple and fast bitboard move generator
I think that it would be okay.Tony wrote:Hmm, don't think so.Gerd Isenberg wrote:Yes, very good - that would work!Michael Sherwin wrote:Would this fix the problem?
The corner squares do not matter as far as occupancy goes, so just make sure the corner bits are set.
Also make sure the appropriate corner bit is set for the direction bits.
Now there will always be at least one set bit to find!
And it takes less instructions!!
Code: Select all
typedef struct { u16 idx[64]; u64 bitsN; u64 bitsE; u64 bitsS; u64 bitsW; } square; u64 rookAttacks(u64 occ, square *sq) { unsigned long ln, le, ls, lw; occ |= fourCorners; _BitScanForward64(&ln, occ & sq->bitsN); _BitScanForward64(&le, occ & sq->bitsE); _BitScanReverse64(&ls, occ & sq->bitsS); _BitScanReverse64(&lw, occ & sq->bitsW); u32 idx = sq->idx[ln] | sq->idx[le] | sq->idx[ls] | sq->idx[lw]; return rookLookup[idx];
For east and west, one should use the A and H file. For N & S, 1st and 2nd rank.
It still would go wrong if the rook is on the border squares ? ie N attacks for B8 ?
Cheers,
Tony
A rook on B8 would have the north direction bit for H8 set and that corner bit is set, so the bitscan would find only that bit for north. Then in the index lookup for bit 63 would be a 0 (no contribution towards the index).
A rook on H8 would also have bit 63 set in the north direction and once again that index lookup will be a 0.
In actuality only 'twoCorners are needed, bit 0 and bit 63, but fourCorners sounded better!
Further, a rook on A1 would then have its S and W directions 0 bit set and idx[0] returns a 0. And for N and E bit 63 is set and idx[63] always returns a 0 (if it gets that far with out first finding a set bit).
Please correct me if I am wrong!
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
Re: New simple and fast bitboard move generator
Ah, ok. I didn't get that the "& sq->bitsN" for B8 would leave the corner squares set.Michael Sherwin wrote:I think that it would be okay.Tony wrote:Hmm, don't think so.Gerd Isenberg wrote:Yes, very good - that would work!Michael Sherwin wrote:Would this fix the problem?
The corner squares do not matter as far as occupancy goes, so just make sure the corner bits are set.
Also make sure the appropriate corner bit is set for the direction bits.
Now there will always be at least one set bit to find!
And it takes less instructions!!
Code: Select all
typedef struct { u16 idx[64]; u64 bitsN; u64 bitsE; u64 bitsS; u64 bitsW; } square; u64 rookAttacks(u64 occ, square *sq) { unsigned long ln, le, ls, lw; occ |= fourCorners; _BitScanForward64(&ln, occ & sq->bitsN); _BitScanForward64(&le, occ & sq->bitsE); _BitScanReverse64(&ls, occ & sq->bitsS); _BitScanReverse64(&lw, occ & sq->bitsW); u32 idx = sq->idx[ln] | sq->idx[le] | sq->idx[ls] | sq->idx[lw]; return rookLookup[idx];
For east and west, one should use the A and H file. For N & S, 1st and 2nd rank.
It still would go wrong if the rook is on the border squares ? ie N attacks for B8 ?
Cheers,
Tony
A rook on B8 would have the north direction bit for H8 set and that corner bit is set, so the bitscan would find only that bit for north. Then in the index lookup for bit 63 would be a 0 (no contribution towards the index).
A rook on H8 would also have bit 63 set in the north direction and once again that index lookup will be a 0.
In actuality only 'twoCorners are needed, bit 0 and bit 63, but fourCorners sounded better!
Further, a rook on A1 would then have its S and W directions 0 bit set and idx[0] returns a 0. And for N and E bit 63 is set and idx[63] always returns a 0 (if it gets that far with out first finding a set bit).
Please correct me if I am wrong!
Tony
-
- Posts: 2250
- Joined: Wed Mar 08, 2006 8:47 pm
- Location: Hattingen, Germany
Re: New simple and fast bitboard move generator
This source with vc2005 results in assembly which looks closest to the hand made one. Other versions are either more sequential or use more than the seven scratch registers, so that it has to save/restore caller safe registers (rbx) on the stack...
This is the additional overhead to build the sq*:
Code: Select all
#pragma intrinsic(_BitScanForward64)
#pragma intrinsic(_BitScanReverse64)
typedef struct
{
u64 bitsN;
u64 bitsE;
u64 bitsS;
u64 bitsW;
unsigned short index[64];
} SQUARE;
u64 rookAttacks(u64 occ, SQUARE *sq) {
u64 n, e, s, w;
unsigned long ln, le;
occ |= 0x8100000000000081;
n = occ & sq->bitsN;
e = occ & sq->bitsE;
s = occ & sq->bitsS;
w = occ & sq->bitsW;
_BitScanForward64(&ln, n);
_BitScanForward64(&le, e);
_BitScanReverse64((unsigned long*)&s, s);
_BitScanReverse64((unsigned long*)&w, w);
u32 idx = sq->index[ln] | sq->index[le] | sq->index[s] | sq->index[w];
return rookLookup[idx];
}
occ$ = 8
sq$ = 16
?rookAttacks@@YA_K_KPEAUSQUARE@@@Z PROC
00000 4c 8b 52 10 mov r10, QWORD PTR [rdx+16]
00004 4c 8b 42 18 mov r8, QWORD PTR [rdx+24]
00008 48 b8 81 00 00
00 00 00 00 81 mov rax, 8100000000000081H
00012 48 0b c8 or rcx, rax
00015 48 8b 02 mov rax, QWORD PTR [rdx]
00018 4c 8b da mov r11, rdx
0001b 48 23 c1 and rax, rcx
0001e 4c 23 d1 and r10, rcx
00021 4c 23 c1 and r8, rcx
00024 4c 0f bc c8 bsf r9, rax
00028 48 8b 42 08 mov rax, QWORD PTR [rdx+8]
0002c 4d 0f bd c0 bsr r8, r8
00030 48 23 c1 and rax, rcx
00033 4d 0f bd d2 bsr r10, r10
00037 48 0f bc c8 bsf rcx, rax
0003b 0f b7 44 4a 20 movzx eax, WORD PTR [rdx+rcx*2+32]
00040 43 0f b7 4c 43
20 movzx ecx, WORD PTR [r11+r8*2+32]
00046 42 0f b7 54 4a
20 movzx edx, WORD PTR [rdx+r9*2+32]
0004c 48 0b c2 or rax, rdx
0004f 48 0b c1 or rax, rcx
00052 43 0f b7 4c 53
20 movzx ecx, WORD PTR [r11+r10*2+32]
00058 48 0b c1 or rax, rcx
0005b 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:?rookLookup
00062 48 8b 04 c1 mov rax, QWORD PTR [rcx+rax*8]
00066 c3 ret 0
?rookAttacks@@YA_K_KPEAUSQUARE@@@Z ENDP
; Function compile flags: /Ogtpy
Code: Select all
SQUARE squares[64];
u64 rookAttacks(u64 occ, u32 sq) {
return rookAttacks(occ, squares+sq);
}
occ$ = 8
sq$ = 16
?rookAttacks@@YA_K_KI@Z PROC ; rookAttacks, COMDAT
00000 8b c2 mov eax, edx
00002 4c 8d 14 80 lea r10, QWORD PTR [rax+rax*4]
00006 48 8d 05 00 00
00 00 lea rax, OFFSET FLAT:?squares
0000d 49 c1 e2 05 shl r10, 5
00011 4c 03 d0 add r10, rax
00014 48 b8 81 00 00
00 00 00 00 81 mov rax, 8100000000000081H
0001e 4d 8b 42 18 mov r8, QWORD PTR [r10+24]
00022 4d 8b 4a 10 mov r9, QWORD PTR [r10+16]
00026 48 0b c8 or rcx, rax
00029 49 8b 02 mov rax, QWORD PTR [r10]
0002c 4c 23 c1 and r8, rcx
0002f 4c 23 c9 and r9, rcx
00032 48 23 c1 and rax, rcx
00035 4d 0f bd c0 bsr r8, r8
00039 4d 0f bd c9 bsr r9, r9
0003d 48 0f bc d0 bsf rdx, rax
00041 49 8b 42 08 mov rax, QWORD PTR [r10+8]
00045 41 0f b7 54 52
20 movzx edx, WORD PTR [r10+rdx*2+32]
0004b 48 23 c1 and rax, rcx
0004e 48 0f bc c8 bsf rcx, rax
00052 41 0f b7 44 4a
20 movzx eax, WORD PTR [r10+rcx*2+32]
00058 43 0f b7 4c 42
20 movzx ecx, WORD PTR [r10+r8*2+32]
0005e 48 0b c2 or rax, rdx
00061 48 0b c1 or rax, rcx
00064 43 0f b7 4c 4a
20 movzx ecx, WORD PTR [r10+r9*2+32]
0006a 48 0b c1 or rax, rcx
0006d 48 8d 0d 00 00
00 00 lea rcx, OFFSET FLAT:?rookLookup
00074 48 8b 04 c1 mov rax, QWORD PTR [rcx+rax*8]
00078 c3 ret 0
?rookAttacks@@YA_K_KI@Z ENDP
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: New simple and fast bitboard move generator
Thanks Gerd,
Mike
Mike
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through