Debug help - getting bitboard attacks.
Moderators: hgm, Rebel, chrisw
-
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Debug help - getting bitboard attacks.
Forgot to mention that in programs for direct computation, the make/unmake had the codes for updating of rotated boards deleted. So the comparison should be accurated.
Maybe someone else with rotated bitboard could compare their versions.
Rasjid.
Maybe someone else with rotated bitboard could compare their versions.
Rasjid.
-
- Posts: 879
- Joined: Mon Dec 15, 2008 11:45 am
Re: Debug help - getting bitboard attacks.
Hi Rasjid,
well some questions for you, but before i have to excuse again
for the buggy codes posted.(saturday night fever )
I just want to answer your question about the last line
_ return( lne & (2*hi+lo)) _. So in hope not mixing up our codes,
not using unproved macros ... i will just comment the lineAttack
again, especially the last line.(so this code is not compiled,debugged
or similar,but should work. With description it should now be
easy to find a bug if it exists. Of course this can be written more
compact but description was more important to me)
hopefully no further bug included in this (pseudo)code.
It will let you find the bugs posted before, like missing negation...
- So now my questions. On what target plattform do you test the speed,
64bit or 32bit ?
- Would you be so nice to describe your _lineattack_ functions, like i tried to do now ?
Finally i want to give you an optimization template you can use
independant if you use your approach,OD, some others maybe
This is just caching the results, the higher the hitrate the higher
the gain. I dont want to swear, but it gave me at least 10% speedgain.
Maybe you want to try it. (this is not debugged code, so look carefully for
some unintended mistakes,but the principal is clear i think)
Michael
well some questions for you, but before i have to excuse again
for the buggy codes posted.(saturday night fever )
I just want to answer your question about the last line
_ return( lne & (2*hi+lo)) _. So in hope not mixing up our codes,
not using unproved macros ... i will just comment the lineAttack
again, especially the last line.(so this code is not compiled,debugged
or similar,but should work. With description it should now be
easy to find a bug if it exists. Of course this can be written more
compact but description was more important to me)
Code: Select all
ui64_t lineAttack(ui64_t lne,ui64_t occ,int sq)
{
//negative ray occupancies along the line
//LO(sq) == all bits below sq
lo = LO(sq) & lne & occ;
//ms1b of negative ray.We need
//an obstruction,so at least bit 0 is set
lo = (ui64_t)1 << bsr64(lo|1);
//swap sign, so we can add finally.
//this is an option,if you let it out
//you finally have to substract it
//instead of adding it.
lo = -lo;/*OPTION*/
//positive ray occupancies along the line
//HI(sq) == all bits above sq
hi = HI(sq) & lne & occ;
//ls1b of positiv ray
hi&=-hi;
//because we will substract hi-lo, we want
//to include the ls1b,so we shift it by one.
hi<<=1; //hi*= 2;
//finally substract ms1b positiv ray from ls1b negativ ray
//and _and_ with line.Due to the _option_ lea instruction
//optimization is done by the assembler.
return( lne & (hi+lo));/*OPTION*/
// return( lne & (hi-lo));
}
It will let you find the bugs posted before, like missing negation...
- So now my questions. On what target plattform do you test the speed,
64bit or 32bit ?
- Would you be so nice to describe your _lineattack_ functions, like i tried to do now ?
Finally i want to give you an optimization template you can use
independant if you use your approach,OD, some others maybe
Code: Select all
__inline ui64_t rook(ui64_t occ,int sq)
{
static ui64_t attOld[64] =
{
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1
};
static ui64_t bloOld[64];
if((attOld[sq]&occ) != bloOld[sq])
{
attOld[sq] = lineAttack(mask8[sq],occ,sq)
| lineAttack(mask1[sq],occ,sq);
bloOld[sq] = attOld[sq] & occ;
}
return(attOld[sq]);
}
the gain. I dont want to swear, but it gave me at least 10% speedgain.
Maybe you want to try it. (this is not debugged code, so look carefully for
some unintended mistakes,but the principal is clear i think)
Michael
-
- Posts: 155
- Joined: Mon Feb 15, 2010 9:33 am
- Location: New Zealand
Re: Debug help - getting bitboard attacks.
Presumably "& allBits" should be "& ALLBITS". But I don't see why you have it. It is removed by an optimized compilation.Chan Rasjid wrote: // my straightforward bitwise methodCode: Select all
#define HI(SQ) ( ~((u64)1) << (SQ)) #define LO(SQ) (((u64)(1) << (SQ)) - 1) #define ALLBITS 0xFFFFFFFFFFFFFFFFUL u64 attacks_diag7(const int sq) { u64 u, d = board[sq]->bDiag7; if (d); else return 0; //v high terminal bit u64 v = d & allBits & HI(sq); v = v ? v & - v : 0x8000000000000000UL; //u low terminal bit u = d & allBits & LO(sq); u = u ? BB(lastone(u)) : 1; return d & (((v - 1) | v) ^ (u - 1)); }
Robert P.
-
- Posts: 373
- Joined: Wed Mar 22, 2006 10:17 am
- Location: Novi Sad, Serbia
- Full name: Karlo Balla
Re: Debug help - getting bitboard attacks.
I tried it a few months ago, and the idea is great! This is especially useful for those with slow attack generator.Desperado wrote:This is just caching the results, the higher the hitrate the higherCode: Select all
__inline ui64_t rook(ui64_t occ,int sq) { static ui64_t attOld[64] = { -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1 }; static ui64_t bloOld[64]; if((attOld[sq]&occ) != bloOld[sq]) { attOld[sq] = lineAttack(mask8[sq],occ,sq) | lineAttack(mask1[sq],occ,sq); bloOld[sq] = attOld[sq] & occ; } return(attOld[sq]); }
the gain. I dont want to swear, but it gave me at least 10% speedgain.
Maybe you want to try it. (this is not debugged code, so look carefully for
some unintended mistakes,but the principal is clear i think)
Michael
Best Regards,
Karlo Balla Jr.
Karlo Balla Jr.
-
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Debug help - getting bitboard attacks.
My allbits is the occupied bits of the board. I think both sets of codes are ok as shown by the search results. It does not need ALLBITS. I''ll check again if optimization would change or eliminate anything.micron wrote:Presumably "& allBits" should be "& ALLBITS". But I don't see why you have it. It is removed by an optimized compilation.Chan Rasjid wrote: // my straightforward bitwise methodCode: Select all
#define HI(SQ) ( ~((u64)1) << (SQ)) #define LO(SQ) (((u64)(1) << (SQ)) - 1) #define ALLBITS 0xFFFFFFFFFFFFFFFFUL u64 attacks_diag7(const int sq) { u64 u, d = board[sq]->bDiag7; if (d); else return 0; //v high terminal bit u64 v = d & allBits & HI(sq); v = v ? v & - v : 0x8000000000000000UL; //u low terminal bit u = d & allBits & LO(sq); u = u ? BB(lastone(u)) : 1; return d & (((v - 1) | v) ^ (u - 1)); }
Robert P.
My bitwise method is rather crude and nothing sophisticated, just using the fact that subtracting single bits u from a higher bit v gives bits in between less v.
Rasjid
-
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Debug help - getting bitboard attacks.
My platform is Athlon64 3500+ single cpu.Desperado wrote:Hi Rasjid,
well some questions for you, but before i have to excuse again
for the buggy codes posted.(saturday night fever )
I just want to answer your question about the last line
_ return( lne & (2*hi+lo)) _. So in hope not mixing up our codes,
not using unproved macros ... i will just comment the lineAttack
again, especially the last line.(so this code is not compiled,debugged
or similar,but should work. With description it should now be
easy to find a bug if it exists. Of course this can be written more
compact but description was more important to me)
hopefully no further bug included in this (pseudo)code.Code: Select all
ui64_t lineAttack(ui64_t lne,ui64_t occ,int sq) { //negative ray occupancies along the line //LO(sq) == all bits below sq lo = LO(sq) & lne & occ; //ms1b of negative ray.We need //an obstruction,so at least bit 0 is set lo = (ui64_t)1 << bsr64(lo|1); //swap sign, so we can add finally. //this is an option,if you let it out //you finally have to substract it //instead of adding it. lo = -lo;/*OPTION*/ //positive ray occupancies along the line //HI(sq) == all bits above sq hi = HI(sq) & lne & occ; //ls1b of positiv ray hi&=-hi; //because we will substract hi-lo, we want //to include the ls1b,so we shift it by one. hi<<=1; //hi*= 2; //finally substract ms1b positiv ray from ls1b negativ ray //and _and_ with line.Due to the _option_ lea instruction //optimization is done by the assembler. return( lne & (hi+lo));/*OPTION*/ // return( lne & (hi-lo)); }
It will let you find the bugs posted before, like missing negation...
- So now my questions. On what target plattform do you test the speed,
64bit or 32bit ?
- Would you be so nice to describe your _lineattack_ functions, like i tried to do now ?
Finally i want to give you an optimization template you can use
independant if you use your approach,OD, some others maybe
This is just caching the results, the higher the hitrate the higherCode: Select all
__inline ui64_t rook(ui64_t occ,int sq) { static ui64_t attOld[64] = { -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1 }; static ui64_t bloOld[64]; if((attOld[sq]&occ) != bloOld[sq]) { attOld[sq] = lineAttack(mask8[sq],occ,sq) | lineAttack(mask1[sq],occ,sq); bloOld[sq] = attOld[sq] & occ; } return(attOld[sq]); }
the gain. I dont want to swear, but it gave me at least 10% speedgain.
Maybe you want to try it. (this is not debugged code, so look carefully for
some unintended mistakes,but the principal is clear i think)
Michael
I'll need a little time to go through your codes. As Karlo seems to have very good results, I would not be surprised if your cached OD method is much faster than my rotated bitboard. OD could be optimized further probably with zero cost with pre-calculated HI and LO.
I remember Bob mentioned that he found magic bitboards did not give him any speedup. Most who choose magics probably do so because it is the "in-thing" to do.
Rasjid
-
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Debug help - getting bitboard attacks.
I think I might have to go back to basic school to relearn binary arithmetic or C casting.
Why is (unsigned int64) (~1) == 0xfffffffffffffffe.
C integer constant has type integer which is 32 bits.
~1 == 0xfffffffe
Casting (~1) to unsigned int64 should just add 32 higher zero bits to make 64 bits.
Very Sad
Rasjid
Why is (unsigned int64) (~1) == 0xfffffffffffffffe.
C integer constant has type integer which is 32 bits.
~1 == 0xfffffffe
Casting (~1) to unsigned int64 should just add 32 higher zero bits to make 64 bits.
Very Sad
Rasjid
-
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Debug help - getting bitboard attacks.
After answering to Purves, I realized I should use (v-u) | v | u intead of ((v-1) | v)) ^ (u-1).
The new speed comparison is here with some changes also to OD codes.
Athlon64 3500+ 1cpu .
Best regards,
Rasjid.
The new speed comparison is here with some changes also to OD codes.
Athlon64 3500+ 1cpu .
//rotated > OD by 0.9%
depth 12 score cp +97 time 131329 msec nodes 54421938 nps 414393 pv 1.g3 Kd8 2.Bd3 Nxf2 3.Kxf2 Qxd5 4.Qg4 Bf8 5.Re4 c5 6.Bf4 d6 7.Qd1
//OD > bitwise by 1.1%
depth 12 score cp +97 time 132544 msec nodes 54421938 nps 410595 pv 1.g3 Kd8 2.Bd3 Nxf2 3.Kxf2 Qxd5 4.Qg4 Bf8 5.Re4 c5 6.Bf4 d6 7.Qd1
//bitwise
depth 12 score cp +97 time 134144 msec nodes 54421938 nps 405697 pv 1.g3 Kd8 2.Bd3 Nxf2 3.Kxf2 Qxd5 4.Qg4 Bf8 5.Re4 c5 6.Bf4 d6 7.Qd1
Code: Select all
#define HI(SQ) ( ~((u64)1) << (SQ))
#define LO(SQ) (((u64)(1) << (SQ)) - 1)
#define ALLBITS 0xFFFFFFFFFFFFFFFFUL
//OD
u64 attacks_diag7(const int sq) {
u64 lne = board[sq]->bDiag7, lo = lne & allBits;
u64 hi = (HI(sq) & lo);
hi &=-hi;
lo = ALLBITS << lastone((LO(sq) & lo)|1);
return lne & (2*hi+lo);
}
//my bitwise
u64 attacks_diag7(const int sq) {
u64 d = board[sq]->bDiag7, u = d & allBits;
//v high terminal bit
u64 v = (u & HI(sq)) | 0x8000000000000000UL;
v &= - v;
//u low terminal bit
u &= LO(sq);
u = BB(lastone(u|1));
return d & ((v - u) | v | u);
}
Rasjid.
-
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Debug help - getting bitboard attacks.
With caching for the 4 direction attacks, the speedup (testing just 1 position) is 2.5% - still good. Caching could possibly be even used for my rotated bitboard; I will have to see.
Best regards,
Rasjid.
Code: Select all
#define HI(SQ) ( ~((u64)1) << (SQ))
#define LO(SQ) (((u64)(1) << (SQ)) - 1)
#define ALLBITS 0xFFFFFFFFFFFFFFFFUL
u64 attacks_diag7(const int sq) {
static u64 attOld[64] = {
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1,
-1,-1,-1,-1,-1,-1,-1,-1
};
static u64 bloOld[64];
if ((attOld[sq] & allBits) != bloOld[sq]) {
u64 lne = board[sq]->bDiag7, lo = lne & allBits;
u64 hi = (HI(sq) & lo);
hi &=-hi;
lo = ALLBITS << lastone((LO(sq) & lo)|1);
attOld[sq] = lne & (2*hi+lo);
bloOld[sq] = attOld[sq] & allBits;
}
return attOld[sq];
}
Rasjid.
-
- Posts: 6401
- Joined: Thu Mar 09, 2006 8:30 pm
- Location: Chicago, Illinois, USA
Re: Debug help - getting bitboard attacks.
Be careful if you plan to build an SMP version. I had to have one of this sort of cache arrays per thread in Gaviota.Chan Rasjid wrote:With caching for the 4 direction attacks, the speedup (testing just 1 position) is 2.5% - still good. Caching could possibly be even used for my rotated bitboard; I will have to see.
Best regards,Code: Select all
#define HI(SQ) ( ~((u64)1) << (SQ)) #define LO(SQ) (((u64)(1) << (SQ)) - 1) #define ALLBITS 0xFFFFFFFFFFFFFFFFUL u64 attacks_diag7(const int sq) { static u64 attOld[64] = { -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1, -1,-1,-1,-1,-1,-1,-1,-1 }; static u64 bloOld[64]; if ((attOld[sq] & allBits) != bloOld[sq]) { u64 lne = board[sq]->bDiag7, lo = lne & allBits; u64 hi = (HI(sq) & lo); hi &=-hi; lo = ALLBITS << lastone((LO(sq) & lo)|1); attOld[sq] = lne & (2*hi+lo); bloOld[sq] = attOld[sq] & allBits; } return attOld[sq]; }
Rasjid.
Miguel