32 bit versions for bitscan64

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: 32 bit versions for bitscan64

Post by mcostalba »

Desperado wrote: 1. why does bsf call itself ?
In C++ you can overload on argument type, so perhaps there is a

Code: Select all

int bsf(U32);
somewhere....
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: 32 bit versions for bitscan64

Post by Aleks Peshkov »

Code: Select all

    inline U32 lo&#40;U64 b&#41; &#123; return small_cast<U32>&#40;b&#41;; &#125;
    inline U32 hi&#40;U64 b&#41; &#123; return static_cast<U32>&#40;b >> 32&#41;; &#125;

    inline index_t bsf&#40;U64 value&#41; &#123;
        U32 lo = &#58;&#58;lo&#40;value&#41;;
        return bsf&#40;lo ? lo &#58; hi&#40;value&#41;) + &#40;lo ? 0 &#58; 32&#41;;
    &#125;
1. why does bsf call itself ?
A bsf(U64) function calls a different bsf(U32) function.

2. i cannot find a small_cast operator (although i know what it should do)
Well, it is my own operator.

Code: Select all

template <typename T>
inline T universe&#40;) &#123; return static_cast<T>(~static_cast<T>&#40;0&#41;); &#125;

template <typename T, typename N>
inline T small_cast&#40;N n&#41; &#123; return static_cast<T>&#40;n & universe<T>()); &#125;
3. and what is the issue between the _trivial_ c version and the c++ version ? In other words, what makes the c++ version more complex than the pure c version?
Sorry, I did not assume anyone take my code literaly and did not want to open discussion about C++ syntax tricks. :)
4.i thought explicit putting the (reduced)type in front of more complex type is enough for type conversion downwards.
Simple cast to smaller type should emit a warning in any compiler. And at least Visual C++ debug build tests and breaks an execution of a program when silent conversion losers significant bits.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: 32 bit versions for bitscan64

Post by mcostalba »

Aleks Peshkov wrote: Well, it is my own operator.

Code: Select all

template <typename T>
inline T universe&#40;) &#123; return static_cast<T>(~static_cast<T>&#40;0&#41;); &#125;

template <typename T, typename N>
inline T small_cast&#40;N n&#41; &#123; return static_cast<T>&#40;n & universe<T>()); &#125;
Well I sometimes find myself to indulge on C++ obscurities but indeed you beat me hands down ! :-)
Simple cast to smaller type should emit a warning in any compiler. And at least Visual C++ debug build tests and breaks an execution of a program when silent conversion losers significant bits.
A cast is not a silent conversion and does not make the compiler emit any warning.

Code: Select all

void test &#40;U64 b&#41; &#123;

   U32 a = b; // This is a warning
   U32 c = &#40;U32&#41;b; // This is not !
&#125;
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: 32 bit versions for bitscan64

Post by Desperado »

Thx! Aleks for the really fast reply .
(i thought of some things you mentioned, all hiden in your objects :) )

but now for marco, some _UNTESTED_ code (and maybe not really outworked code). but it should give the right direction...and maybe it can
be used like it is.

Code: Select all


BTB_T only_pop1&#40;BTB_T &bb&#41;
	&#123;
	 UI_32 *ptr = &#40;UI_32*)&bb;

	 ULONG id=64;

	 _BitScanForward&#40;&id,*&#40;ptr&#41;) ? id &#58; _BitScanForward&#40;&id,*&#40;ptr+=1&#41;);

	 return&#40;*ptr^=BB&#40;id&#41;);
	&#125;


ULONG pop_and_index1&#40;BTB_T &bb&#41;
	&#123;
	 UI_32 *ptr = &#40;UI_32*)&bb;

	 ULONG id=64;

	 _BitScanForward&#40;&id,*&#40;ptr&#41;) ? id &#58; _BitScanForward&#40;&id,*&#40;ptr+=1&#41;);
	 
	 *ptr^=&#40;BB&#40;id&#41;);

	 return&#40;ptr==&#40;UI_32*)&bb ? id &#58; 32+id&#41;;
	&#125;

This has to be validated of course (if it works, not time yet to do), and if so of course some instructions may be saved by tuning.

today no more time left for this... til tomorrow :)
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: 32 bit versions for bitscan64

Post by mcostalba »

How is defined:

Code: Select all

BB&#40;id&#41;
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: 32 bit versions for bitscan64

Post by Desperado »

Oh...

#define BB(SQ64) ((UI_64) 1 << (SQ64))

cheers
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: 32 bit versions for bitscan64

Post by mcostalba »

At first glance this seems wrong:

Code: Select all

_BitScanForward&#40;&id,*&#40;ptr+=1&#41;); 
You may not want to update the pointer, perhaps this one:

Code: Select all

_BitScanForward&#40;&id,*&#40;ptr+1&#41;); 

Also this one is not very clear

Code: Select all

*ptr^=&#40;BB&#40;id&#41;); 
Because ptr points to a 32bit quantity while BB() is a 64 bits.
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: 32 bit versions for bitscan64

Post by Desperado »

Well , i _want_ to update the pointer -> ptr+=1 !

so i can update the upper half of the 64 bit value,
(the pointer is only conditionally updated (i think), if the right exression
is executed)

and you are right, in this case instead of BB(x) there should
be used

#define BIT32(X) ((UI_32)1 << (X))...

where x is:

(0...x...31)

i have to go...
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: 32 bit versions for bitscan64

Post by mcostalba »

I have tested the below version on my Intel Core 2 Duo but with no results, perhaps a very very slightly increase but well below 1%

Code: Select all

Square pop_1st_bit&#40;Bitboard* bb&#41; &#123;

  uint32_t* ptr = &#40;uint32_t*&#41;bb;
  unsigned long id;

  if (*ptr&#41;
  &#123;
      _BitScanForward&#40;&id, *ptr&#41;;
      *ptr ^= &#40;1 << id&#41;;
  &#125;
  else
  &#123;
      ptr += 1;
      _BitScanForward&#40;&id, *ptr&#41;;
      *ptr ^= &#40;1 << id&#41;;
      id += 32;
  &#125;
  return Square&#40;id&#41;;
&#125;

This is the disassembled code (MSVC):

Code: Select all

Square pop_1st_bit&#40;Bitboard* bb&#41; &#123;
0040E760  push        ecx  

  uint32_t* ptr = &#40;uint32_t*&#41;bb;
  unsigned long id;

  if (*ptr&#41;
0040E761  mov         edx,dword ptr &#91;esi&#93;  
0040E763  push        edi  
  &#123;
      _BitScanForward&#40;&id, *ptr&#41;;
      *ptr ^= &#40;1 << id&#41;;
0040E764  mov         edi,1  
0040E769  test        edx,edx  
0040E76B  je          pop_1st_bit+1Bh &#40;40E77Bh&#41;  
0040E76D  bsf         eax,edx  
0040E770  mov         ecx,eax  
0040E772  shl         edi,cl  
0040E774  xor         edi,edx  
0040E776  mov         dword ptr &#91;esi&#93;,edi  
0040E778  pop         edi  
  &#125;
  return Square&#40;id&#41;;
&#125;
0040E779  pop         ecx  
0040E77A  ret  
  &#125;
  else
  &#123;
      ptr += 1;
      _BitScanForward&#40;&id, *ptr&#41;;
0040E77B  mov         edx,dword ptr &#91;esi+4&#93;  
0040E77E  bsf         ecx,edx  
      *ptr ^= &#40;1 << id&#41;;
0040E781  shl         edi,cl  
0040E783  mov         eax,ecx  
0040E785  xor         edi,edx  
0040E787  mov         dword ptr &#91;esi+4&#93;,edi  
      id += 32;
0040E78A  add         eax,20h  
0040E78D  pop         edi  
  &#125;
  return Square&#40;id&#41;;
&#125;
0040E78E  pop         ecx  
0040E78F  ret  
Gerd Isenberg
Posts: 2250
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: 32 bit versions for bitscan64

Post by Gerd Isenberg »

Desperado wrote:Hi Marco,

first i have to correct my last post !!!

Putting the table to global scope, everything changes!
!oups, how that? You forgot const? Have you inspected assembly?

This is how Matt's routine should look like in x86-32. If you have 64 mov [mem], someconst instructions before, you may use a char array, or use static const, but I guess const should be fine for compile time initialization.

Code: Select all

; input  edx&#58;ecx bb
; output eax bit
mov  eax, ecx
sub  ecx, 1
mov  ebx, edx
sbb  edx, 0
xor  eax, ecx
xor  eax, ebx
xor  eax, edx
imul eax, 78291ACFH
shr  eax, 26
mov  eax, dword ptr &#91;lookup + 4*eax&#93; ; movzx  eax, byte ptr &#91;lookup + eax&#93;
// ~ 12 cycles
Desperado wrote:The result seems to be equal(about)....(for _bitscan alone_)
BUT on _anti_ bitscan machine.
The bsf approach may suffer from miss-predicted branches. 32-bit bsf K7/K8 is alone 8 cycles vector path.