DIRECT BITBOARD MOVEGENERATION

Discussion of chess software programming and technical issues.

Moderator: Ras

User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: DIRECT BITBOARD MOVEGENERATION

Post by Desperado »

Well,

reversed = (UI_64) 1 << (63-bsf64(bb))

is what i need. (uhps...63 not 64 btw :) )


Unfortunately, bsf is not that faster as we would like it. Same for lzcnt.
Furthermore i was creating the version only because to eliminate the bitscans.
And because the bit is already isolated, i thought of logical operations
(like rot,or,and...) to get the bb reversed.

i dont give up,and keep it always in mind.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: DIRECT BITBOARD MOVEGENERATION

Post by mcostalba »

Desperado wrote:Well,

reversed = (UI_64) 1 << (63-bsf64(bb))

is what i need. (uhps...63 not 64 btw :) )


Unfortunately, bsf is not that faster as we would like it. Same for lzcnt.
Furthermore i was creating the version only because to eliminate the bitscans.
And because the bit is already isolated, i thought of logical operations
(like rot,or,and...) to get the bb reversed.

i dont give up,and keep it always in mind.
Here what comes to my mind is a De Bruijn Multiplication with a following lookup that retrieve the needed bitboard directly, without passing through the square.
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: DIRECT BITBOARD MOVEGENERATION

Post by Desperado »

This idea is to find somewhere else in this thread already.
Doesn t profit, because in this case we may better use
the kindgarten-bb approach.(mul,sh,lookup attacks).

And something totally different now: how can i add

PersonX wrote:

text

to my post.(doesnt see a tag for that...)
User avatar
rvida
Posts: 481
Joined: Thu Apr 16, 2009 12:00 pm
Location: Slovakia, EU

Re: DIRECT BITBOARD MOVEGENERATION

Post by rvida »

Desperado wrote: And something totally different now: how can i add

PersonX wrote:

text

to my post.(doesnt see a tag for that...)
Its Quote="PersonX" tag, ends with /Quote. Place it in [].
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: DIRECT BITBOARD MOVEGENERATION

Post by Desperado »

Thx Richard.

But not to forget beside the bit-twiddles...

Reverse OD improvements(snap for single line):
Desperado wrote:hi,

first some little changes (improvements).

Code: Select all

	 lo = msk[MSK_SWNE][sq64].flipped_hi & occ[1];	//occ[1] == bsw,doesnt need 2nd adress,saves xor
	 lo|=bR8;										          //tried bts -> was worse
	 lo&=-lo;										          //LSB
	 lo = _byteswap_uint64(lo);						    //lsb becomes msb of lo

	 hi = msk[MSK_SWNE][sq64].line_hi & occ[0];		//occ[0] == occ
	 hi&=-hi;

	 diag = msk[MSK_SWNE][sq64].line_ex & (2*hi-lo);

a little correction: it must be bH8 instead of bR8 of course...

I want to add, that this code runs 4x slower when 32 bit comiled than 64 bit compiled!
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: DIRECT BITBOARD MOVEGENERATION

Post by Desperado »

Hi again.

Now some days have been passed by and i thought about implentation of
OD (ROD). While doing this just another idea came to my mind, which i like to present. (maybe discuss and work out).

1. So, i am not really able to decide which implementation i should use,
because of so many different architectures. Well, i am interested what
guides some of you are following, making such decisions ?

2. As it turned out, there wasnt a really fast way to get the MSBs aside
from using fast machine instructions. A de bruijn _magic_ calculation
for a _single_ bit wasnt the right way with respect to the Kindergarten
approach...

But we can do a lookup for _both_ MSBs in parallel with the magic
approach.(one magic calculation, result: 2msbs)

Memory Requ. (about):

Code: Select all


   a. Index: 4K Content: 4K (1byte / entry) (maybe: pre/post masked)

       - shift sq to h8
       - magic lookup
       - shift result back to sq
 
       the byte is holding the relative shift numbers for corresponding
       directions.

       byte = (shift_nb_of_files_ww<<3) | (shift_nb_of_ranks_ss) (example for rook)

Code: Select all


      b. Index: 16K Content: 16K (1byte / entry)

      magic lookup for each square itself (not shared)

       byte = (shift_nb_of_files_ww<<3) | (shift_nb_of_ranks_ss) (example for rook)
Of course i have in mind some alternatives, but i hope these 2 are enough to show the basic idea.

Well, i am very interested in point _1._ and some remarks on it.

Thx
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: DIRECT BITBOARD MOVEGENERATION

Post by Gerd Isenberg »

Desperado wrote:Hi again.

Now some days have been passed by and i thought about implentation of
OD (ROD). While doing this just another idea came to my mind, which i like to present. (maybe discuss and work out).

1. So, i am not really able to decide which implementation i should use,
because of so many different architectures. Well, i am interested what
guides some of you are following, making such decisions ?

2. As it turned out, there wasnt a really fast way to get the MSBs aside
from using fast machine instructions. A de bruijn _magic_ calculation
for a _single_ bit wasnt the right way with respect to the Kindergarten
approach...

But we can do a lookup for _both_ MSBs in parallel with the magic
approach.(one magic calculation, result: 2msbs)

Memory Requ. (about):

Code: Select all


   a. Index: 4K Content: 4K (1byte / entry) (maybe: pre/post masked)

       - shift sq to h8
       - magic lookup
       - shift result back to sq
 
       the byte is holding the relative shift numbers for corresponding
       directions.

       byte = (shift_nb_of_files_ww<<3) | (shift_nb_of_ranks_ss) (example for rook)

Code: Select all


      b. Index: 16K Content: 16K (1byte / entry)

      magic lookup for each square itself (not shared)

       byte = (shift_nb_of_files_ww<<3) | (shift_nb_of_ranks_ss) (example for rook)
Of course i have in mind some alternatives, but i hope these 2 are enough to show the basic idea.

Well, i am very interested in point _1._ and some remarks on it.

Thx
I suggest to hide the implementation, but to keep your original idea of OD using bsr, maybe with one alternative approach (f.i. 32-bit Kindergarten BB) which you may conditionally compile on K8 or 32-bit systems.

For your other proposals, well it is to hot today.
Takes some time ;-)

Cheers,
Gerd