The wrong way

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: The wrong way

Post by Sven »

hgm wrote:I dislike the dumbfill code because it is repetitive in the lines in the routines for a single direction (making it more bug prone, as you might mis-count the number of repetitions), as well as needing separate routines for a lot of different directions. And O^O-2R also does not need pre-calculated data. But I don't see that as an advantage other than for speed. Bitboards heavily make use of pre-calculated tables, for the leapers if not for the sliders. If you want to use them, better get used to it.
You are wrong regarding the need of precalculated data for "Hyperbola Quintessence": https://chessprogramming.wikispaces.com ... intessence

Code: Select all

struct
{
   U64 bitMask;         // 1 << sq for convenience
   U64 diagonalMaskEx;
   U64 antidiagMaskEx;
   U64 fileMaskEx;
&#125; smsk&#91;64&#93;; // 2 KByte
Not much but also not "nothing". Dumb7fill only needs four 64-bit constants for a-/h file and 1st/8th rank, I would not call that "precalculation".
hgm wrote:'Kindergarten bitboards' would seem a much more educative first step to get acquainted with bitboards and teach you the elementary techniques. Dumbfill more seems an example of how to not do things.
If you imply that "Kindergarten bitboards" do not need precalculated data then you would be wrong again, they need about the same precalculation infrastructure as "Hyperbola Quintessence", see here: https://chessprogramming.wikispaces.com ... +Bitboards

Also the implementation of "Kindergarten bitboards" is not as simple as you state, please note the section about file attacks on that same page. It is certainly a lot more bit twiddling than the simple "dumb7fill" or similar techniques.

Few years ago Michael Hoffmann has posted a really simple attack getter solution based on direct calculation, it is one of the nicest ones to my taste and will certainly cause No headache :-) So I'd even recommend that one instead of dumb7fill if you prefer very short code ...
User avatar
hgm
Posts: 27809
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The wrong way

Post by hgm »

Sven Schüle wrote:Not much but also not "nothing". Dumb7fill only needs four 64-bit constants for a-/h file and 1st/8th rank, I would not call that "precalculation".
Well, I would not call a few masks "precalculation". What I meant is that there are no tables of pre-calculated attack sets. And note that no mask was needed for the 'EastAttacks' that I gave.
hgm wrote:If you imply that "Kindergarten bitboards" do not need precalculated data then you would be wrong again, they need about the same precalculation infrastructure as "Hyperbola Quintessence", see here: https://chessprogramming.wikispaces.com ... +Bitboards
If there is anyone wrong it is you: It would need far more pre-calculation, as you use the masked-selected and multiply-collected bits to index a table of result bitboards. I thought I clearly stated (in connection with this) that I didn't care much on whether pre-calculation is needed (as long as it fits in L1), and that wanna-be bitboard users better get used to that.
Few years ago Michael Hoffmann has posted a really simple attack getter solution based on direct calculation, it is one of the nicest ones to my taste and will certainly cause No headache :-) So I'd even recommend that one instead of dumb7fill if you prefer very short code ...
Well, codewise I consider that a lot better. Educationally, it still teaches exactly what you should never do.
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: The wrong way

Post by Desperado »

Hello,


based on the O^O-2R trick the algorithm can be used for all lines (rank,file,diag,antiDiag) and does not require any lookup.

Code: Select all


bitboard_t AttackGetter&#58;&#58;generateLine&#40;square_t s,bitboard_t o,bitboard_t lineEx&#41;
&#123;
    bitboard_t lower  = Bitboard&#58;&#58;getLower&#40;s&#41; & o & lineEx;
    bitboard_t upper  = Bitboard&#58;&#58;getUpper&#40;s&#41; & o & lineEx;
    bitboard_t msb    = Bitboard&#58;&#58;getBit&#40;BitUtil&#58;&#58;scan64Reverse&#40;lower|1&#41;);
    bitboard_t lsb    = upper & -upper;
    bitboard_t oDif   = ( lsb << 1 ) - msb;
    return lineEx & oDif;
&#125;
It the snippet above it is intended to "hide" how the lineEx parameter is retrieved.
It is confusing to show a lookup for a complete direct computation algorithm. LineEx can be computed on the fly and can be precalculated of course too.

This should be easy enough to implement and to understand. Here you can find details for: Obstruction difference
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: The wrong way

Post by Desperado »

OD addresses the mentioned points you were discussing in this post:

hgm wrote:
Note that, although I wrote it myself because I thought it was obvious, I cannot be credited with the invention, as it was already known (as commonly happens with obvious things), under the name O^(O-2R) trick.

As usual, you can force the carry (borrow, really) propagation to skip bits by clearing them. (You can see this as having the slider move not along files or (anti-)diagonals, but as a raster scan along ranks after having removed all obstructions not on the ray where it should really move, and at the end masking away everything not on that ray.)

Unfortunately this only works for slides towards the MSB. Although for files and (anti-)diagonal slides you can repair that by reversing the byte order, (and at the end reversing it back), for which there is an x86 instruction. Unfortunately there aren't x86 instructions to reverse the bit order. So moving towards the LSB along a rank remains a problem.
I have not tried it yet. But the main problem I see already now is that your proposal, although it is most probably faster for the specific case you describe, is not suitable for the purpose I was recommending it for. It requires
- different solutions for different sliding directions,
- additional coding effort for some of them,
and as far as I can see without testing (but I may be wrong here) it also requires an additional infrastructure since I cannot pass the bitboard of all occupied squares on the whole board as 'bbOccupied' parameter but need to restrict it to one ray (the "eastern" ray in the given example), for which I need to provide precalculated data.

All of that is "trivial" when seeing it in the context of a mature engine. But then it is also "obsolete" since there are much faster solutions than "dumb7fill" or the improved version you have mentioned. However, in the context we are discussing in this thread I still believe that "dumb7fill" is a better compromise:
- one common solution for all 8 directions,
- no additional infrastructure or precalculated data needed,
- and therefore it can be used immediately.

Nevertheless I will take a deeper look into your partially better version as soon as I have time for it.

EDIT: O^(O-2R) is also known under the name "Hyperbola Quintessence", see the link above. The description confirms what you say regarding its use for rank attacks towards the LSB, i.e. "westAttacks" in the context above. It is obvious that "Hyperbola Quintessence" outperforms "dumb7fill" but my arguments given above are still valid. For a "professional" approach I would still vote for magic bitboards, of course.
User avatar
hgm
Posts: 27809
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: The wrong way

Post by hgm »

At the risk of being accused of thread hijacking, what do you think of the following method to exploit the O^O-2R trick:

* mask the relevant ray (6 central bits) from 'occupied'
* multiply with the square- and orientation-dependent factor to collect all bits in the upper byte, preserving their order
* shift them to the lower byte
* multiply with 0x0102040810204081 to fill the other bytes with progressively more right-shifted copies of it
* mask with 0x01020308102040FF to keep the low byte and the diagonal, the latter now containing the bits in reverse order
* Apply the O^O-2R trick, where '2R' comes from a lookup table that has a bit set in the lower byte to do the 'positive' ray, and a bit in the applicable other byte to do the 'negative' ray in the diagonal
* mask again with 0x01020408102040FF to erase the carry propagation path between the diagonal bits
* multiply with 0x0101010101010101 to re-integrate the positive and negative ray half in the upper byte
* shift it to the lower byte
* multiply with the applicable square-dependent constant to map the lower byte back to the original ray
* apply the ray mask again

For ranks the initial two and final two steps are not needed.
User avatar
Desperado
Posts: 879
Joined: Mon Dec 15, 2008 11:45 am

Re: The wrong way

Post by Desperado »

hgm wrote:At the risk of being accused of thread hijacking, what do you think of the following method to exploit the O^O-2R trick:

* mask the relevant ray (6 central bits) from 'occupied'
* multiply with the square- and orientation-dependent factor to collect all bits in the upper byte, preserving their order
* shift them to the lower byte
* multiply with 0x0102040810204081 to fill the other bytes with progressively more right-shifted copies of it
* mask with 0x01020308102040FF to keep the low byte and the diagonal, the latter now containing the bits in reverse order
* Apply the O^O-2R trick, where '2R' comes from a lookup table that has a bit set in the lower byte to do the 'positive' ray, and a bit in the applicable other byte to do the 'negative' ray in the diagonal
* mask again with 0x01020408102040FF to erase the carry propagation path between the diagonal bits
* multiply with 0x0101010101010101 to re-integrate the positive and negative ray half in the upper byte
* shift it to the lower byte
* multiply with the applicable square-dependent constant to map the lower byte back to the original ray
* apply the ray mask again

For ranks the initial two and final two steps are not needed.
Of course i'm not able to estimate only from mind.
So the question is, if after step 3 in your description ( shift to lower byte ) a lookup like in Kindergarten is cheaper than to exploit the O^O-2R trick (and if it really works as described)

At the time i was thinking about OD, i nice idea was to use a reversed occupancy bitboard in combination with square^56.
This way the msb would become the lsb. But even this was not for free, i mean to reverse the occupancy bitboard.(which is finally one intrinsic function _byteswap_uint64)

If the AttackGetter is only(mainly) used for the current board, it may pay of to keep a reversed occupancy bitboard in the position structure.
Unfortunatelly a advanced engine will use the AttackGetter for "dynamic" occupancy too, like in SEE.
So, there would be the need to get a reversed occupancy on demand, which is costly.

So, considering the "_byteswap_uint64" idea as too costly to extract the msb in the given context, exploiting the O^O-2R trick in the way you described it seems also to be too expensive.
(Just looking at how many steps/instructions this would take. I may be wrong of course)

But ok, that are just thoughts by reading your description.
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: The wrong way

Post by Aleks Peshkov »

I use branchless Hyperbola attack getter, that can be used even for leapers.

Code: Select all

    static _t hyperbola&#40;_t v&#41; &#123; return v ^ &#58;&#58;bitReverse&#40;v&#41;; &#125;

    explicit operator Bb () const &#123; return Bb&#123; static_cast<Bb&#58;&#58;_t>( _mm_cvtsi128_si64&#40;this->_v&#41; ) &#125;; &#125;

public&#58;
    explicit ReverseBb &#40;Bb bb&#41; &#58; BitArray&#40; hyperbola&#40;_mm_cvtsi64_si128&#40;static_cast<__int64>&#40;bb&#41;)) ) &#123;&#125;

    Bb attack&#40;SliderType type, Square from&#41; const &#123;
        const Directions& dir = direction&#91;type&#93;&#91;from&#93;;

        _t a = dir&#91;Horizont&#93; & _mm_sub_epi64&#40;this->_v & dir&#91;Horizont&#93;, singleton&#91;from&#93;);
        a ^=   dir&#91;Vertical&#93; & _mm_sub_epi64&#40;this->_v & dir&#91;Vertical&#93;, singleton&#91;from&#93;);
        a ^=   dir&#91;Diagonal&#93; & _mm_sub_epi64&#40;this->_v & dir&#91;Diagonal&#93;, singleton&#91;from&#93;);
        a ^=   dir&#91;Antidiag&#93; & _mm_sub_epi64&#40;this->_v & dir&#91;Antidiag&#93;, singleton&#91;from&#93;);

        return Bb&#123; static_cast<Bb&#58;&#58;_t>( _mm_cvtsi128_si64&#40;hyperbola&#40;a&#41;) ) &#125;;
    &#125;
bitreverse:

Code: Select all

    _t operator&#40;) (_t v&#41; const &#123;
        _t lo = _mm_shuffle_epi8&#40;lo_nibble_reverse, lo_nibble_mask & v&#41;;
        _t hi = _mm_shuffle_epi8&#40;hi_nibble_reverse, lo_nibble_mask & _mm_srli_epi16&#40;v, 4&#41;);
        return byte_swap&#40;lo ^ hi&#41;;
    &#125;

    _t byte_swap&#40;_t v&#41; const &#123;
        return _mm_shuffle_epi8&#40;v, byte_reverse&#41;;
    &#125;
It uses 4k * PieceType(3) + 1k = 13k + 64
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: The wrong way

Post by Henk »

Sven Schüle wrote:
Henk wrote:ok, last optimization did not work. So I will copy (if I have time) the code given with the straight and diagonal pieces. I assume it is allowed.

See if that makes it faster.
It is boring to always read the same words in your posts, telling us that XY does/did not work for you. I suggest that you come back when you actually found something that works.
For captures no measurable speed gain. Commented lines was old version.
I still have to rewrite non captures. But it looks like that will be a waste of time too.

Code: Select all

                    case ChessPiece.Sort.whiteBishop&#58;
                            captures = (&#40;Field&#41;Location&#41;.DiagonalMoves&#40;Board.Occupiers, opponentPieces&#41;;
                            var moveDictWB = (&#40;Field&#41;Location&#41;.WhiteBishopMovesDict;
                            while &#40;captures != 0&#41;
                            &#123;
                                var moveBit = captures & (~captures + 1&#41;;
                                var move = moveDictWB&#91;moveBit&#93;;
                                switch &#40;move.End.Occupier.PieceKind&#41;
                                &#123;
                                    case ChessPiece.Kind.King&#58;
                                        moves.Clear&#40;);
                                        moves.Add&#40;move&#41;;
                                        return;
                                    case ChessPiece.Kind.Pawn&#58;
                                        move.Value = ChessPiece.PB_MVVA_VALUE;
                                        break;
                                    case ChessPiece.Kind.Knight&#58;
                                        move.Value = ChessPiece.NB_MVVA_VALUE;
                                        break;
                                    case ChessPiece.Kind.Bishop&#58;
                                        move.Value = ChessPiece.BB_MVVA_VALUE;
                                        break;

                                    case ChessPiece.Kind.Rook&#58;
                                        move.Value = ChessPiece.RB_MVVA_VALUE;
                                        break;

                                    case ChessPiece.Kind.Queen&#58;
                                        move.Value = ChessPiece.QB_MVVA_VALUE;
                                        break;
                                &#125;
                               
                                captures &= captures - 1;
                                moves.Add&#40;move&#41;;
                            &#125;
                            break;


                            //dir = 4;
                            //var xrayMovesB = (&#40;Field&#41;Location&#41;.dirMoves&#91;&#40;int&#41;ChessPiece.Sort.whiteBishop&#93;;
                            //var dirMovesB = Location.DirMoves&#40;ChessPiece.Sort.whiteBishop&#41;;
                            //while &#40;dir-- > 0&#41;
                            //&#123;

                            //    if (&#40;dirMovesB&#91;dir&#93; & blackPieces&#41; == 0&#41;
                            //    &#123;
                            //        // all moves in a direction don't hit any opponent pieces 
                            //        continue;
                            //    &#125;

                            //    var xrayDirMoves = xrayMovesB&#91;dir&#93;;
                            //    int n = xrayDirMoves.Count;
                            //    for &#40;int i = 0; i < n;)
                            //    &#123;
                            //        var move = xrayDirMoves&#91;i++&#93;;
                            //        // check each move, there can only be one capture in a direction 
                            //        ChessPiece capture = move.End.Occupier;
                            //        if &#40;capture == null&#41;
                            //        &#123;
                            //        &#125;
                            //        else if &#40;capture.Colour == ColorSign.Black&#41;
                            //        &#123;
                            //            switch &#40;capture.PieceKind&#41;
                            //            &#123;
                            //                case ChessPiece.Kind.King&#58;
                            //                    moves.Clear&#40;);
                            //                    moves.Add&#40;move&#41;;
                            //                    return;
                            //                case ChessPiece.Kind.Pawn&#58;
                            //                    move.Value = ChessPiece.PB_MVVA_VALUE;
                            //                    break;
                            //                case ChessPiece.Kind.Knight&#58;
                            //                    move.Value = ChessPiece.NB_MVVA_VALUE;
                            //                    break;
                            //                case ChessPiece.Kind.Bishop&#58;
                            //                    move.Value = ChessPiece.BB_MVVA_VALUE;
                            //                    break;

                            //                case ChessPiece.Kind.Rook&#58;
                            //                    move.Value = ChessPiece.RB_MVVA_VALUE;
                            //                    break;

                            //                case ChessPiece.Kind.Queen&#58;
                            //                    move.Value = ChessPiece.QB_MVVA_VALUE;
                            //                    break;
                            //            &#125;

                            //            moves.Add&#40;move&#41;;
                            //            break;
                            //        &#125;
                            //        else break;
                            //    &#125;
                            //&#125;
                            //break;
Henk
Posts: 7220
Joined: Mon May 27, 2013 10:31 am

Re: The wrong way

Post by Henk »

Shorter version for collecting captures ignoring promotions and en passant moves. It's approximately two times slower. Code must be called for sort = King .. Pawn

Code: Select all

   
                    var victims = Board.GetBits&#40;sort, opponentColorSign&#41;;
                    if &#40;victims != 0&#41;
                    &#123;
                        for &#40;ChessPiece.Kind k = ChessPiece.Kind.Pawn; k <= ChessPiece.Kind.King; k++)
                        &#123;
                            var attackers = Board.GetBits&#40;k, colorSign&#41;; 
                            AddCaptures&#40;moves, attackers, victims&#41;;
                        &#125;
                    &#125;
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: The wrong way

Post by Sven »

Henk wrote:Shorter version for collecting captures ignoring promotions and en passant moves. It's approximately two times slower. Code must be called for sort = King .. Pawn

Code: Select all

   
                    var victims = Board.GetBits&#40;sort, opponentColorSign&#41;;
                    if &#40;victims != 0&#41;
                    &#123;
                        for &#40;ChessPiece.Kind k = ChessPiece.Kind.Pawn; k <= ChessPiece.Kind.King; k++)
                        &#123;
                            var attackers = Board.GetBits&#40;k, colorSign&#41;; 
                            AddCaptures&#40;moves, attackers, victims&#41;;
                        &#125;
                    &#125;
The reason for being much slower will most probably be the implementation of AddCaptures() which you are hiding ...