That's very, very cool! I will have to think about how to automatically detect this kind of pattern.

tcusr wrote: ↑Mon Apr 18, 2022 10:55 pm
i found the pattern
the only problem now is that we have to find a fast function that finds the byte in which a bit is placed
Code: Select all
def find_byte(n):
    for i in range(0, 8):
        if (n >> 8 * i) & 0xff:
            return i + 1

def dir_HO(n):
    b = find_byte(n)
    return (1 << b * 8) - n - (1 << (b - 1) * 8)

for n in range(0, 64):
    print(1 << n, " -> ", dir_HO(1 << n))

Thank you for that solution!
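
For a single-bit n, dir_HO(n) is just the rank (byte) containing the bit, with the bit itself removed. One way to get that without locating the byte at all is to smear the bit in both directions inside its byte with three shift-and-mask steps each. This is only a sketch of that idea: the name dir_HO_fill and the constants are my own, it assumes n has exactly one bit set, and whether it beats a countlzero-based version would have to be measured.

```python
def find_byte(n):
    # Reference from the quoted post: 1-based index of the first non-zero byte.
    for i in range(8):
        if (n >> 8 * i) & 0xFF:
            return i + 1

def dir_HO(n):
    # Reference from the quoted post.
    b = find_byte(n)
    return (1 << b * 8) - n - (1 << (b - 1) * 8)

def dir_HO_fill(n):
    # Hypothetical loop-free variant; assumes n has exactly one bit set.
    # Smear the bit upwards; each mask clears bits that crossed into the next byte.
    up = n
    up |= (up << 1) & 0xFEFEFEFEFEFEFEFE
    up |= (up << 2) & 0xFCFCFCFCFCFCFCFC
    up |= (up << 4) & 0xF0F0F0F0F0F0F0F0
    # Smear the bit downwards; each mask clears bits that crossed into the previous byte.
    down = n
    down |= (down >> 1) & 0x7F7F7F7F7F7F7F7F
    down |= (down >> 2) & 0x3F3F3F3F3F3F3F3F
    down |= (down >> 4) & 0x0F0F0F0F0F0F0F0F
    # up | down is the full byte; remove the square itself.
    return (up | down) ^ n

# Compare against the reference for all 64 squares.
for sq in range(64):
    assert dir_HO_fill(1 << sq) == dir_HO(1 << sq)
```

The masks also keep the result inside 64 bits, so the same steps should carry over to a uint64_t without extra masking.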

I also did a lot of thinking about how to avoid the countlzero that always seems to pop up.
Possible Solution:
When any of the many algorithms is run with occ = 0, the 4 ray masks are generated as a consequence!
If the algorithm itself does not depend on the 4 masks, then it's a win.
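
To make the occ = 0 point concrete, here is a deliberately slow reference sketch, not one of the fast algorithms from this thread. It walks each ray square by square and stops at the first blocker, so with an empty board every ray runs to the edge and the 4 masks fall out. The names ray, line and four_masks are made up for this example.

```python
def ray(sq, occ, d_file, d_rank):
    # Squares attacked from sq in one direction, including the first blocker.
    attacks = 0
    f, r = sq % 8 + d_file, sq // 8 + d_rank
    while 0 <= f < 8 and 0 <= r < 8:
        bit = 1 << (r * 8 + f)
        attacks |= bit
        if occ & bit:
            break  # blocked: the ray ends on this square
        f += d_file
        r += d_rank
    return attacks

def line(sq, occ, d_file, d_rank):
    # Both directions of one line through sq.
    return ray(sq, occ, d_file, d_rank) | ray(sq, occ, -d_file, -d_rank)

def four_masks(sq):
    # occ = 0: nothing blocks, so each line is the full mask (without sq itself).
    return (
        line(sq, 0, 1, 0),   # horizontal
        line(sq, 0, 0, 1),   # vertical
        line(sq, 0, 1, 1),   # diagonal
        line(sq, 0, 1, -1),  # anti-diagonal
    )

for mask in four_masks(0):
    print(hex(mask))
```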
A prime candidate is https://github.com/Gigantua/Chess_Moveg ... in/QBB.hpp
because of its simplicity and use of shifted squares. (1ull << sq is just the native Bitboard form of sq.)
With the sifter I can find alternative forms of many patterns and can exclude countlzero.
So for example these two are equivalent:
Code: Select all
(0x0101010101010101ULL << sq) // sq = 0..63
(0x0101010101010101ULL << popcount(sq_mask - popcount(sq_mask))) // sq_mask = 1ull << sq, i.e. 1ull << 0 .. 1ull << 63
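
Why the second form works: for a single-bit sq_mask, popcount(sq_mask) is 1, so sq_mask - 1 has exactly the sq bits below the square set, and its popcount is sq again. A small Python check of that equivalence, assuming a single-bit sq_mask and masking to 64 bits to mimic the unsigned shift (the helper names are mine):

```python
MASK64 = (1 << 64) - 1
FILE_A = 0x0101010101010101

def popcount(x):
    # Portable popcount; int.bit_count() would also do on Python 3.10+.
    return bin(x).count("1")

def form_sq(sq):
    # First form: takes the square index.
    return (FILE_A << sq) & MASK64

def form_mask(sq_mask):
    # Second form: takes the single-bit bitboard of the square.
    return (FILE_A << popcount(sq_mask - popcount(sq_mask))) & MASK64

# Both forms agree on all 64 squares.
for sq in range(64):
    assert form_sq(sq) == form_mask(1 << sq)
```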

So in general I will try to find a solution that accepts 1ull << sq as input and, with occ = 0, simplifies to a fast, simple, lookup-free function.
Two things bugged me in this thread:
- The final bishop solution has a conditional in it.
- The squares have to be in 0..63 form to generate the 4 masks.
I think we are close to a fast solution for both points above. If a solution is found here, it will accelerate all Bitboard algorithms that depend on the 4 ray masks (which incidentally includes the 3 fastest GPU algorithms and all of the fastest CPU algorithms after PEXT). So, in summary, this is a challenge worth thinking about.