Bitboards ?: C# vs C++

Discussion of chess software programming and technical issues.

Moderator: Ras

Chessnut1071
Posts: 313
Joined: Tue Aug 03, 2021 2:41 pm
Full name: Bill Beame

Re: Bitboards ?: C# vs C++

Post by Chessnut1071 »

dangi12012 wrote: Wed Nov 17, 2021 7:13 pm
Chessnut1071 wrote: Wed Nov 17, 2021 7:05 pm
emadsen wrote: Wed Nov 17, 2021 5:45 pm I just realized System.Numerics.BitOperations class "provides utility methods for intrinsic bit-twiddling operations. The methods use hardware intrinsics when available on the underlying platform; otherwise, they use optimized software fallbacks." See the doc. So no need for preprocessor directives, assuming Microsoft's software fallback code performs well.

I must have introduced the preprocessor directives when I was using System.Runtime.Intrinsics.X86 prior to the availability of System.Numerics.BitOperations.
I think a lot of those numeric bit operations were put in to satisfy the chess programming community once bit boards were introduced. I didn't realize the vast number of programmers hooked on this huge time sink. It really takes you to the edge on programming techniques though.
Nah we chessprogrammers get the scraps of cryptography and AI learning and compression instructions.
If we would get an instruction it would be one instruction to get all sliding piece attacks for the board in one cycle.
But compression gifted us with PEXT and thats the fastest slider lookup currently.

Anyways my very first movegen was in C# and the naive approach with bijective dictionaries. (Attacked squares and Attacked by list for each square)
and got 3MNps.
Okay, this works, but, it's slow as molasses, maybe slower. There's too may ifs.

public static int numTrailingBinaryZeros(UInt64 n)
{
UInt64 mask = 1;
for (int i = 0; i < 63; i++, mask <<= 1)
if ((n & mask) != 0)
return i;

return 63;

}
Anybody know anything faster?
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Bitboards ?: C# vs C++

Post by dangi12012 »

Chessnut1071 wrote: Wed Nov 17, 2021 9:24 pm
dangi12012 wrote: Wed Nov 17, 2021 7:13 pm
Chessnut1071 wrote: Wed Nov 17, 2021 7:05 pm
emadsen wrote: Wed Nov 17, 2021 5:45 pm I just realized System.Numerics.BitOperations class "provides utility methods for intrinsic bit-twiddling operations. The methods use hardware intrinsics when available on the underlying platform; otherwise, they use optimized software fallbacks." See the doc. So no need for preprocessor directives, assuming Microsoft's software fallback code performs well.

I must have introduced the preprocessor directives when I was using System.Runtime.Intrinsics.X86 prior to the availability of System.Numerics.BitOperations.
I think a lot of those numeric bit operations were put in to satisfy the chess programming community once bit boards were introduced. I didn't realize the vast number of programmers hooked on this huge time sink. It really takes you to the edge on programming techniques though.
Nah we chessprogrammers get the scraps of cryptography and AI learning and compression instructions.
If we would get an instruction it would be one instruction to get all sliding piece attacks for the board in one cycle.
But compression gifted us with PEXT and thats the fastest slider lookup currently.

Anyways my very first movegen was in C# and the naive approach with bijective dictionaries. (Attacked squares and Attacked by list for each square)
and got 3MNps.
Okay, this works, but, it's slow as molasses, maybe slower. There's too may ifs.

public static int numTrailingBinaryZeros(UInt64 n)
{
UInt64 mask = 1;
for (int i = 0; i < 63; i++, mask <<= 1)
if ((n & mask) != 0)
return i;

return 63;

}
Anybody know anything faster?
Yes. This one:
https://docs.microsoft.com/en-us/dotnet ... ew=net-5.0
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Chessnut1071
Posts: 313
Joined: Tue Aug 03, 2021 2:41 pm
Full name: Bill Beame

Re: Bitboards ?: C# vs C++

Post by Chessnut1071 »

dangi12012 wrote: Wed Nov 17, 2021 9:29 pm
Chessnut1071 wrote: Wed Nov 17, 2021 9:24 pm
dangi12012 wrote: Wed Nov 17, 2021 7:13 pm
Chessnut1071 wrote: Wed Nov 17, 2021 7:05 pm
emadsen wrote: Wed Nov 17, 2021 5:45 pm I just realized System.Numerics.BitOperations class "provides utility methods for intrinsic bit-twiddling operations. The methods use hardware intrinsics when available on the underlying platform; otherwise, they use optimized software fallbacks." See the doc. So no need for preprocessor directives, assuming Microsoft's software fallback code performs well.

I must have introduced the preprocessor directives when I was using System.Runtime.Intrinsics.X86 prior to the availability of System.Numerics.BitOperations.
I think a lot of those numeric bit operations were put in to satisfy the chess programming community once bit boards were introduced. I didn't realize the vast number of programmers hooked on this huge time sink. It really takes you to the edge on programming techniques though.
Nah we chessprogrammers get the scraps of cryptography and AI learning and compression instructions.
If we would get an instruction it would be one instruction to get all sliding piece attacks for the board in one cycle.
But compression gifted us with PEXT and thats the fastest slider lookup currently.

Anyways my very first movegen was in C# and the naive approach with bijective dictionaries. (Attacked squares and Attacked by list for each square)
and got 3MNps.
Okay, this works, but, it's slow as molasses, maybe slower. There's too may ifs.

public static int numTrailingBinaryZeros(UInt64 n)
{
UInt64 mask = 1;
for (int i = 0; i < 63; i++, mask <<= 1)
if ((n & mask) != 0)
return i;

return 63;

}
Anybody know anything faster?
Yes. This one:
https://docs.microsoft.com/en-us/dotnet ... ew=net-5.0
thanks, but, been there, done that, have the t-shirt. It doesn't work on my NET Framework 4.7.3. Do you know which version I need?
User avatar
emadsen
Posts: 441
Joined: Thu Apr 26, 2012 1:51 am
Location: Oak Park, IL, USA
Full name: Erik Madsen

Re: Bitboards ?: C# vs C++

Post by emadsen »

Chessnut1071 wrote: Wed Nov 17, 2021 9:35 pm thanks, but, been there, done that, have the t-shirt. It doesn't work on my NET Framework 4.7.3. Do you know which version I need?
I don't think the bitscan or popcount CPU intrinsics are supported in .NET Framework. You have to upgrade to .NET Core 3, .NET 5, or .NET 6.

Bmi1.X64.TrailingZeroCount(UInt64) and Lzcnt.X64 indicate they're not available for .NET Framework 4.7.2.
Erik Madsen | My C# chess engine: https://www.madchess.net
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Bitboards ?: C# vs C++

Post by hgm »

Chessnut1071 wrote: Wed Nov 17, 2021 9:24 pm Okay, this works, but, it's slow as molasses, maybe slower. There's too may ifs.

public static int numTrailingBinaryZeros(UInt64 n)
{
UInt64 mask = 1;
for (int i = 0; i < 63; i++, mask <<= 1)
if ((n & mask) != 0)
return i;

return 63;

}
Anybody know anything faster?
The way people did it before CPUs had bit-scan-forward instructions was by table[(n & -n)*DEBRUIJN>>58], with a suitable initialized table, and a De Bruijn constant.
tcusr
Posts: 325
Joined: Tue Aug 31, 2021 10:32 pm
Full name: tcusr

Re: Bitboards ?: C# vs C++

Post by tcusr »

dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Bitboards ?: C# vs C++

Post by dangi12012 »

Chessnut1071 wrote: Wed Nov 17, 2021 9:35 pm thanks, but, been there, done that, have the t-shirt. It doesn't work on my NET Framework 4.7.3. Do you know which version I need?
YES IT WORKS FOR YOU TOO. You need the system.numerics nuget package. There you have RAW BMI bitoperations...
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Bitboards ?: C# vs C++

Post by dangi12012 »

hgm wrote: Wed Nov 17, 2021 10:32 pm The way people did it before CPUs had bit-scan-forward instructions was by table[(n & -n)*DEBRUIJN>>58], with a suitable initialized table, and a De Bruijn constant.
Its literally perfect hashing with the hash function X * Y >> Z which has some nice properties. Mainly its fast to implement and if its not good enough you can add a plus there too. I dont know why this community calls or many C++ people call some things DEBRUIJN - and other "Magic" while its always perfect hashing into a lookup table to begin with.

That being said if bitscanforward is available its many times faster.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Bitboards ?: C# vs C++

Post by hgm »

Well, 'many' is a bit of an exaggeration. It uses multiply + shift + lookup, instead of the single instruction bsf. But often the bit number is not what you finally want, and a lookup indexed by it would have to be used anyway. With the DeBruijn multiplier you could tabulate the desired result directly, rather than the bit number. Then it is just 3 instructions vs 2, both involving a lookup. Slower, for sure, but only marginally so.

The reason for calling the multiplier DEBRUIJN here is that there is a mathematical theory that proves there exists a constant that does achieve the perfect hashing, and provides a simple recipe for calculating one. No such theory exists for the multipliers used in magic bitboard; these have to be found by trial and error, and the hashing is usually only perfect by virtue of the fact that the hashed data is highly degenerate. In any case the multipliers there are not DeBruijn numbers.
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Bitboards ?: C# vs C++

Post by dangi12012 »

hgm wrote: Thu Nov 18, 2021 10:46 am Well, 'many' is a bit of an exaggeration. It uses multiply + shift + lookup, instead of the single instruction bsf. But often the bit number is not what you finally want, and a lookup indexed by it would have to be used anyway. With the DeBruijn multiplier you could tabulate the desired result directly, rather than the bit number. Then it is just 3 instructions vs 2, both involving a lookup. Slower, for sure, but only marginally so.

The reason for calling the multiplier DEBRUIJN here is that there is a mathematical theory that proves there exists a constant that does achieve the perfect hashing, and provides a simple recipe for calculating one. No such theory exists for the multipliers used in magic bitboard; these have to be found by trial and error, and the hashing is usually only perfect by virtue of the fact that the hashed data is highly degenerate. In any case the multipliers there are not DeBruijn numbers.
Aaaah finally something new for me. I thought that DEBRUIJN is also found by brute force. (Which only takes 1-2s) to find a factor for all masks from
1ull << 0...63 that maps into 0..63 perfectly.
Thanks!
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer