Ok guys, I've removed all of the on-the-fly allocation in the search and in the move generator, but I'm still getting similar performance out of the AI.
I even tried a plain alpha-beta:
Code: Select all
private int NegamaxMinimal(int _alpha, int _beta, int _depth)
{
    if (_depth <= 0) return Evaluate();
    searched_nodes++;
    sbyte targetIndex = board.GenerateMoves(moves_list[ply]);
    int score = 0;
    for (byte moveIndex = 0; moveIndex <= targetIndex; moveIndex++)
    {
        if (board.MakeMove(moves_list[ply][moveIndex]))
        {
            ply++;
            score = -NegamaxMinimal(-_beta, -_alpha, _depth - 1);
            ply--;
            board.UnMakeMove(moves_list[ply][moveIndex]);
            if (score >= _beta) return _beta;
            if (score > _alpha)
            {
                pv[ply] = moves_list[ply][moveIndex];
                _alpha = score;
            }
        }
    }
    return _alpha;
}
and the results are very similar in performance (300,000 positions in 3-4 seconds).
Visual Studio points at this function call inside Evaluate() as having a heavy CPU impact.
The function implementation:
Code: Select all
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte GetLS1BIndex(ulong _bitboard)
{
    // (x & -x) isolates the least significant 1 bit; subtracting 1 turns it
    // into a mask of the trailing zeros, whose popcount is the LS1B index.
    return CountBits((_bitboard & (0 - _bitboard)) - 1);
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte CountBits(ulong _bitboard)
{
    // SWAR popcount: sums bits in 2-, 4-, then 8-bit groups in parallel,
    // then folds the byte sums together with a final multiply.
    const ulong c1 = 0x_55555555_55555555ul;
    const ulong c2 = 0x_33333333_33333333ul;
    const ulong c3 = 0x_0F0F0F0F_0F0F0F0Ful;
    const ulong c4 = 0x_01010101_01010101ul;
    _bitboard -= (_bitboard >> 1) & c1;
    _bitboard = (_bitboard & c2) + ((_bitboard >> 2) & c2);
    _bitboard = (((_bitboard + (_bitboard >> 4)) & c3) * c4) >> 56;
    return (byte)_bitboard;
}
That's the fastest way of doing it I've found so far... I know that in .NET 5 there is an API that maps this to a native CPU instruction, but I can't use .NET 5 since Unity doesn't support it; I have to stick with 4.7.
Any ideas? :/
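One alternative I'm considering is the classic De Bruijn multiplication bitscan, which replaces the whole popcount chain with a single multiply and a table lookup. This is just a sketch (constant and table are the standard 64-bit De Bruijn pair from the bitscan literature; I haven't benchmarked it in Unity yet):

```csharp
// De Bruijn bitscan-forward: index of the least significant 1 bit.
// Like GetLS1BIndex, behaviour is undefined for _bitboard == 0.
private static readonly byte[] DeBruijnIndex64 =
{
     0, 47,  1, 56, 48, 27,  2, 60,
    57, 49, 41, 37, 28, 16,  3, 61,
    54, 58, 35, 52, 50, 42, 21, 44,
    38, 32, 29, 23, 17, 11,  4, 62,
    46, 55, 26, 59, 40, 36, 15, 53,
    34, 51, 20, 43, 31, 22, 10, 45,
    25, 39, 14, 33, 19, 30,  9, 24,
    13, 18,  8, 12,  7,  6,  5, 63
};

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte GetLS1BIndexDeBruijn(ulong _bitboard)
{
    // _bitboard ^ (_bitboard - 1) sets the LS1B and every bit below it;
    // the De Bruijn multiply then hashes that mask to a unique 6-bit index.
    return DeBruijnIndex64[((_bitboard ^ (_bitboard - 1)) * 0x03f79d71b4cb0a89ul) >> 58];
}
```

No idea yet whether the memory access for the table beats the pure-register popcount version on my target hardware, which is why I'd want to profile both.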