Ok guys, I've removed all of the on-the-fly allocation in the search and in the move generator, but I'm still getting similar performance out of the AI.
I even tried a plain alpha-beta:
Code: Select all
private int NegamaxMinimal(int _alpha, int _beta, int _depth)
{
    if (_depth <= 0) return Evaluate();
    searched_nodes++;
    sbyte targetIndex = board.GenerateMoves(moves_list[ply]);
    int score = 0;
    for (byte moveIndex = 0; moveIndex <= targetIndex; moveIndex++)
    {
        if (board.MakeMove(moves_list[ply][moveIndex]))
        {
            ply++;
            score = -NegamaxMinimal(-_beta, -_alpha, _depth - 1);
            ply--;
            board.UnMakeMove(moves_list[ply][moveIndex]);
            if (score >= _beta) return _beta;
            if (score > _alpha)
            {
                pv[ply] = moves_list[ply][moveIndex];
                _alpha = score;
            }
        }
    }
    return _alpha;
}
and the results are very similar in performance (300,000 positions in 3-4 seconds).
Visual Studio points at this function call inside Evaluate() as having a heavy CPU impact.
The function implementation:
Code: Select all
[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte GetLS1BIndex(ulong _bitboard)
{
    // (x & -x) isolates the least significant 1 bit; subtracting 1 turns it
    // into a mask of the trailing zeros, whose popcount is the LS1B index.
    return CountBits((_bitboard & (0 - _bitboard)) - 1);
}

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte CountBits(ulong _bitboard)
{
    // SWAR popcount: sums bits in 2-, 4-, then 8-bit groups in parallel,
    // then folds the byte sums together with a final multiply.
    const ulong c1 = 0x_55555555_55555555ul;
    const ulong c2 = 0x_33333333_33333333ul;
    const ulong c3 = 0x_0F0F0F0F_0F0F0F0Ful;
    const ulong c4 = 0x_01010101_01010101ul;
    _bitboard -= (_bitboard >> 1) & c1;
    _bitboard = (_bitboard & c2) + ((_bitboard >> 2) & c2);
    _bitboard = (((_bitboard + (_bitboard >> 4)) & c3) * c4) >> 56;
    return (byte)_bitboard;
}
That's the fastest way of doing it I've found so far... I know that in .NET 5 there is an API that maps this to a native CPU instruction, but I can't use .NET 5 since Unity doesn't support it; I have to stick with 4.7.
Any ideas? :/
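One alternative I'm considering is the classic De Bruijn multiplication bitscan, which replaces the whole popcount chain with a single multiply and a table lookup. This is just a sketch (constant and table are the standard 64-bit De Bruijn pair from the bitscan literature; I haven't benchmarked it in Unity yet):

```csharp
// De Bruijn bitscan-forward: index of the least significant 1 bit.
// Like GetLS1BIndex, behaviour is undefined for _bitboard == 0.
private static readonly byte[] DeBruijnIndex64 =
{
     0, 47,  1, 56, 48, 27,  2, 60,
    57, 49, 41, 37, 28, 16,  3, 61,
    54, 58, 35, 52, 50, 42, 21, 44,
    38, 32, 29, 23, 17, 11,  4, 62,
    46, 55, 26, 59, 40, 36, 15, 53,
    34, 51, 20, 43, 31, 22, 10, 45,
    25, 39, 14, 33, 19, 30,  9, 24,
    13, 18,  8, 12,  7,  6,  5, 63
};

[MethodImpl(MethodImplOptions.AggressiveInlining)]
public static byte GetLS1BIndexDeBruijn(ulong _bitboard)
{
    // _bitboard ^ (_bitboard - 1) sets the LS1B and every bit below it;
    // the De Bruijn multiply then hashes that mask to a unique 6-bit index.
    return DeBruijnIndex64[((_bitboard ^ (_bitboard - 1)) * 0x03f79d71b4cb0a89ul) >> 58];
}
```

No idea yet whether the memory access for the table beats the pure-register popcount version on my target hardware, which is why I'd want to profile both.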