How can i improve my C# engine speed?

spirch · Post by **spirch** » Wed Jun 30, 2021 8:39 pm

bulk counting vs not bulk counting, that always confused me a little bit

this is perft(4) (count of legal move of last depth)

Code: Select all

Total Moves : 4,085,603
Time Taken  : 58ms

this is perft(4) (actually calling the move / undo of each one)

Code: Select all

Total Moves : 4,085,603
Time Taken  : 148ms

i assume first one is bulk counting and second one is what people actually do? (in my case my engine also do legal move, never pseudo move)

also C# is (now) a speedy language, .net core is even better than .net framework, there was a huge amount of optimization done
look online to see newer benchmark done, there are a lot of them comparing c++ vs c# and others

it's always about knowing how to use it, how to profile, how to optimize and how not to shoot yourself in the foot

my engine is in c# using 0x88, i keep saying this but i wish i went bitboard and right now i don't have the energy/will of converting it .. one day ... one day ...

hgm · Post by **hgm** » Wed Jun 30, 2021 10:26 pm

spirch wrote: ↑Wed Jun 30, 2021 8:39 pmi assume first one is bulk counting and second one is what people actually do? (in my case my engine also do legal move, never pseudo move)

The first one is indeed bulk couning. What people actually do depends on whether they do bulk counting or not.

emadsen · Post by **emadsen** » Thu Jul 01, 2021 1:27 am

pedrojdm2021 wrote: ↑Wed Jun 30, 2021 6:49 am For example in Kiwipete position: (r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -) it does a search of depth 9 with this results:

Nodes searched: 1286811
Search time: 12596ms ( ~13 seconds ) in a AMD FX 8350 @4ghz CPU

so it means that my engine AI it's searching ~100k positions per second, i think it is pretty bad speed
so, anyone have any ideas to improve my nps? or another pruning technique that i should apply?

It's difficult to say without seeing more of your code. My engine is developed in C# using bitboards. It searches the above position to depth 9 in 573ms, but it reduces and prunes heavily.

Code: Select all

go depth 9
info depth 1 seldepth 7 time 41 nodes 329 score cp 274 nps 8014 pv e2a6
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 2 seldepth 10 time 45 nodes 2701 score cp 179 nps 60071 pv e2a6 b4c3
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 3 seldepth 15 time 50 nodes 7532 score cp 191 nps 149916 pv e2a6 b4c3 d2c3
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 4 seldepth 15 time 59 nodes 16756 score cp 179 nps 283216 pv e2a6 b4c3 d2c3 e6d5
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 5 seldepth 16 time 97 nodes 90036 score cp 42 nps 929553 pv c3b5 f6d5 d2g5 e7g5 f3f7
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 6 seldepth 19 time 156 nodes 197933 score cp -52 nps 1268081 pv e2a6 e6d5 c3d5 f6d5 e5f7 e7f7
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 7 seldepth 18 time 253 nodes 379474 score cp -8 nps 1498143 pv e2a6 e6d5 c3b5 e7e5 d2f4 e5b2 b5c7
info hashfull 0 currmove f3e3 currmovenumber 48
info depth 8 seldepth 21 time 328 nodes 508946 score cp -8 nps 1551662 pv e2a6 e6d5 c3b5 e7e5 d2f4 e5b2 b5c7 e8f8
info hashfull 0 currmove e1c1 currmovenumber 48
info depth 9 seldepth 21 time 573 nodes 964299 score cp 4 nps 1681850 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5 f2f4 e5d6 c3b5
info hashfull 1 currmove e1c1 currmovenumber 48
bestmove d5e6

The search speed ramps up to about 3 million NPS after 1.5 secs.

I found no improvement in complementing bitboards with a mailbox array for a GetPiece(int Square) method. If statements that mask piece bitboards with Board.SquareMasks[Square] until it finds one > 0 performs fine. Lack of a mailbox array for determining the piece on a given square definitely is not the root cause of your engine's slow search.

Most likely the root problem is a) memory allocations your engine does in search but not in perft or b) your search algorithm. Don't allocate any memory on the heap during search. Pre-allocate all data structures at engine startup. Could be bad move ordering. How does your engine's search metrics compare to MadChess' metrics? I suggest you measure these metrics. Enable them when the "debug on" UCI command is sent.

Code: Select all

info depth 9 seldepth 21 time 576 nodes 964299 score cp 4 nps 1672678 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5 f2f4 e5d6 c3b5
info hashfull 1 currmove e1c1 currmovenumber 48
info string Cache Hit = 20.71% Score Cutoff = 21.16% Best Move Hit = 26.32% Invalid Best Moves = 0
info string Null Move Cutoffs = 51.40% Beta Cutoff Move Number = 1.26 Beta Cutoff First Move = 90.68%
info string Evals = 171,825
info string Stopping search at 577 milliseconds.
bestmove d5e6

pedrojdm2021 · Post by **pedrojdm2021** » Thu Jul 01, 2021 6:01 am

emadsen wrote: ↑Thu Jul 01, 2021 1:27 am
pedrojdm2021 wrote: ↑Wed Jun 30, 2021 6:49 am For example in Kiwipete position: (r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -) it does a search of depth 9 with this results:

Nodes searched: 1286811
Search time: 12596ms ( ~13 seconds ) in a AMD FX 8350 @4ghz CPU

so it means that my engine AI it's searching ~100k positions per second, i think it is pretty bad speed
so, anyone have any ideas to improve my nps? or another pruning technique that i should apply?
It's difficult to say without seeing more of your code. My engine is developed in C# using bitboards. It searches the above position to depth 9 in 573ms, but it reduces and prunes heavily.
Code: Select all
go depth 9
info depth 1 seldepth 7 time 41 nodes 329 score cp 274 nps 8014 pv e2a6
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 2 seldepth 10 time 45 nodes 2701 score cp 179 nps 60071 pv e2a6 b4c3
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 3 seldepth 15 time 50 nodes 7532 score cp 191 nps 149916 pv e2a6 b4c3 d2c3
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 4 seldepth 15 time 59 nodes 16756 score cp 179 nps 283216 pv e2a6 b4c3 d2c3 e6d5
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 5 seldepth 16 time 97 nodes 90036 score cp 42 nps 929553 pv c3b5 f6d5 d2g5 e7g5 f3f7
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 6 seldepth 19 time 156 nodes 197933 score cp -52 nps 1268081 pv e2a6 e6d5 c3d5 f6d5 e5f7 e7f7
info hashfull 0 currmove e5c6 currmovenumber 48
info depth 7 seldepth 18 time 253 nodes 379474 score cp -8 nps 1498143 pv e2a6 e6d5 c3b5 e7e5 d2f4 e5b2 b5c7
info hashfull 0 currmove f3e3 currmovenumber 48
info depth 8 seldepth 21 time 328 nodes 508946 score cp -8 nps 1551662 pv e2a6 e6d5 c3b5 e7e5 d2f4 e5b2 b5c7 e8f8
info hashfull 0 currmove e1c1 currmovenumber 48
info depth 9 seldepth 21 time 573 nodes 964299 score cp 4 nps 1681850 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5 f2f4 e5d6 c3b5
info hashfull 1 currmove e1c1 currmovenumber 48
bestmove d5e6
The search speed ramps up to about 3 million NPS after 1.5 secs.

I found no improvement in complementing bitboards with a mailbox array for a GetPiece(int Square) method. If statements that mask piece bitboards with Board.SquareMasks[Square] until it finds one > 0 performs fine. Lack of a mailbox array for determining the piece on a given square definitely is not the root cause of your engine's slow search.

Most likely the root problem is a) memory allocations your engine does in search but not in perft or b) your search algorithm. Don't allocate any memory on the heap during search. Pre-allocate all data structures at engine startup. Could be bad move ordering. How does your engine's search metrics compare to MadChess' metrics? I suggest you measure these metrics. Enable them when the "debug on" UCI command is sent.
Code: Select all
info depth 9 seldepth 21 time 576 nodes 964299 score cp 4 nps 1672678 pv d5e6 e7e6 e2a6 h3g2 f3g2 e6e5 f2f4 e5d6 c3b5
info hashfull 1 currmove e1c1 currmovenumber 48
info string Cache Hit = 20.71% Score Cutoff = 21.16% Best Move Hit = 26.32% Invalid Best Moves = 0
info string Null Move Cutoffs = 51.40% Beta Cutoff Move Number = 1.26 Beta Cutoff First Move = 90.68%
info string Evals = 171,825
info string Stopping search at 577 milliseconds.
bestmove d5e6

I just implemented that mailbox for getting the piece in square, and i gained about ~100ms in performance in the same perft as in the post, at least it's something

The thing is that my engine is not a uci engine, it's meant to play inside the unity application that i'm developing. so there is 0 UCI engine features, at least with communitcation with the console, but i do have a console application project for testing and debugging the engine itself, so i have a little command list to do search and so on..

a) memory allocations your engine does in search

i don't know if i am doing that.... but visual studio in the profiler it points to I/O operations to be the cause of the issue...

for example my negamax function looks like this (is quite long, but it's because of it's documentation and features added into it)

Code: Select all

        // Fail hard alpha-beta negamax implementation
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private int Negamax(int _alpha, int _beta, int _depth, bool _makeNull = true)
        {
            // check if the search is cancelled every 4096 nodes
            if ((searched_nodes & 4096) == 0 && AbortSearch)
            {
                return 0;
            }

            // -------- [ Transposition table lookup ] ----------------------------

            TTEntry entry = TTable.Lookup(board.hash);
            if (entry.hash == board.hash && entry.depth >= _depth)
            {
                if (entry.eval < -MATE_SCORE) entry.eval += ply;
                if (entry.eval > MATE_SCORE) entry.eval -= ply;

                if (entry.flag == TTFlag.EXACT) 
                {
                    return entry.eval;
                }
                else if (entry.flag == TTFlag.ALPHA && entry.eval <= _alpha) 
                    return _alpha;
                else if (entry.flag == TTFlag.BETA && entry.eval >= _beta)
                    return _beta;
            }
            TTFlag flag = TTFlag.ALPHA;
            
            // ----------------------------------------------------------------------

            // Quiescense search
            if (_depth <= 0)
            {
                return Quiescense(_alpha,_beta);
            }

            // increment node searched count in the current search (useful for debugging)
            searched_nodes++;

            // if we have found a repeated position during the current search path, avoid it
            if (ply > 0 && _makeNull && IsRepetition())
            {
                return 0;
            }

            // is the player to move in check?
            bool in_check = board.InCheck(board.sideToMove);

            // -------------- [ Null move pruning  ] ------------------------------

            if (_makeNull && !in_check && _depth >= 3 && ply > 0)
            {
                MakeNullMove();
                int val = -Negamax(-_beta, -_beta + 1, _depth -3, !_makeNull);
                UnMakeNullMove();

                if (val >= _beta) 
                    return _beta;
            }

            // -------------- [ Futility Pruning Initialization  ] ------------------

            // Futility Pruning flag
            bool f_prune = false;
            // Main condition to consider Futility Pruning
            if (!in_check && _makeNull && Math.Abs(_alpha) < MATE_SCORE && _depth == 2)
            {
                f_prune = (Evaluate() + AISettings.mg_material_score[ChessPiece.Rook]) < _alpha;
            }

            // -----------------------------------------------------------------------

            // have we found our principal variation? (best move / path)
            bool found_pv = false;
            // cache score
            int score = 0;
            // how many legal moves we have so far
            byte legal_moves = 0;
            // how many moves have we searched so far
            byte moves_searched = 0;

            // movements & move index in the current node
            List<int> moves = board.GenerateMoves();
            byte moveIndex = 0;

            // -------------- [ Search iteration ] ------------------------------------

            for(moveIndex = 0; moveIndex < moves.Count; moveIndex++)
            {
                OrderMoves(moveIndex,ref moves);

                // -------------- [ Futility Pruning ] --------------------------------

                if (f_prune && moves_searched > 2 && MoveUtility.Get_flag_capture(moves[moveIndex]) == 0 
                    && MoveUtility.Get_flag_castling(moves[moveIndex]) == 0 && MoveUtility.Get_promotion(moves[moveIndex]) == 0)
                    {
                        if(board.MakeMove(moves[moveIndex]))
                        {
                            if (!board.InCheck(board.sideToMove))
                            {
                                board.UnMakeMove(moves[moveIndex]);
                                return Evaluate();
                            }
                            board.UnMakeMove(moves[moveIndex]);
                        }
                    }
                
                if (board.MakeMove(moves[moveIndex]))
                {
                    ply++; 
                    repetition_index++;
                    repetitions_table[repetition_index] = board.hash;

                    // -------------- [ Principal Variation Search ] ------------------

                    if(found_pv)
                    {
                        // try searching with a fixed window
                        score = -Negamax(-_alpha -1, -_alpha, _depth -1);

                        // if move fails, do a normal search
                        if (score > _alpha && score < _beta)
                            score = -Negamax(-_beta, -_alpha, _depth -1);
                    }else 
                    {
                        // -------------- [ Late Move reductions (LMR) ] ------------------

                        if (moves_searched >= FULL_DEPTH_MOVES && _depth >= REDUCTION_LIMIT && 
                            // LMR validation
                            (MoveUtility.Get_flag_capture(moves[moveIndex]) == 0 && MoveUtility.Get_promotion(moves[moveIndex]) == 0 
                            && !in_check && _depth >= REDUCTION_LIMIT))
                             // fixed window search with a reduced depth   
                            score = -Negamax(-_alpha -1, -_alpha, _depth -2);
                        else 
                            // Hack to ensure that full-depth search is done.
                            score = _alpha + 1;
                        
                        if (score > _alpha)
                        {
                            // try searching with a fixed window
                            score = -Negamax(-_alpha -1, -_alpha, _depth -1);
                            // our fails outside the window, do a normal search
                            if (score > _alpha && score < _beta)
                                score = -Negamax(-_beta, -_alpha, _depth -1);
                        }

                    }

                    // ----------------------------------------------------------------------
                    
                    board.UnMakeMove(moves[moveIndex]);
                    // decrement ply & add legal moves
                    ply--;
                    repetition_index--;
                    legal_moves++;
                    moves_searched++;

                    // fail-hard beta cut-off
                    if (score >= _beta)
                    {
                        killer_moves[1][ply] = killer_moves[0][ply];
                        killer_moves[0][ply] = moves[moveIndex]; 
                        TTable.Store(new TTEntry(TTFlag.BETA, board.hash, _depth , _beta), ply);
                        return _beta;
                    }

                    // found a better move
                    if (score > _alpha)
                    {
                        history_moves[board.sideToMove][MoveUtility.Get_square_from(moves[moveIndex])]
                                                       [MoveUtility.Get_square_to(moves[moveIndex])] += _depth;
                        _alpha = score;
                        pv[ply] = moves[moveIndex];
                        flag = TTFlag.EXACT;
                        found_pv = true;
                    }

                    // -------------- [ Mate Distance Pruning ] ------------------

                    // we already found a checkmate, no need to search more
                    if(_alpha >= MATE_SCORE)
                    {
                        return _alpha;
                    }

                    // we can not be checkmated yet, so its's safe to prune here
                    if (_beta <= (-MATE_VALUE + 2) && !in_check)
                    {
                        return _beta;
                    }

                }
            }
            
            // -------------- [ Finish search ] -------------------------------

            // Legality check
            if (legal_moves == 0)
            {
                if (in_check)
                {
                    // retun mate score (closest distance to mate)
                    return -MATE_VALUE + ply;
                }else 
                {
                    // return stalemate (draw) score
                    return 0;
                }
            }

            TTable.Store(new TTEntry(flag, board.hash, _depth, _alpha), ply);
            // node fails low
            return _alpha;
        }
    }

In there you can see that i am using some local variables for some stuff... like the TTEntry (struct) for transposition tables, booleans for check detection, and so on... each node i get a new list<int> instance from the moveGenerator, things like that, i don't know if that is what you mean with "allocate memory on the heap"

i have something similar for QS and i do use local variables for move ordeing too, i tried to avoid that by moving some of the variables to global variables inside the class for example some variables used in the evaluation (like some dummy bitboard to loop over the pieces), but it broke the search and the engine started giving another moves in some depth levels, so actually i don't know clearly how to avoid declaration of that stuff inside the methods that they are in use

and also i have tried to "reduce" the bit-count from int to encode the moves to ushort to encoding the moves and adding an enum to flag features, so instead of encoding the move in 19 bits, only encoding it in 16 bits, but it started generating issues with the search, and instead of gaining performance in the perft, it did the opposite

so i reverted back these changes and for now i am again with int dataType for movements

....maybe the issue is the move generator?

for generating moves i have something like this:

Code: Select all

 [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public List<int> GenerateMoves()
        {
            moves = new List<int>(64);
            ulong own_pieces = occupancies[sideToMove];
            // Generate king moves
            Append_moves_king(own_pieces);
            // we have double check here. so we can return early
            if (BitUtility.CountBits(GetKingAttackers(sideToMove)) >= 2) return moves;
            // Generate pawn moves
            Append_moves_pawn(own_pieces);
            // Generate knight moves
            Append_moves_kinght(own_pieces);
            // Generate bishop moves
            Append_moves_sliding(own_pieces, ChessPiece.Bishop);
            // Generate rook moves
            Append_moves_sliding(own_pieces, ChessPiece.Rook);
            // Generate queen moves
            Append_moves_sliding(own_pieces, ChessPiece.Queen);
            return moves;
        }

and for each piece is something like this, with variations based on the piece type (this one is for kinghts):

Code: Select all

[MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void Append_moves_kinght(ulong _own_pieces)
        {
            ulong knight_moves = 0ul;
            ulong pieces = bitboards[sideToMove][ChessPiece.Knight];
            byte square_from;
            byte square_to;
            byte capture;
            while (pieces > 0)
            {
                square_from = BitUtility.GetLS1BIndex(pieces);
                knight_moves = ChessData.GetKnightMoves(square_from) & ~ _own_pieces;
                while (knight_moves > 0)
                {
                    square_to = BitUtility.GetLS1BIndex(knight_moves);
                    capture = (byte)(BitUtility.ContainsBit(occupancies[sideToMove ^ 1], square_to) ? 1 : 0);
                    moves.Add(MoveUtility.Encode_move(square_from, square_to , 0, capture,0,0));
                    BitUtility.RemoveLS1B(ref knight_moves);
                }
                BitUtility.RemoveLS1B(ref pieces);
            }
        }

hgm · Post by **hgm** » Thu Jul 01, 2021 10:28 am

This is suspect:

Code: Select all

            if ((searched_nodes & 4096) == 0 && AbortSearch)

Are you sure you don't mean 4095 here?

mvanthoor · Post by **mvanthoor** » Thu Jul 01, 2021 10:32 am

pedrojdm2021 wrote: ↑Thu Jul 01, 2021 6:01 am ....maybe the issue is the move generator?

If you run the program through a profiler, you don't have to guess. The profiler will tell you where the slowest parts of you code are, so you can either fix or rewrite them.

Karlo Bala · Post by **Karlo Bala** » Thu Jul 01, 2021 2:17 pm

pedrojdm2021 wrote: ↑Wed Jun 30, 2021 5:11 pm
So.. do you think having an array of pieces in board will be better? right?

I think it is a good idea to add a mailbox board if you already have a working bitboard engine.
It will cost you less than an hour of work and will simplify your code in many places. There is nothing worse than bloated code without a particular reason.

No4b · Post by **No4b** » Thu Jul 01, 2021 4:18 pm

pedrojdm2021 wrote: ↑Wed Jun 30, 2021 6:49 am Hello, i am really happy with the results that i'm getting so far, but the speed is still not that great

For example in Kiwipete position:
[d]r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -

it does a search of depth 9 with this results:
Nodes searched: 1286811
Search time: 12596ms ( ~13 seconds ) in a AMD FX 8350 @4ghz CPU
so it means that my engine AI it's searching ~100k positions per second, i think it is pretty bad speed

Features that i have so far inside the search:
- Transposition Tables (~20mb hash table of 875000 entries)
- Bitboards with magic bitboard move generator
- Null move pruning
- PVS (principal variation search)
- LMR (Late move reductions)
- Extended Futility Pruning

The engine is developed in C#, because i am making a chess application in unity engine, so i have to keep that language at least for now, maybe in a future it can be a good idea develop a the engine in a .dll plugin and link it into the C# layer, but that is not the case right now.

In the other hand i have a pretty "good" speed in the perft results:
for example in the same Kiwipete position in depth 4 it does the 4085603 nodes in only ~700ms, so clearly the issue comes from the AI search itself.

i have executed the visual studio debugger and it points to this function that have the most heavy cpu use:
Code: Select all
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private int GuessMoveScore(int _move)
        {   
            int score = 0;

            if (MoveUtility.Get_flag_capture(_move) > 0)
            {   
                // apply lookup to the MVV/LVA Table
                byte attacker = board.GetPieceTypeInSquare((byte)MoveUtility.Get_square_from(_move));
                byte victim = board.GetPieceTypeInSquare((byte)MoveUtility.Get_square_to(_move), true);
                score += AISettings.mvv_lva[attacker][victim] + 10000;
            }else 
            {
                // score first killer move
                if (killer_moves[0][ply] == _move)
                    score += 9000;
                // score second killer move
                else if (killer_moves[1][ply] == _move)
                    score += 8000;
                else 
                // score history move   
                    score += history_moves[board.sideToMove][MoveUtility.Get_square_from(_move)][MoveUtility.Get_square_to(_move)];
            }

            // give a little bonus if this is a promotion
            if (MoveUtility.Get_promotion(_move) > 0)
            {
                score += 325;
            }

            return score;
        }
you can see is a pretty simple scoring for MVV/LVA move ordering, the "only" problem is that the moved piece in a move is not begin encoded into the int move, instead i need to search for the piece from the board using the square index.

the search piece in the board function looks like this:
Code: Select all
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public byte GetPieceTypeInSquare(byte _square , bool _adversary = false)
        {
            int player = _adversary ? sideToMove ^ 1 : sideToMove;
            if((1 & (bitboards[player][ChessPiece.Pawn] >> _square)) > 0) return ChessPiece.Pawn; 
            if((1 & (bitboards[player][ChessPiece.Bishop] >> _square)) > 0) return ChessPiece.Bishop; 
            if((1 & (bitboards[player][ChessPiece.Rook] >> _square)) > 0) return ChessPiece.Rook; 
            if((1 & (bitboards[player][ChessPiece.Knight] >> _square)) > 0) return ChessPiece.Knight; 
            if((1 & (bitboards[player][ChessPiece.Queen] >> _square)) > 0) return ChessPiece.Queen; 
            if((1 & (bitboards[player][ChessPiece.King] >> _square)) > 0) return ChessPiece.King; 
            return 0;
        }
in my point of view, i see it ok, i don't know if keeping an array of pieces in squares would be better, but it can slow the move generator as i have tested

the move ordering is done very similar as it's done by the VICE engine:
The move ordering is done inside the search loop in the negamnax, and it calls ' OrderMoves(Moveindex, ref move_list); '
and then that function calls the "GuessMoveScore" function to sort the moves on the list based on the current index.

so, anyone have any ideas to improve my nps? or another pruning technique that i should apply?

Cheers, Pedro.

My chess variant engine is basically use same setup - c# inside Unity,and have a bit of search features (TT + Nullmove).
It has INCREDIBLY inefficient board representation (defined as board [MAX_X][MAX_Y] - to support various board sizes), move generation ( loop each square, giant switch for ~50 different pieces) and evaluation (loop squares 1 by 1, evaluate each piece). Also my MakeMove is a mess to support different pieces abilities etc.
And it is usually reach ~150-200 knps in the chess (middlegame) and 200-300 in the endgame.

My assumption would be to look at:
1) MoveGen, MakeMove, TakeMove.
2) Try not to use ANY complex data structures that Unity provides (fe at first I used Vector3 to store Unit movements, replacement it with own struct of {int x, int y, int z} was like +50 knps.
3) be sure that no data provided is copied, when possible (f.e board class, if you have that).

Although the best way is to use a profiler, as suggested above, as slowdown can really be anywhere (eval, search ...), and/or strip engine from anything but movegen and optimize it first.

spirch · Post by **spirch** » Thu Jul 01, 2021 6:36 pm

this, remove this list

Code: Select all

public List<int> GenerateMoves()
        {
            moves = new List<int>(64);

anytime you do a

Code: Select all

= new

something, remove it or find a way to run it less often

suggestion, go get the 5 day trial of dotmemory, see the flat line? this should be your goal, you want the memory to be dead, not alive

this is a perft 5
https://i.imgur.com/GRyhtH7.png

Pio · Post by **Pio** » Thu Jul 01, 2021 7:42 pm

hgm wrote: ↑Thu Jul 01, 2021 10:28 am This is suspect:
Code: Select all
            if ((searched_nodes & 4096) == 0 && AbortSearch)
Are you sure you don't mean 4095 here?

I agree with hgm. This looks really suspicious. It looks like you for the first 4096 nodes don’t look at AbortSearch, then for the next 4096 nodes you will look at AbortSearch every time and then not look for the next 4096, then look for next 4096…. . On average you will look on AbortSearch in 50 % of the cases. If AbortSearch is a property and does a lot that will be problematic.

I guess you would want to check if searched_nodes % 4096 == 0

How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?

Re: How can i improve my C# engine speed?