I need help, performance wall!

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

CRoberson
Posts: 2056
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: I need help, performance wall!

Post by CRoberson »

aberent wrote:I am not using Principle Variation or Iterative Deepening, Just vanilla Alpha Beta.
Then there are two major performance problems. Progressive Deepening with searching the PV first and assuming you have
transposition tables will give a 2x speed up at least on Time to Ply.
CRoberson
Posts: 2056
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: I need help, performance wall!

Post by CRoberson »

Most PC's are running chips based on superscalar architectures. Such machines have performance problems with conditional branching (if statements). So, make your quiesce routine a completely different routine. That will get some speed and make your code much easier to debug.

Second, don't generate the next set of moves just before searching the move you just made.

Code: Select all

    Search()
    {
        if &#40;depth <= 0&#41;
            return&#40;Qsearch&#40;));
       
       test for 3 fold rep
       test for transposition table hit and return
       null move test and possible return
       moves = generate_moves&#40;)
       for ( each move&#41;
       &#123;
             make move
             value = -Search&#40;depth -1&#41;
             if &#40;value > alpha&#41;
             &#123;
                  alpha = value;
                 if &#40;value >= beta&#41;
                     return&#40;beta&#41;;
                  Store PV
             &#125;
      &#125;
      Store best move in transposition tables
      return&#40;alpha&#41;
    &#125;

   Qsearch&#40;)
   &#123;
        // stand Pat score
        v = PositionEvaluation&#40;);
        if &#40;v>alpha&#41;
          alpha = v;
        if &#40;alpha>=beta&#41;
            return&#40;beta&#41;;

         Generate Captures
         for ( each capturing move&#41;
         &#123;
              make move
              v = -Qsearch&#40;)
             if &#40;v>alpha&#41;
                  alpha =v;
             if &#40;v>beta&#41;
                 return&#40;beta&#41;;
         &#125;
        return&#40;alpha&#41;;
   &#125;
Notice after making a move you then call search (don't gen moves).

In the Qsearch, same concept about move gen.

Now, prove to me you understand that algorithm. How does the stand pat score get returned? Also, explain the efficiency in when I gen moves in the Qsearch vs how you do it. Why is what I typed better?

When you can answer those questions, you have a chance at improving your program.
aberent

Re: I need help, performance wall!

Post by aberent »

nevatre wrote:Adam, this doesn't look correct
"if (alpha > examineBoard.Score)
alpha = examineBoard.Score; "
Did I reverse the operator <>?
aberent

Re: I need help, performance wall!

Post by aberent »

Harald wrote:I don't know C# and your ode looks like some ideas shuffled and written down
but I will make some notes. May be I don't see your main problems. However.
aberent wrote:In case someone has the time, here is my AlphaBeta. This is the second version that makes one move at a time, maybe you guys can see a bug?

Code: Select all

internal static int AlphaBetaFast&#40;Board examineBoard, byte depth, int alpha, int beta, ChessPieceColor movingSide, ref int nodesSearched, bool quiescence, bool extended&#41;
//// May be you copy too much boards. Try to use only one copy per call
//// or use a reference to ONE updated board. make_move&#40;)/undo_move&#40;)
        &#123;
            nodesSearched++;

            if &#40;depth == 0&#41;
//// Use a second function for quiescence. Don't mix it with your main alpha-beta search.
            &#123;
                //If last move was a capture
                if &#40;examineBoard.LastMove.TakenPiece.PieceType != ChessPieceType.None && quiescence == false&#41;
                &#123;
                    //Perform a Quiessence Search
//// I don't understand the connection between last-move-capture and qsearch.
                    return AlphaBetaFast&#40;examineBoard, 6, alpha, beta, movingSide, ref nodesSearched, true, true&#41;;
//// Why is depth 6? Normally depth <= 0 is qsearch indicator and we use
//// depth = 0 or depth - 1 in the calls inside qsearch.
//// Again&#58; dont mix search and qsearch&#58; for better understanding.
                &#125;
                if &#40;extended == false&#41;
//// What is this extension?
                &#123;
                    if (&#40;examineBoard.BlackCheck || examineBoard.WhiteCheck&#41; && examineBoard.EndGamePhase&#41;
//// If you want to extend checks then only if you are in check
//// &#40;or only when you are giving check&#41; but not both
                    &#123;
                        return AlphaBetaFast&#40;examineBoard, 3, alpha, beta, movingSide, ref nodesSearched, false, true&#41;;
//// A typical extension would use depth + 1 and not 3.
                    &#125;
                &#125;

                //Evaluate Score
                Evaluation.EvaluateBoardScore&#40;examineBoard&#41;;
                //Invert Score to support Negamax
                examineBoard.Score = AbsoluteScore&#40;examineBoard.Score, GetOppositeColor&#40;movingSide&#41;);
//// AbsoluteScore is misleading. It sounds like abs&#40;). Perhaps you mean
//// sideToMoveDependentScore when your evaluate function always 
//// sees score > 0 as good for white?

                return examineBoard.Score;
            &#125;

            if &#40;examineBoard.FiftyMove >= 50 || examineBoard.RepeatedMove >= 3 || examineBoard.StaleMate&#41;
                return 0;
//// Perhaps move50 and repetition should be done first in every node.
//// How do you know about stalemate without building a move list?
//// Where do you find out that you are mated and where is the mating score?

            //Stand Pad
//// Stand pat.
            if &#40;quiescence&#41;
            &#123;
//// Again why quiescence flag instead of depth <= 0?
                //Evaluate Score
                Evaluation.EvaluateBoardScore&#40;examineBoard&#41;;
                //Invert Score to support Negamax
                examineBoard.Score = AbsoluteScore&#40;examineBoard.Score, GetOppositeColor&#40;movingSide&#41;);

                if &#40;examineBoard.Score >= beta&#41;
                    return beta;

                if &#40;alpha > examineBoard.Score&#41;
                    alpha = examineBoard.Score;
//// Where do you collect the best move so far?
            &#125;

            List<Position> positions = EvaluateMoves&#40;movingSide, examineBoard, quiescence&#41;;
//// Assuming this is the move generator.
//// Why a list of positions instead a list of moves?
            if &#40;positions.Count == 0&#41;
            &#123;
                //Evaluate Score
                Evaluation.EvaluateBoardScore&#40;examineBoard&#41;;
                //Invert Score to support Negamax
                examineBoard.Score = AbsoluteScore&#40;examineBoard.Score, GetOppositeColor&#40;movingSide&#41;);
//// See above.
                return examineBoard.Score;
//// Is this your place to recognise mates or stalemates?
            &#125;

            positions.Sort&#40;Sort&#41;;

            depth--;
//// The usual approach is not to change depth here but in the call below.

            foreach &#40;Position move in positions&#41;
            &#123;
                //Make a copy
                Board board = examineBoard.FastCopy&#40;);
//// Is this the only board copy or just another one?
//// See comment above.

                //Move Piece
                Board.MovePiece&#40;board, move.SrcPosition, move.DstPosition&#41;;
//// Once you do promotion you will need a promotion piece also.

                //We Generate Valid Moves for Board
                PieceValidMoves.GenerateValidMoves&#40;board&#41;;
//// Ups! Why? This is not the typical place for move generation.
//// See comment on move generator above.
                
                int value = -AlphaBetaFast&#40;board, depth, -beta, -alpha, GetOppositeColor&#40;movingSide&#41;, ref nodesSearched,
                                       quiescence, extended&#41;;
//// I would expect depth-1 in the call.

                if &#40;value >= beta&#41;
                &#123;
                    return beta;
                &#125;
                if &#40;value > alpha&#41;
                    alpha = value;
//// Keep track on the best move.
            &#125;
//// Here would be a good place to recognise mate.
//// &#40;No best move found and in check&#41;.
//// Here you can store something in a hash table when you implement one later.
            return alpha;
        &#125;
I am sure I found only some obvious problems. When I guessed wrong
that is an indication for a strange way to do something in your code.
Please try to react to some of my comments and in an improved version
we may find more and more serious errors. Have fun. :-)

Harald

Thank you Harald for giving me your comments,

Here are some answers to your questions:

I will sperate the quiescence search as you suggested. Also I only call quiescence when the last move was a capture. Is that wrong? Should I always be ending alpha beta with quiescence. Also I limit quiescence to 6 ply, is that bad, should I just let it go until there are no more captures?

I will also remove the extension for now to keep things easier.

My Eval Function sees >0 As good for White.

The Absolute Score Method that reverses the score looks like this:

Code: Select all

private static int AbsoluteScore&#40;int Score, ChessPieceColor color&#41;
        &#123;
            if &#40;color == ChessPieceColor.White&#41;
                return -Score;

            return Score;
        &#125;
There is no mate logic yet. I am not sure how to implement it without examining all of the moves ahead of time, which we don't want as far as I understand?

The best move so far is handled outside of this method.

I only copy the board once in:

Code: Select all

Board board = examineBoard.FastCopy&#40;); 
The following code generates all valid moves for all pieces after the move is made. Is that bad?

Code: Select all

PieceValidMoves.GenerateValidMoves&#40;board&#41;;
aberent

Re: I need help, performance wall!

Post by aberent »

hgm wrote:
aberent wrote:Ok this sounds promising but I am not sure what you mean. What do you mean by "search the PV first"
If at the 3-ply iteration you find the PV = e2e4 - e7e5 - g1f3, search for the 4-ply iteration first the branch that starts with e2e4 - e7e5 - g1f3. I.e. at ply=0 (root) search e2e4 first, then, in the ply=1 node after e2e4 search e7e5 first, etc. At ply=3 you are beyond the end of your old PV, so you try the best capture first, or the moves in random order. (There is only one more ply to go, and the window is still fully open, so this is good enough.)
Thats a great idea, I just learned something new, thanks
aberent

Re: I need help, performance wall!

Post by aberent »

CRoberson wrote:Most PC's are running chips based on superscalar architectures. Such machines have performance problems with conditional branching (if statements). So, make your quiesce routine a completely different routine. That will get some speed and make your code much easier to debug.

Second, don't generate the next set of moves just before searching the move you just made.

Code: Select all

    Search&#40;)
    &#123;
        if &#40;depth <= 0&#41;
            return&#40;Qsearch&#40;));
       
       test for 3 fold rep
       test for transposition table hit and return
       null move test and possible return
       moves = generate_moves&#40;)
       for ( each move&#41;
       &#123;
             make move
             value = -Search&#40;depth -1&#41;
             if &#40;value > alpha&#41;
             &#123;
                  alpha = value;
                 if &#40;value >= beta&#41;
                     return&#40;beta&#41;;
                  Store PV
             &#125;
      &#125;
      Store best move in transposition tables
      return&#40;alpha&#41;
    &#125;

   Qsearch&#40;)
   &#123;
        // stand Pat score
        v = PositionEvaluation&#40;);
        if &#40;v>alpha&#41;
          alpha = v;
        if &#40;alpha>=beta&#41;
            return&#40;beta&#41;;

         Generate Captures
         for ( each capturing move&#41;
         &#123;
              make move
              v = -Qsearch&#40;)
             if &#40;v>alpha&#41;
                  alpha =v;
             if &#40;v>beta&#41;
                 return&#40;beta&#41;;
         &#125;
        return&#40;alpha&#41;;
   &#125;
Notice after making a move you then call search (don't gen moves).

In the Qsearch, same concept about move gen.

Now, prove to me you understand that algorithm. How does the stand pat score get returned? Also, explain the efficiency in when I gen moves in the Qsearch vs how you do it. Why is what I typed better?

When you can answer those questions, you have a chance at improving your program.
Ok so I seperated the Qsearch from regular Alpha Beta. That gave me a small improvement 2%. Better than nothing :)

I think I get it (maybe) you don't call Generate Moves before you enter a deeper depth because you might be at depth 0 in the next move? In that case you just evaluate the position and exit without having to Generate Moves?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: I need help, performance wall!

Post by bob »

aberent wrote:
hgm wrote:
aberent wrote:I am not using Principle Variation or Iterative Deepening, Just vanilla Alpha Beta.
You can still use vanilla Alpha Beta and search the PV first. Doing depth-first without having any idea what the PV is is quite suicidal, and probably fully explains all your problems.
Ok this sounds promising but I am not sure what you mean. What do you mean by "search the PV first"
Iterative deepening. You do a 1 ply search, then a 2 ply search, but on the 2 ply search you search the best move from the one ply search first. Then you do a 3 ply search, and search the best moves at ply=1 and 2 from the two ply search first...
tvrzsky
Posts: 128
Joined: Sat Sep 23, 2006 7:10 pm
Location: Prague

Re: I need help, performance wall!

Post by tvrzsky »

Well, please note that I know almost nothing about C# and its memory management but I am teriffied by the amount of new keywords in the code. If this means memory allocation gets executed dozens of times in every node then I am afraid it could be cause of huge slowdown of your search. Is not there any way to avoid this?
Second thing I have noticed is that you are traversing the whole board array in several places only searching for exisiting pieces. Consider implementation of piece list which could save this useless work. Also I doubt usefullness of precomputed moves for sliding pieces. I think that to compute the next square when sliding on the fly should be more efficient. And do not use two-dimensional array for the board.
CRoberson
Posts: 2056
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: I need help, performance wall!

Post by CRoberson »

aberent wrote:
CRoberson wrote:Most PC's are running chips based on superscalar architectures. Such machines have performance problems with conditional branching (if statements). So, make your quiesce routine a completely different routine. That will get some speed and make your code much easier to debug.

Second, don't generate the next set of moves just before searching the move you just made.

.......

Notice after making a move you then call search (don't gen moves).

In the Qsearch, same concept about move gen.

Now, prove to me you understand that algorithm. How does the stand pat score get returned? Also, explain the efficiency in when I gen moves in the Qsearch vs how you do it. Why is what I typed better?

When you can answer those questions, you have a chance at improving your program.
Ok so I seperated the Qsearch from regular Alpha Beta. That gave me a small improvement 2%. Better than nothing :)

I think I get it (maybe) you don't call Generate Moves before you enter a deeper depth because you might be at depth 0 in the next move? In that case you just evaluate the position and exit without having to Generate Moves?
No, you don't get it fully. There is much more to it than that. But, lets go with what you have figured out. Now, what is the impact of
not generating moves on the last ply? Think about the branch factor. How many nodes are in the last ply relative to the rest of the tree? Also, how many nodes are in the last ply relative to the previous ply?
Dann Corbit
Posts: 12541
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: I need help, performance wall!

Post by Dann Corbit »

CRoberson wrote:
aberent wrote:
CRoberson wrote:Most PC's are running chips based on superscalar architectures. Such machines have performance problems with conditional branching (if statements). So, make your quiesce routine a completely different routine. That will get some speed and make your code much easier to debug.

Second, don't generate the next set of moves just before searching the move you just made.

.......

Notice after making a move you then call search (don't gen moves).

In the Qsearch, same concept about move gen.

Now, prove to me you understand that algorithm. How does the stand pat score get returned? Also, explain the efficiency in when I gen moves in the Qsearch vs how you do it. Why is what I typed better?

When you can answer those questions, you have a chance at improving your program.
Ok so I seperated the Qsearch from regular Alpha Beta. That gave me a small improvement 2%. Better than nothing :)

I think I get it (maybe) you don't call Generate Moves before you enter a deeper depth because you might be at depth 0 in the next move? In that case you just evaluate the position and exit without having to Generate Moves?
No, you don't get it fully. There is much more to it than that. But, lets go with what you have figured out. Now, what is the impact of
not generating moves on the last ply? Think about the branch factor. How many nodes are in the last ply relative to the rest of the tree? Also, how many nodes are in the last ply relative to the previous ply?
Secondary hint:
This is also why a profile shows gobs of {unnecessary} time in movegen.