reducing the number of evaluated positions in quiescence sea

flok · Post by **flok** » Fri Dec 25, 2015 11:08 am

Hi,

My program searches at depth 6 around 120k nodes in quiescence search (and 95k regular nodes).
That's a bit much (this has been noted here before).
It was suggested to, apart from only checking promotions and catch-moves, to only check the catch moves that give material gain.
So what I now do is:

Code: Select all

int qs(alpha, beta, etc) {
   int bestVal = evaluate();

   int minimumScoreGain = alpha - bestVal;

   foreach(moves) {
       if (victim.value > minimumScoreGain || promotion) {
          // call qs etc
       }
   }
}

...where victim.value is the score-value of a piece (e.g. pawn 100).

For depth 7 this code searches around 2 million nodes where the original code (checking all catch-moves) also searches around 2 million nodes.

What is the bug here?

ZirconiumX · Post by **ZirconiumX** » Fri Dec 25, 2015 12:03 pm

flok wrote:Hi,

My program searches at depth 6 around 120k nodes in quiescence search (and 95k regular nodes).
That's a bit much (this has been noted here before).
It was suggested to, apart from only checking promotions and catch-moves, to only check the catch moves that give material gain.
So what I now do is:
Code: Select all
int qs(alpha, beta, etc) {
   int bestVal = evaluate();

   int minimumScoreGain = alpha - bestVal;

   foreach(moves) {
       if (victim.value > minimumScoreGain || promotion) {
          // call qs etc
       }
   }
}
...where victim.value is the score-value of a piece (e.g. pawn 100).

For depth 7 this code searches around 2 million nodes where the original code (checking all catch-moves) also searches around 2 million nodes.

What is the bug here?

Are you doing QS stand pat? Not doing captures with SEE < 0?

Also, this code resembles delta pruning, so look at the forum posts for more information.

Matthew:out

Sven · Post by **Sven** » Fri Dec 25, 2015 2:23 pm

flok wrote:Hi,

My program searches at depth 6 around 120k nodes in quiescence search (and 95k regular nodes).
That's a bit much (this has been noted here before).
It was suggested to, apart from only checking promotions and catch-moves, to only check the catch moves that give material gain.
So what I now do is:
Code: Select all
int qs(alpha, beta, etc) {
   int bestVal = evaluate();

   int minimumScoreGain = alpha - bestVal;

   foreach(moves) {
       if (victim.value > minimumScoreGain || promotion) {
          // call qs etc
       }
   }
}
...where victim.value is the score-value of a piece (e.g. pawn 100).

For depth 7 this code searches around 2 million nodes where the original code (checking all catch-moves) also searches around 2 million nodes.

What is the bug here?

As Matthew already pointed out, I think one of the bugs is that you simply do not use the stand-pat principle, which you will usually need for QS to be efficient. Also the alpha-beta window should be narrowed afterwards:

Code: Select all

int qs(alpha, beta, etc) {
   int bestVal = evaluate();
   if (bestVal >= beta) return beta; // assuming fail-hard AB
   if (bestVal > alpha) alpha = bestVal;
   ...
}

Second issue would be your "gain" calculation. There you are missing the typical "safety margin", for instance a margin of 200 centipawns, resulting from possibly improving the positional score with the capture (e.g. by removing a passed pawn or a dangerous piece attacking your king). But that should not be responsible for searching too many nodes, in fact you would skip more captures than necessary. So I guess the first point above might be the one that raises your node count.

One question about your node counting: which nodes do you count as QS nodes? You wrote "120k nodes in quiescence search (and 95k regular nodes)". Typical numbers are 80% (or a bit more) for the percentage of QS nodes among all nodes, where leaf nodes of full-width search (depth=0) would count as QS nodes since you start doing QS there.

flok · Post by **flok** » Fri Dec 25, 2015 9:56 pm

ZirconiumX wrote:Are you doing QS stand pat?

Yes!

Not doing captures with SEE < 0?

I have not implemented SEE so no.

Also, this code resembles delta pruning, so look at the forum posts for more information.

Matthew:out

First implementation does not make any difference. I'll post all the code in an other reply.

flok · Post by **flok** » Fri Dec 25, 2015 10:10 pm

Sven Schüle wrote:As Matthew already pointed out, I think one of the bugs is that you simply do not use the stand-pat principle, which you will usually need for QS to be efficient.

I do actually! You may recognize this code a bit from an earlier chess program of mine which was written in Java (this is c++):

Code: Select all

int Brain::quiesceSearch(search_data_t *const sd, const int currentSearchDistance, const PlayerColor c, int alpha, int beta, const zobrist_t & zh, std::atomic_bool *const stopFlag, const Move & siblingPrevBestMove, Move * const selMove, MoveList *const pv)
{
        sd -> cs -> qcnt++;

        *selMove = invalidMove;

        const int ttDepth = depth_sanity_limit - currentSearchDistance;

        int alphaOrig = alpha;

        Move ttMove = invalidMove;
        int temp = 0;
        if (evalTT(tt, ttDepth, zh.newHash, &alpha, &beta, &temp, &ttMove))
        {
                sd -> cs -> qtpt_hit_cnt++;
                return temp;
        }
        else if (evalTT(qtt, ttDepth, zh.newHash, &alpha, &beta, &temp, &ttMove))
        {
                sd -> cs -> qqtpt_hit_cnt++;
                return temp;
        }

        sd -> s -> generateMovelists(c, true);

        bool abort = false;
        int bestVal = evaluate(sd -> s, c, currentSearchDistance, &abort);
        if (bestVal > alpha)
        {
                alpha = bestVal;

                if (bestVal >= beta)
                        return bestVal;
        }

        if (abort || currentSearchDistance > MAX_QUIESCENCE_DEPTH)
        {
                if (!abort)
                        sd -> cs -> q_limit++;

                return bestVal;
        }
        MoveList *const moves = sd -> s -> getMoveList(c);
        const size_t nMoves = moves -> size();

        const Board *const b = sd -> s -> getBoard();

        std::vector<Move> putInFront;

        Move pvMove = invalidMove;
        if (sd -> oldPv -> size() >= currentSearchDistance)
        {
                const size_t nr = sd -> oldPv -> size() - currentSearchDistance;
                pvMove = sd -> oldPv -> atReference(nr);
                putInFront.push_back(pvMove);
        }

        if (isValidMove(ttMove) && moveCompareTo(ttMove, pvMove) == false)
                putInFront.push_back(ttMove);

        if (isValidMove(siblingPrevBestMove) && moveCompareTo(ttMove, siblingPrevBestMove) == false && moveCompareTo(pvMove, siblingPrevBestMove) == false)
                putInFront.push_back(siblingPrevBestMove);

        size_t sortStartOffset = reorderAndSortList(moves, putInFront);

        moves -> sort(moveSortVictimType, (void *)b, sortStartOffset);

        const int newSearchDistance = currentSearchDistance + 1;

        const PlayerColor co = opponentColor(c);

        Move curBestPrevMove = invalidMove, qsSelMove = invalidMove, mDummy = invalidMove;

        MoveList newPv, tempPv;

        const int minimumScoreGain = alpha - bestVal + ChessPiece::evalVal[PAWN];

        for(size_t idx=0; idx<nMoves && !*stopFlag; idx++)
        {
                const Move & m = moves -> at(idx);

                // only promotions to queen and knight are in the movelist
                if (m.materialPoints > minimumScoreGain || m.promoteTo != NONE)
                {
                        const Move *epMove = isEpMove(b, m) ? &m : NULL;

                        const BoardState bs = sd -> s -> doMove(c, m);
                        zobrist_t newHash = updateZobrist(zh, bs, b, co, epMove);

                        int score = 0;
#ifdef USE_TT
                        if (!addToHistory(sd -> history, newHash.newHash))
#endif
                        {

                                score = -quiesceSearch(sd, newSearchDistance, co, -beta, -alpha, newHash, stopFlag, curBestPrevMove, &mDummy, &tempPv);
                        }

                        removeFromToHistory(sd -> history, newHash.newHash);

                        sd -> s -> undoMove();

                        if (score > bestVal)
                        {
                                bestVal = score;
                                curBestPrevMove = qsSelMove;
                                *selMove = m;

                                if (score > alpha)
                                {
                                        alpha = score;

                                        newPv.assign(tempPv, false);

                                        if (score >= beta)
                                                break;
                                }
                        }
                }
        }

        if (isValidMove(*selMove))
        {
                if (alpha > alphaOrig && !*stopFlag)
                        updateTT(qtt, zh.newHash, -ttDepth, alphaOrig, beta, bestVal, *selMove);

                newPv.push_back(*selMove);
                pv -> assign(newPv, false);
        }

        return bestVal;
}

Second issue would be your "gain" calculation. There you are missing the typical "safety margin", for instance a margin of 200 centipawns, resulting from possibly improving the positional score with the capture (e.g. by removing a passed pawn or a dangerous piece attacking your king).

Ok not 200 centipawns but 1 pawn in my case.

But that should not be responsible for searching too many nodes, in fact you would skip more captures than necessary. One question about your node counting: which nodes do you count as QS nodes? You wrote "120k nodes in quiescence search (and 95k regular nodes)". Typical numbers are 80% (or a bit more) for the percentage of QS nodes among all nodes, where leaf nodes of full-width search (depth=0) would count as QS nodes since you start doing QS there.

QS nodes are all the ones in my ::quiesceSearch method. I had indeed a bug in search() (the regular search) which also counted the node for regular-search while in reality it would jump straight to qs.
So now it is (one test run, depth 7):
qs: 844398, regular: 129128

Sven · Post by **Sven** » Fri Dec 25, 2015 10:46 pm

flok wrote:
Sven Schüle wrote:As Matthew already pointed out, I think one of the bugs is that you simply do not use the stand-pat principle, which you will usually need for QS to be efficient.
I do actually! You may recognize this code a bit from an earlier chess program of mine which was written in Java (this is c++):
Code: Select all
int Brain::quiesceSearch(search_data_t *const sd, const int currentSearchDistance, const PlayerColor c, int alpha, int beta, const zobrist_t & zh, std::atomic_bool *const stopFlag, const Move & siblingPrevBestMove, Move * const selMove, MoveList *const pv)
{
...
}

Yes, I remember! After getting this new information I suspect a move ordering problem. You put something special to the front of the move list. What is that "siblingPrevBestMove"? At a first glance it does not look obvious to me, and its handling looks a bit suspicious. You pass it as an input parameter so you seem to know in advance "somehow" that it might be a good move that should be sorted to the top of the movelist, but how do you know? The variable name suggests that this move has been the best reply (or a cutoff reply) at a sibling of the current node. But is that really a good idea in qsearch? Would you consider to check whether disabling this piece of move ordering code may help?

I would also cross-check this part:

Code: Select all

        size_t sortStartOffset = reorderAndSortList(moves, putInFront);

        moves -> sort(moveSortVictimType, (void *)b, sortStartOffset);

as well as the PV handling. Regarding the latter, how do you organize your PVs? One per node, or only one global PV (which is usually incorrect in some cases, as has been discussed here a while ago)?

EDIT: Don't get me wrong ... but I tend to say that this qsearch implementation is a bit too complex and hard to read in general, even though there are many things in it that actually make sense.

Sven · Post by **Sven** » Fri Dec 25, 2015 11:06 pm

Sven Schüle wrote:
flok wrote:
Sven Schüle wrote:As Matthew already pointed out, I think one of the bugs is that you simply do not use the stand-pat principle, which you will usually need for QS to be efficient.
I do actually! You may recognize this code a bit from an earlier chess program of mine which was written in Java (this is c++):
Code: Select all
int Brain::quiesceSearch(search_data_t *const sd, const int currentSearchDistance, const PlayerColor c, int alpha, int beta, const zobrist_t & zh, std::atomic_bool *const stopFlag, const Move & siblingPrevBestMove, Move * const selMove, MoveList *const pv)
{
...
}
Yes, I remember! After getting this new information I suspect a move ordering problem. You put something special to the front of the move list. What is that "siblingPrevBestMove"? At a first glance it does not look obvious to me, and its handling looks a bit suspicious. You pass it as an input parameter so you seem to know in advance "somehow" that it might be a good move that should be sorted to the top of the movelist, but how do you know? The variable name suggests that this move has been the best reply (or a cutoff reply) at a sibling of the current node. But is that really a good idea in qsearch? Would you consider to check whether disabling this piece of move ordering code may help?

I would also cross-check this part:
Code: Select all
        size_t sortStartOffset = reorderAndSortList(moves, putInFront);

        moves -> sort(moveSortVictimType, (void *)b, sortStartOffset);
as well as the PV handling. Regarding the latter, how do you organize your PVs? One per node, or only one global PV (which is usually incorrect in some cases, as has been discussed here a while ago)?

EDIT: Don't get me wrong ... but I tend to say that this qsearch implementation is a bit too complex and hard to read in general, even though there are many things in it that actually make sense.

EDIT 2 (> 15 minutes): "siblingPrevBestMove" seems to be dead code since it is always "invalidMove" (due to "curBestPrevMove = qsSelMove;" where "qsSelMove" is never set to something different from "invalidMove"). So this is not the reason for having slightly too many QS nodes.

Maybe your numbers (87% of all nodes are QS nodes) are not bad enough to search for a bug, you might simply need some smaller improvements, like skipping losing captures, to reach roughly 80% QS rate. But I would worry more about your overall branching factor which is somewhere between 5 and 7.

flok · Post by **flok** » Sat Dec 26, 2015 12:27 am

Sven Schüle wrote:Yes, I remember! After getting this new information I suspect a move ordering problem. You put something special to the front of the move list. What is that "siblingPrevBestMove"? At a first glance it does not look obvious to me, and its handling looks a bit suspicious. You pass it as an input parameter so you seem to know in advance "somehow" that it might be a good move that should be sorted to the top of the movelist, but how do you know? The variable name suggests that this move has been the best reply (or a cutoff reply) at a sibling of the current node. But is that really a good idea in qsearch? Would you consider to check whether disabling this piece of move ordering code may help?

I think you understand what it does yes but just to be sure:

For example

:

Consider the top of the tree, nodes 3/5/7/8; node 5 gets the best move found in node 3, node 7 gets the best of the one node 5 and so on.

I do my tests in two ways:
- cutechess-cli and let them play 200-1000 rounds (until + and - are in reasonable bounds)
- time-to-depth.py which invokes a search for a given ply-depth and do that 100-1000 times. then determine the average t-t-d (discarding the lower and upper 10% of the results)

I had tested that move-ordening for regular search but not for qs I now realise.

Did a quick-test with only 11 iterations for the t-t-d.
- all 3 move-to-fronts: 1.39s
- without sibling: 1.40s
- without tt-hit: 1.41s
- without pv: 1.41s

Hmmm.

Further testing showed that for a depth of 7 with 855176 qs-nodes, only 29638 had one or more moves to move to the front of the sort-list and in fact all of those 29638 had only 1 entry to move to the front.

Code: Select all

  29354 SRT tt 
    283 SRT pv tt 
      1 SRT pv

An other search (same starting position) shows:

2015-12-25 22:24:40.304816 nps 102673.789026 (q: 863579, reg: 126299, q limit: 0), total thread count: 4950, threads/s: 0.000513, qtt: 0, tt: 32543, nodes: 126299, nullmove: 2770, iid: 0

So 30k movelist-ordening-hits and the same amount of tt-hits, good.
No qs-tt hits, that is odd and another thing to look into.

Some other test shows:

Code: Select all

  76176 SRT hit
  22755 SRT nohit

Which means that apparently there are tt-result that are not in the movelist (hit is in the list, nohit means that the tt-result is not in the movelist). I would expect that to be not likely unless the hashing in my tt is very bad (regular zobrist, filled with std::uniform_int_distribution and std::random_device, checked for duplicates).

I would also cross-check this part:

Code: Select all

        size_t sortStartOffset = reorderAndSortList(moves, putInFront);

        moves -> sort(moveSortVictimType, (void *)b, sortStartOffset);

The sort is nothing special:

Code: Select all

void MoveList::sort(int (*compare_func)(const void *m1, const void *m2, void *arg), void *arg, const size_t startOffset)
{
        if (nIn)
                qsort_r(&moves[startOffset], nIn - startOffset, sizeof(Move), compare_func, arg);
}

and the reorderAndSortList also isn't - it moves the moves listed in that vector to the front of the list of moves:

Code: Select all

size_t reorderAndSortList(MoveList *const ml, std::vector<Move> & putInFront)
{
        size_t moveToPos = 0, pifSize = putInFront.size();

        if (pifSize == 0)
                return 0;

        const size_t mlSize = ml -> size();

        for(size_t i=1; i<mlSize; i++)
        { 
                for(size_t j=0; j<pifSize; j++)
                { 
                        if (moveCompareTo(ml -> at(i), putInFront.at(j)))
                        { 
                                if (i != moveToPos)
                                        ml -> swap(moveToPos, i);

                                moveToPos++;
                                putInFront.erase(putInFront.begin() + j);

                                if (--pifSize == 0)
                                        return moveToPos;
                                break;
                        }
                }
        }

        return moveToPos;
}

I verified that the ones moved to the front stay in the front, also after the sort.

Another test shows that it are mostly moves related to king:

Code: Select all

   1884 blackblack k	
    631 whitewhite k	
     30 whitewhite p	
     26 blackblack r	
     16 blackblack p	
      7 whitewhite r	
      6 blackblack q	
      5 blackblack b	
      2 blackblack n	
      1 whitewhite k	blackblack k

and that it is always a pair of black/black or white/white shows that most likely the move in the tt is pointing to the correct piece on the board. Or else it could sometimes point (fx/fy) to a piece from the other color.

For pv-ordening it is a different story:

Code: Select all

   2813 blackNULL	
   2137 whiteblack p	
    479 whiteNULL	
    314 blackwhite b	
     91 blackblack p	
     31 blackwhite q	
     18 
     16 NULL	
     13 black p	
      7 whiteblack p	black
      6 whiteblack b	
      5 whiteblack p	white
      4 whitewhiteblack p	black p	
      3 whiteblackblack p	NULL	
(and so on)

But yeah that makes sense. Maybe I should store a tt-hash together with each pv-entry. Or maybe have a seperate pv-tt.

as well as the PV handling. Regarding the latter, how do you organize your PVs? One per node, or only one global PV (which is usually incorrect in some cases, as has been discussed here a while ago)?

At the end of the tree, where the evaluate() is invoked, I start for each node there a new movelist. This movelist will contain the moves from the end of the tree al the way to the root. So each node below an other node takes one of the generated movelists, throwing away the rest and then passing that one to its caller (the movelist (or pv) for the selected move).

EDIT: Don't get me wrong ... but I tend to say that this qsearch implementation is a bit too complex and hard to read in general, even though there are many things in it that actually make sense.

Is it? I won't deny it, it could be. But I don't see any code that can be removed. Well maybe the seperate tt for qs-search (as we speak I'm running a cutechess-cli test between a version with, without and tscp (from which it wins occasionally) to see if it helps at all to have a seperate qs tt <--- I wrote that before I noticed that there were no qs-tt hits at all).

Hmmm.

------

In response to your second reply: that indeed is not so good. curBestPrevMove should have been updated to mDummy.
Time-to-depth with 101 tests for the fixed version: 1.42
The one with the fix: 1.44
So marginally if at all.
It has hits:

Code: Select all

 464701 SRT w/o
   6124 SRT with

...but only 1.3%.

Maybe it is the sort-criteria?

Code: Select all

inline int sortValue(const Move *const m, const Board *const b)
{
        ChessPiece *attacker = b -> getAt(m -> fx, m -> fy); // from
        ChessPiece *victim   = b -> getAt(m -> tx, m -> ty); // to

        int attackerValue = attacker -> getEvalVal();
        int victimValue = victim ? victim -> getEvalVal() : 0;
        int promoteValue = m -> promoteTo != NONE ? ChessPiece::evalVal[m -> promoteTo] : 0;

        return ((victimValue + promoteValue) << 16) | (32767 - attackerValue);
}

int moveSortVictimType(const void *pm1, const void *pm2, void *arg)
{
        const Move *m1 = (const Move *)pm1;
        const Move *m2 = (const Move *)pm2;
        Board *b = (Board *)arg;

        return sortValue(m2, b) - sortValue(m1, b);
}

Unless I made a bug here or there I think this is almost literally what the chess programming wiki specifies

Maybe your numbers (87% of all nodes are QS nodes) are not bad enough to search for a bug, you might simply need some smaller improvements, like skipping losing captures,

Right. That should not be too complicated to implement.
Does one do that usually recursive? Or just: "if I kill that piece, with what piece can I then be killed" and just that (e.g. no further search)?

to reach roughly 80% QS rate. But I would worry more about your overall branching factor which is somewhere between 5 and 7.[/quote]

reducing the number of evaluated positions in quiescence sea

reducing the number of evaluated positions in quiescence sea

Re: reducing the number of evaluated positions in quiescence

Re: reducing the number of evaluated positions in quiescence

Re: reducing the number of evaluated positions in quiescence

Re: reducing the number of evaluated positions in quiescence

Re: reducing the number of evaluated positions in quiescence

Re: reducing the number of evaluated positions in quiescence

Re: reducing the number of evaluated positions in quiescence