Q-Search efficiency

lholmes135 · Post by **lholmes135** » Sat May 07, 2022 2:50 am

Hello everyone,
I am working on an engine and recently implemented q-search. My q-search is slow and I spend as much time in it as I spend in regular evaluation (although my regular evaluation is currently just material and piece-square tables). It is halving my number of nodes per second (I don't count nodes in q-search). I am looking for advice on how I can speed it up. My code is below, and a few notes on it:

-I don't use MVV/LVA or SEE. Instead, I order my moves by piece value. My pawn capturing an enemy queen I value at +800 priority units, and my rook capturing an enemy knight I value at -200 priority units. My king capturing anything at all is valued at +1000 priority units. Then I search the move with the highest priority units first. I feel like makes more sense than MVV/LVA where RxR is searched before PxB. I put the king captures at +1000 because I feel like any legal king captures tend to be very good.

-I store evaluation/alpha/beta in the node itself, rather than passing by value.

-The function generateCaptures not only generates captures, but orders them by the method I gave above.

Code: Select all

void quiescenceSearch(boardState* board) {
	
	assert(board != NULL);
	
	board->alpha = board->parentNode->alpha;
	board->beta = board->parentNode->beta;
	board->depth = board->parentNode->depth+1;
	
	evaluate(board);
	
	if (board->whoseTurn == WHITE) {
		if (board->evaluation > board->alpha) {
			board->alpha = board->evaluation;
		}
	}
	else if (board->whoseTurn == BLACK) {
		if (board->evaluation < board->beta) {
			board->beta = board->evaluation;
		}
	}
	if (board->alpha > board->beta) {
		return;
	}
	
	generateCaptures(board);
	
	boardState* node = board->childNode;
	
	while (node != NULL) {
		quiescenceSearch(node);
		
		if (board->whoseTurn == WHITE) {
			if (node->evaluation > board->alpha) {
				board->alpha = node->evaluation;
			}
			if (node->evaluation > board->evaluation) {
				board->evaluation = node->evaluation;
			}
		}
		else if (board->whoseTurn == BLACK) {
			if (node->evaluation < board->beta) {
				board->beta = node->evaluation;
			}
			if (node->evaluation < board->evaluation) {
				board->evaluation = node->evaluation;
			}
		}
		
		if (board->alpha >= board->beta) {
			deleteAllUnderneath(board);
			return;
		}
		node = node->nextNode;
	}
	
	deleteAllUnderneath(board);
	
}

hgm · Post by **hgm** » Sat May 07, 2022 8:00 am

For one, it does not make more sense. If you do PxB before RxR the opponent will usually reply with RxR (as the attack works both ways). You will have lost the exchange in these two ply, and now you have to search capturing that Rook anyway (in its new location). Or if you can't, see PxB fail low after having wasted time on it, and try RxR as an alternative anyway. OTOH, if you try RxR and the Rook was unprotected, you have gained a Rook. And if it gets recaptured, you folow it up by the PxB, which usually will not have gone away. (The chances that it was especially this Bishop that recaptured are small.)

So it makes sense that doing RxR before PxB is better. And guess what: hundreds of programmers have tried it in practice, and all found that it was.

To get higher nps, the simplest way is to also count QS nodes.

Most engines do. Some 85% of a typical middle-game tree consists of QS nodes.

expositor · Post by **expositor** » Sat May 07, 2022 9:01 am

I don't have much to add to hgm's answer but thought it might be useful to supply an example with concrete numbers.

For Expositor in midgame positions, approximately 30% of search nodes are from the interior of the main search tree, 40% of search nodes are leaf nodes of the main search tree (which are the roots of quiescing searches), and 30% are extension nodes, i.e. quiescing search nodes that are not the leaves of main search.

Static exchange analysis is reasonably cheap. Expositor and Admonitor use static exchange analysis for move ordering in both main search and quiescing search, and for both engines it accounts for 4% of the runtime in midgame positions. I have never quantified its overall effect on playing strength, but the effect is very significant.

This is not to say that you should aim for these numbers. I'm merely attempting to 1 corrobate hgm's point that it's not unusual for the majority of nodes to be in quiescing searches and 2 suggest that static exchange analysis is usually worth it (if you are interested in implementing the feature).

Other engine authors may chime in with additional examples.

(Best of luck, by the way, and welcome to TalkChess!)

Tearth · Post by **Tearth** » Sat May 07, 2022 10:56 am

Good SEE is a key, I've got a massive strength gain (like +100 Elo) in the past because my move ordering was so improved when I introduced it in the place of MVV-LVA - it's definitely the first thing if someone is looking for improvements and still doesn't have it done. Inanis uses it both in regular in quiescence search, similarly as expositor described.

hgm · Post by **hgm** » Sat May 07, 2022 1:10 pm

Note that the main effect of SEE is to postpone the search of High x Low captures when the victim is protected. Always postponing those is counterproductive, because most pieces in QS in an engine that uses null-move pruning are actually hanging. (But because you did not gobble them up yet, because you preferred null move.) But when you search the HxL capture of entirely unprotected pieces when their turn comes up in the MVV/LVA order, but postpone the other HxL captures ('BLIND'), you already have most of the beneficial effect a full SEE would give you. It doesn't occur very often that pieces are multiply attacked and multiply protected.

abulmo2 · Post by **abulmo2** » Sat May 07, 2022 6:08 pm

hgm wrote: ↑Sat May 07, 2022 1:10 pm Note that the main effect of SEE is to postpone the search of High x Low captures when the victim is protected.

To postpone the search of bad captures is only the half effect of SEE. SEE is also used to prune bad captures, particularly in quiescence search. In my two programs (Amoeba and Dumb) , these both usages of SEE are necessary to effectively gain some Elo.

abulmo2 · Post by **abulmo2** » Sat May 07, 2022 9:01 pm

My advices to make qsearch faster / more correct:
- use negamax instead of minmax. The code will be much simpler and less error prone. I am not sure your minmax implementation is correct. The min/max nodes are confused with the color of the player. I wonder if you are returning the bestscore or the last score when no alphabeta cutoff is found.
- When in check you should not return the evaluation but generate all (evading check) moves and play them until a cutoff is found.
- Use MVV/LVA to order your moves. It will reduce the size of the search tree compared to what you are using.
- You can use SEE to prune the bad captures during the quiescence search. SEE is slow to compute. It will make you program slower in term of nodes per second but it will diminish the size of the search tree
- Your node structure is probably more complicated and slower than passing alpha and beta as function parameters. On modern hardware and OS, the first parameters are passed by register (fast access), your structure use the memory (slow access).

Here is the quiescence search I am using in Dumb with comments :

Code: Select all

// qs is a member function of the Search class
// it returns the best score found according to the α (alpha) and β (beta) bounds.
int qs(int α, int β) {
		int s, bs; // s = score and bs = best score
		Moves moves = void; // list of move items; = void  make the list uninitialized 
		MoveItem i = void; // a move item (move + move score for sorting)
		Move m; // a move

		++qsNodes; // I count the nodes!

		if (board.isDraw) return 0; // if the position is a draw return 0; board is part of the search class

		bs = ply - Score.mate; // worst possible score at this depth
		s = Score.mate - ply - 1; // best possible score at this depth
		if (s < β && (β = s) <= α) return s; // mate pruning: β = min(s, β) and return if a cutoff is found

		if (!board.inCheck) { // if not in check
			bs = eval(board); // the best score is the static evaluation of the position
			if ((bs > α) && (α = bs) >= β) return bs; // α = max(bs, α) and return if a cutoff is found
		}

		if (ply == Limits.ply.max) return eval(board); // search is too deep, stop it now!

		moves.generate!false(board, history); // generate all captures and promotion to queen

		while ((m = (i = moves.next).move) != 0) if (board.inCheck || i.isTactical) { // for all move in the list if the position is in check or the move is good capture with see > 0
			update!false(m); // update the position and the evaluation
				s = -qs(-β, -α); // compute the score using the negamax approach of the quiescence search
			restore(m); // restore the position
			if (s > bs && (bs = s) > α && (α = bs) >= β) return bs; // bs = max(s, bs); α = max(α, bs) and return the score if a cutoff is found
		}
		return bs; // return the best score found so far
	}

lholmes135 · Post by **lholmes135** » Mon May 09, 2022 3:44 am

Thanks for all of the responses! It seems like the general advice is that I should drop my current move ordering scheme and replace it with SEE, and to pass alpha/beta by value rather than storing in the node structure. I still have a few questions though:
1. abulmo2 suggests I use MVV/LVA for move ordering, and SEE for pruning. I thought these were both move ordering schemes? With SEE I will calculate a SEE value for each square with an enemy piece, and then prioritize the highest values first, and skip those with a value of 0. Is this what you mean by pruning?
2. I will miss some good moves in this kind of situation: Say a square has an SEE evaluation of 0, so I skip it. However, partway through the exchange a discovered attack opens up on an enemy piece. I would have come out ahead due to this discovered attack. I assume this kind of miss is acceptable, and I should expect to find such tactics during the regular search instead?

Tearth · Post by **Tearth** » Mon May 09, 2022 9:50 pm

lholmes135 wrote: ↑Mon May 09, 2022 3:44 am 1. abulmo2 suggests I use MVV/LVA for move ordering, and SEE for pruning. I thought these were both move ordering schemes? With SEE I will calculate a SEE value for each square with an enemy piece, and then prioritize the highest values first, and skip those with a value of 0. Is this what you mean by pruning?

It can be both (and this is how I use it, no MVV-LVA at all), in quiescence search I prune all moves for which SEE has determined that the capture sequence will be probably a loss. It's also a part of futility pruning criterium (also called as "delta pruning", but I'm not a fan of this name).

lholmes135 wrote: ↑Mon May 09, 2022 3:44 am 2. I will miss some good moves in this kind of situation: Say a square has an SEE evaluation of 0, so I skip it. However, partway through the exchange a discovered attack opens up on an enemy piece. I would have come out ahead due to this discovered attack. I assume this kind of miss is acceptable, and I should expect to find such tactics during the regular search instead?

I would say yes - in principle, SEE is never 100% perfect because primarily it should be fast, so we can use it as a tool to check if the capture sequence has a high probability to be good or bad for us. Some small errors aren't that catastrophic, since the search tree is usually very resistant and you really can't break it easily even if the SEE will sporadically return a false outcome.

hgm · Post by **hgm** » Wed May 11, 2022 10:30 am

abulmo2 wrote: ↑Sat May 07, 2022 6:08 pm
hgm wrote: ↑Sat May 07, 2022 1:10 pm Note that the main effect of SEE is to postpone the search of High x Low captures when the victim is protected.
To postpone the search of bad captures is only the half effect of SEE. SEE is also used to prune bad captures, particularly in quiescence search. In my two programs (Amoeba and Dumb) , these both usages of SEE are necessary to effectively gain some Elo.

Well, when you know a piece is protected, you usually also know what protects it. If that doesn't make up for the HxL value difference, there is no way SEE can be non-negative, and you can in any case prune those. Also, if there is no second attacker H x protected L is always bad, and you could have kept track of that.

The thing that annoys me most is that for detecting whether a piece is protected you have to generate opponent captures. In orthodox Chess there still are shortcuts for that (compared to full move generation), e.g. the super-piece method. I am often dealing with chess variants where the super-piece method is not competitive, because attacks can come from unexpected directions (like from behind other pieces, as with a Cannon in Chinese Chess, or from sliders that move around corners).

One could try to save some time by calculating it 'semi-incrementally': generate moves for the opponent in the current position (as you might do for evaluating mobility), so that you know what was protected before the move. And then for each individual capture test whether that capture would alter the protected status of its victim. I think that in orthodox Chess a capture can only add a protector to its victim, by discovering an X-ray protector, and never subvert existing protection. A Cannon protector could be subverted if it was using the attacker as a mount, so you would then have to know if it was the only protector. And for sliders with bent trajectories finding an X-ray protector can again be expensive.

So I was thinking: what about searching the HxL capture 'speculatively': start a normal search, which starts with move generation, but abort the node (failing low) if this turns up a move that recaptured. (As if the capturing piece became a king temporarily.) The caller could see the search was aborted by the returned code. (Only values between -32K and +32K would be valid scores, so returning an int leaves plenty of invalid codes for signalling). It could then decide either to prune or reschedule the search of that capture, depending on whether there is a second attacker and the value of the lowest protector. (Which would also be identified in the return code.)

The obvious inefficiency here is that when you postpone the search, you would redo the move generation. But I wonder if this couldn't be avoided. I usually store the generated moves on a global stack. Each invocation of Search() would then remember the old stack top, and push its own moves on top of it. Just before it returns it then pops its own move list.

So what if you would not pop the move list on aborting a search, but return its start address to the caller? This would accumulate a number of daughter move lists on the move stack above that of the caller, one for each aborted HxL capture. But when the caller returns it would pop its own move list, and everything above it with it. For each move that was considered worth postponing the caller could remember where the move list of the daughter starts, and pass that info to Search() when it decides to search the move after all. When Search() would be passed a null pointer for its move list, it would generate moves to create such a list on the top of the move stack. But otherwise it could skip move generation, and just use the list it was passed. Basically it would just resume right after the point where it was aborted.

Q-Search efficiency

Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency

Re: Q-Search efficiency