Updating the attack map is expensive, especially since restoring it to its old state will be almost equally expensive. So it would be a significant waste if we did it in vain, i.e. without using any of the information in it. This would happen when after the move we would have a stand-pat cutoff on the lazy evaluation. (For now we will assume that a full evaluation would need a fully updated attack map.) But futility pruning already takes care of that, by pruning moves that do not raise the lazy evaluation enough to get near alpha:
The other case where the info in the attack map is 'under-used' is basically the reverse of this: when our move raises the evaluation so much above beta that the opponent cannot possibly push it back below beta. But the condition
would not work, as it only means the opponent's stand-pat attempt is futile. The opponent could still have good captures (perhaps even of our King?) that push the lazyEval down (from the moving player's POV) a lot. So if we want to do a fail-high-without-search, we must also make sure all the opponent's captures are futile:
The question now is: would it be possible to decide on this without updating the attack map first? With an updated attack map the Overkill() test would be pretty easy:
gap = lazyEval - beta - MARGIN; // what opponent must capture to jeopardize our fail-high
victims = summary[OPPONENT][SIDE_THAT_MOVES]; // attack counters of our pieces
if(victims) {
int bit = BSF(victims);
int mvv = stm + 15 - (bit >> 2); // calculate number of piece corresponding to this counter
if(value[mvv] > gap) return 0;
}
return 1;
Now in practice this gets a bit more cluttered, because the summaries are split into Pawns and non-Pawns, both attacker-wise and victim-wise. So we would get
victims = summary[OPPO_P][STM_X] + summary[OPPO_X][STM_X]; // total attacks (by P and non-P) on moving player's non-P
if(victims) { // some of our non-P are attacked
int bit = BSF(victims); // extract most valuable
int mvv = stm + 15 - (bit >> 2);
if(value[mvv] > gap) return 0;
}
if(value[PAWN] < gap) return 1; // Pawn doesn't cut it
victims = summary[OPPO_P][STM_P] + summary[OPPO_X][STM_P]; // total attacks (by P and non-P) on moving player's P
if(victims) {
return 0;
}
return 1;
When we do this with a not-yet-updated attack map, there are several causes of error:
1) the identified victim that spoiled the fun could have moved away
2) that victim could have been attacked only by the piece that was just captured, so it is in fact now safe
3) the only capture of that victim was blocked by the move we played
4) the piece that we move could be attacked in its new location
5) the piece that moved away could have been soft-pinned, so that moving it exposes another piece
Now (3) can only happen on a non-capture, and since only 6% of the moves were non-captures, we don't worry about that case, and just never overkill-prune on those. (1) and (2) apply to attacks in the stale attack map that have become invalid. So when we find an attack through the above procedure, we would have to vet it for not being one of these two cases. And if it fails that test, we would have to keep looking for a victim that is really attacked. But none of that is very difficult:
victims = summary[OPPO_P][STM_X] + summary[OPPO_X][STM_X]; // total attacks (by P and non-P) on moving player's non-P
while(victims) { // loop through our attacked non-P
int bit = BSF(victims); // extract most valuable
int mvv = stm + 15 - (bit >> 2); // piece number of non-P for this counter
if(value[mvv] < gap) return 1; // this (and all future) not valuable enough
if(mvv != u->piece) { // the piece that just moved is no longer there!
int unit = victim2unit[mvv-16]; // mask with lowest bit of counter for this victim
int counter = victims & 15*unit; // isolate counter
if(counter > unit) return 0; // the piece is still there, valuable enough, and multiply attacked
// the piece is attacked only once; test whether the attacker survives the current move
int kind = u->victim >> 3 & 3; // determine kind of attacker (w/b, P/non-P)
if(attacker2bit[u->victim] != attackers[kind][mvv]) return 0; // was not captured this move
}
bit &= clearMask[bit];
}
if(value[PAWN] < gap) return 1; // Pawn doesn't cut it
victims = summary[OPPO_P][STM_P] + summary[OPPO_X][STM_P]; // total attacks (by P and non-P) on moving player's P
while(victims) { // loop through our attacked non-P
int bit = BSF(victims); // extract most valuable
int mvv = stm + 7 - (bit >> 2); // piece number of P for this counter
if(mvv != u->piece) { // the piece that just moved is no longer there!
int unit = victim2unit[mvv-16]; // lowest bit of counter for this victim
int counter = victims & 15*unit; // isolate counter
if(counter > unit) return 0; // the piece is still there, valuable enough, and multiply attacked
// the piece is attacked only once; test whether the attacker survives the current move
int kind = u->victim >> 3 & 3; // determine kind of piece that was captured (w/b, P/non-P)
if(attacker2bit[u->victim] != attackers[kind][mvv]) return 0; // this was not the attacker of the MVV
}
bit &= clearMask[bit];
}
return 1;
Umm, this already starts to become complex. Still far less work than updating + restoring the attack map, though, for which we would have to generate (amongst others) moves for 3 pieces (mover in old & new location and captured piece) both on update and restore. And very often not much of it has to be executed. It would be uncommon that the first piece extracted in one of the loops would be the moved piece, or would have the captured piece as its only attacker. So typically only a single iteration of one of the two loops would be performed.
The bad thing is that this was not all yet: we still need to address cases (4) and (5)! It was assumed we would do that first. Case (4) is the easier one: the moved piece will now have the former protectors of the piece that was captured as attackers. In other words, if the moved piece is valuable enough that its loss would spoil the fun, we have to test whether its victim was protected.
if(value[u->piece] > gap) {
int kind = u->victim >> 3 & 3; // determine kind of victim (w/b, P/non-P)
int protectors = summary[OPPO_P][kind] + summary[OPPO_X][kind]; // attack counters on pieces of this kind
int counter = 15*victim2unit[u->victim]; // counter bits for this piece
if(protectors & counter) return 0; // the counter was non-zero (i.e. the victim was protected)
}
That leaves case (5), the (soft-)pinning. For this we would have to loop through all opponent sliders that attacked the moved piece, and test whether what they hit instead belongs to the moving player, and is valuable enough. With the especially nasty case that what they hit now seems an opponent, but actually si the piece that the current move will capture, so that it is really the mover. (And the discovered slider attack that captured it was not amongst the protectors of the captured piece; it was an X-ray.) The upside is that only a small minority of the pieces will be attacked by an enemy slider. So very often there is nothing to do at all.
int sliders = attackers[OPPO_X][u->piece] & SLIDERS; // slider attackers of the moving piece
while(sliders) {
int bit = BSF(sliders);
int piece = xstm + 8 + (bit >> 2); // enemy piece corresponding to this bit in attackers set
int from = location[piece]; // where the attack was coming from
int d = direction[u->from-from]; // direction in which it was going
int to = neighbor[u->from].d[d]; // downstream target
if(to == u->to) { // moving piece discovered an attack on itself!
if(value[u->piece] > gap) return 0; // this spoils the fun
} else {
int target = board[to]; // what is found there
if(!(target & xstm)) { // it is a piece of the moving player (not enemy or edge guard)
if(value[target] > gap) return 0; // this also spoils the fun
}
}
sliders &= sliders - 1;
}
Note that the presented code for the Overkill routine is exactly what you have to do for testing whether a move exposes your King; only in that case 'gap' has to be set larger than Queen. But it is rather inefficient for that; it would be much faster to just determine the direction the evacuated square lies in from the King, and extend a virtual slider attack of the King to that square, testing whether it hits a slider moving towards it. The test whether the captured piece was protected could serve a dual purpose, though. By moving that out of Overkill(), and doing it for any move being made, we could use the information to abort King moves, as well as for refraining from searchless fail-highs:
The bottom line is that the Overkill() test is doable, and on average pretty fast. But that it will require a lot of (rather bug-prone) code to handle all special cases.
Well, I started implementing all this, for speed testing. But I pretty much have to rewrite everything. To start with something relatively simple, which I can hope to get quickly running without bugs, I made a routine MapFromScratch(). This runs through the piece list, and calls AddMoves() for every piece on the board to add its moves to an (initially initialized empty) attack map. I can use that as a reference for checking whether any incremental update reproduces it. But for now I just implemented a stack of attack maps, so it can create a fresh map in every node without disturbing the maps of any node closer to the root.
#define ATTACKERS(KIND, VICTIM) attackers[VICTIM - 16 + 32*(KIND)]
int attackersStack[12800];
int summaryStack[400][4];
int *attackers = attackersStack;
int (*summary)[4] = summaryStack;
int attacker2bit[] = {
1, 1<<4, 1<<8, 1<<12, 1<<16, 1<<20, 1<<24, 1<<28,
1, 1<<4, 1<<8, 1<<12, 1<<16, 1<<20, 1<<24, 1<<28,
1, 1<<4, 1<<8, 1<<12, 1<<16, 1<<20, 1<<24, 1<<28,
1, 1<<4, 1<<8, 1<<12, 1<<16, 1<<20, 1<<24, 1<<28,
};
int victim2unit[] = {
1<<28, 1<<24, 1<<20, 1<<16, 1<<12, 1<<8, 1<<4, 1,
1<<28, 1<<24, 1<<20, 1<<16, 1<<12, 1<<8, 1<<4, 1,
1<<28, 1<<24, 1<<20, 1<<16, 1<<12, 1<<8, 1<<4, 1,
1<<28, 1<<24, 1<<20, 1<<16, 1<<12, 1<<8, 1<<4, 1,
};
int AddMoves(int piece, int sqr, int addsub)
{ // generate moves for the piece from the square, and add or remove them from the attack map
int kind = piece >> 3 & 3;
int *sum = summary[kind]; // summary section for this kind of attacker
int *att = &ATTACKERS(kind, 0); // attack map section for attackers of this kind
int bit = attacker2bit[piece-16]*addsub; // addsub is 1 or -1, and determines we set or clear
if(code[piece] & 0200) { // piece is slider
int d, dir = firstSlide[slideOffs[piece-16] + sqr];
while((d = slides[dir++])) { // direction nr (+1)
int to = neighbor[sqr].d[d-1]; // the directions were offset by 1 to avoid the 0 sentinel
int victim = board[to];
if(victim != COLOR) { // ignore edge guards
int victimKind = victim >> 3 & 3;
int unit = victim2unit[victim-16]; // LSB of counter for this victim
sum[victimKind] += unit; // increment / decrement counter
att[victim] += bit; // set/clear the bit for this attacker
}
}
} else {
int v, dir = firstLeap[slideOffs[piece-16] + sqr];
while((v = leaps[dir++])) {
int to = sqr + v; // target square
int victim = board[to]; // occupant of that square
if(victim) { // ignore empty squares
//printf("to=%02x, victim=%d\n", to, victim);
int victimKind = victim >> 3 & 3;
int unit = victim2unit[victim-16]; // LSB of counter for this victim
sum[victimKind] += unit; // increment / decrement counter
att[victim] += bit; // set/clear the bit for this attacker
}
}
}
}
The first step will then be to implement the new method of capture generation & sorting using these from-scratch maps. To eventually search the extracted captures I already wrote a routine SearchCapture(), which can be called with a move (specified as (attacker, victim) rather than (fromSqr, toSqr), passed in the UndoInfo struct) to make, search and unmake it, and process the returned score. To make the latter possible the alpha of the caller (which tracks the best score in this fail-hard implementation) is now also put in the UndoInfo struct (under the name parentAlpha), so that SearchCapture() has access to it.
typedef struct {
long long int hashKey, oldKey; // keys
Neighbor savedNeighbor; // 8 neighbors
int pstEval, oldEval, curEval; // scores
int alpha, beta, parentAlpha; // scores
int from, to; // squares
int piece, victim, nextVictim; // pieces
int depth; // depth
} UndoInfo;
int SearchCapture(UndoInfo *u)
{ // make / search / unmake and score minimaxing all in one
// decode move
int score;
u->to = location[u->victim];
// detect futility at earliest opportunity
u->pstEval = u->oldEval + PST(u->victim, u->to);
if(u->depth <= 0 && u->pstEval < u->parentAlpha - MARGIN) return 1; // futility (child will stand pat)
// finish decoding move and updating evaluation
u->from = location[u->piece];
u->pstEval = -(u->pstEval + PST(u->piece, u->to) - PST(u->piece, u->from));
// update hash key, and possibly abort on repetition
u->hashKey = u->oldKey ^ KEY(u->piece, u->to) ^ KEY(u->piece, u->from) ^ KEY(u->victim, u->to);
// if(REPEAT) score = 0; else
// committed to move
{
// apply move to board
board[u->from] = 0;
board[u->to] = u->piece;
location[u->piece] = u->to;
location[u->victim] = CAPTURED;
Evacuate(u->from); // update neighbor table
attackers += 128; // reserve space for fresh attack map
summary += 4;
MapFromScratch(attackers, (int*) summary);
u->beta = -u->parentAlpha;
score = - Search(u);
// unmake move
board[u->from] = u->piece;
board[u->to] = u->victim;
location[u->piece] = u->from;
location[u->victim] = u->to;
attackers -= 128;
summary -= 4;
Reoccupy(u->from); // restore neighbor table
}
// minimax
if(score > u->parentAlpha) {
u->parentAlpha = score;
if(score + u->alpha >= 0) { // u->alpha = -parentBeta!
return 1; // signal cutoff to caller
}
}
return 0; // no cutoff requested
}
The generation of the attack map seems to work; I used it on the KiwiPete position, and visual inspection of the resulting map did not reveal any errors:
OK, some more results. Because the time needed to do an 8-ply search was becoming a bit short (leading to a large relative error in the timing), I switched to doing 9-ply searches. For the nps this makes no difference, as the result of the old version (based on the neighbor table) shows:
I now completed a version that generates the captures from an attack map. But it still creates the map from scratch in every node. Which of course defeats the purpose, as it generates these maps by normal (by attacker) move generation, and just stores them differently. So basically it is just using the intermediate stage of the map for sorting the captures it generates. So a slowdown can be expected. (Especially since it has to clear the entire map before it can start adding the attacks to it, something you don't have to do for a move list.) But even then my first test was disappointing:
Only 2.3Mnps! (But at least the same scores and PVs; since the move ordering is somewhat different we cannot expect exactly the same node counts.) But of course I was doing something stupid here: the MapFromScratch() calculated a complete attack map for both players, while the move generation only needs the attacks by the side to move. Having that for both players is only useful for incrementally updating it 2 ply later. So I changed it to only generate attacks for one player. After realizing that this would also require me to recalculate a map after null move, this gave the following result:
OK, so we dropped from 6.0Mnps to 4.3Mnps. Part of this is due to the need for clearing the map when generating a new one; when I still cleared the entire map (but then only added the attacks by pieces of the side to move) the speed was 3.4 Mnps (4.26 sec). Clearing only the part for stm attacks on opponent pieces (1/4 of the map) took off 0.92 sec. Extrapolating from that 0.31 sec of the 3.34 still goes into this clearing, and without that we would do slightly over 4.7Mnps. One reason we are slower is that the AddMoves() routine that generates moves for a single piece (adding those to the attack map) does not only generate true attacks, but also protections. (And typical chess positions are such that there usually are many more of the latter.) This is also done for the benefit of the incremental update (because protections turn into attacks when their target gets captured). But it is wasted effort now.
We will have to earn this investment back by generating the attack map of the node through incremental update of the one in the parent. Such an update would require only 3 calls to AddMoves() (two for the moved piece, one for the captured piece), instead of calling it for all pieces. Unfortunately we would have to make these same calls (well, with opposit sign for the modification) to restore the map to the old state. Copy-make doesn't seem very attractive, due to the large map side. There it really hurts that the map is stored in such a sparse format, using only every 4th bit in a word, to make merging easy. Possible cure for that is store the attackers and protectors in the same word (masking away the protectors before merging, through color-dependent move-generation code). Or switch to using every 2nd bit instead of every 4th, so that you can easily merge the attackers of only 2 victims, and do something complex in the (hopefully rare) case that 3 or 4 minors are attacked. Anyway, this is of later concern.
The code that achieved the above result is shown below. To avoid boiler-plate code I implemented the loops that merge the attackers sets and that finally play the moves as a preprocessor macro.
We are now in a position to replace the generation from scratch with an incremental update of the attack map. We will use a make-unmake strategy, i.e. the attack map will be modified in place. The order in which the changes are applied is critical here.
1) 'Disable' the captured piece (i.e. erase all its attacks from the map, possibly including the attack on the moved piece).
2) Disable the moved piece (including its attack on the piece it is now capturing).
3) Discover the slider attacks on the moved piece (from which any of the captured piece have already removed). This could create an attack on the piece we are capturing, which is still sitting on the board as 'dead wood'.
4) Replace the attacks on the moved piece by those on the captured piece. This includes copying of the attackers sets of the latter, as well as copying its summary counters to those of the moved piece.
5) Clear the summary counters of the captured piece. (The attackers sets can stay; these would not be consulted when the summary counter is 0.)
6) Add the moves of the moved piece from its new location.
For step (1) and (2) it is important that the mover and victim are still in their original location, or the attacks to be removed would be mapped to the wrong victim. That also applies for the captured piece in step (3). Step (4) and (5) do not reference the board, and can probably be merged, as when we are doing the bit juggling needed to copy one packed counter to another we might as well take the opportunity to clear the counter we are copying. During (6) the moved piece should no longer be on the from-square, or it would put a protection on itself in the attack map! So the board update will have to take place between (3) and (6).
AddMoves(u, u->victim, u->to, -1);
AddMoves(u, u->piece, u->from, -1);
Discover(u, u->piece, u->from);
int pKind = u->piece >> 3 & 3; // table lookups perhaps faster?
int vKind = u->victim >> 3 & 3; // always != pKind: opposit color
int pShift = victim2bitnr[u->piece]; // location of counter in summary word
int vShift = victim2bitnr[u->victim];
int pMask = ~(15 << pShift); // masks to clear counter bits
int vMask = ~(15 << vShift);
for(kind=0; kind<4; kind++) {
int offset = 32*kind - 16; // offset for attackers of this kind in attack map
aSave[kind] = attackers[offset + u->piece]; // save attackers on old piece location
attackers[offset + u->piece] = attackers[offset + u->victim];
pSave[kind] = summary[kind][pKind]; // save attacks counts on moved piece
vSave[kind] = summary[kind][vKind]; // save attacks counts on captured piece
summary[kind][pKind] &= pMask; // clear attacks counts on moved piece
summary[kind][pKind] |= (summary[kind][vKind] >> vShift & 15) << pShift;
summary[kind][vKind] &= vMask; // clear attacks counts on captured piece
}
// perform move on board
board[u->to] = u->piece;
board[u->from] = 0;
AddMoves(u, u->piece, u->to, +1);
For restoring the map we will need to save the data we irreversibly overwrite, i.e. the summary counters and attackers sets of the moved piece. But it is probably more efficient to remember what we modify, and safe all old values. The unmake would then be a simple loop that copies the stored info back to the map. Otherwise we would have to run through all the steps in reverse order, to reverse their operation. This means AddMoves() will have to safe index (or a pointer to) and value of every attackers set and summary word in which it puts or removes an attack.
That looks pretty simple, but how much work it is depends of course on how many elements got changed. Pieces can make up to 8 attacks, so that alone could give 24 modified attackers sets. In practice some of the attacks will hit the board edge. Or, in the case of leapers, hit empty squares. And some sliders have only 4 attacks in the first place, and Pawns only 2. So the typical number will probably be more like 12. And then there are the summary counters. There are only 16 of those in total. And 8 of them will change for sure, in the process of inheriting the attacks on the victim, and clearing the latter. (They might already have been 0, but to figure that out would probably take even longer.) And the number of attacks on pieces of other kinds could also change, made by the moved or captured piece, but also by discovered sliders of either color. So we might as well assume that all summaries change, and just copy the entire array. Which means that copy-make would be the most efficient way to handle the summaries.
But did you notice total number of nodes searched?
12878122 nodes (6.0 Mnps)
14354167 nodes (2.3 Mnps)
There is no point making it faster if it searches more nodes.
Mike Sherwin wrote: ↑Sat Apr 03, 2021 2:31 pm
But did you notice total number of nodes searched?
12878122 nodes (6.0 Mnps)
14354167 nodes (2.3 Mnps)
There is no point making it faster if it searches more nodes.
what if it also becomes much stronger
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Yes, I did notice that. But I am not sure this is a systematic effect. Tree size can be extremely sensitive to move ordering, in a very noisy way. Swapping the order of two PxN captures in some position can easily double the tree size. To judge whether one move ordering is better than an other you would really have to average the tree size for a thousand positions or so. On a single position the effect is almost entirely decided by luck.
I tried to keep the move ordering a similar as possible, by using a sort key like 16*victimValue - attackerNumber rather than 16*victimValue - attackerValue. So even for equal attackers it prefers to try the one with the lowest piece number first. But there was no predictable order between equivalent victims, if they could be captured by the same piece. It would just depend on the order in which the by-piece generation discovered them. I could try to modify the old version to also work the piece number of the victim in the sort key.
But the main difference in move ordering is because the Pawns were split into two groups, For the PxP captures this could still be mimicked by different MVV keys for the various Pawns. But for the non-P x P captures the order is not strictly LVA anymore. I don't expect this to have a systematic effect, though. You would have to order by SEE to do the best thing here, and be not such much dependent on luck for picking the capture of the unprotected Pawn rather than the protected Pawn. This would have to be tested. It would not be that costly in terms of execution time to force strict LVA order on the non-P x P captures (just extracting from the two merged groups of 4 Pawns simultaneously, and always trying the one with the LVA first). But even if there would be an improvement by this on average, I don't expect that improvement to be nearly as much as you would get by sorting on SEE. Or at least postponing those on protected victims, and then doing the multiply-attacked victims that were not protected by Pawns (could still have SEE > 0 if the protector is more valuable than the attacker), and then those Protected by Pawns (best result 2 Pawns for a piece), and finally those not multiply attacked (one Pawn for a piece). Or perhaps the last two categories should be pruned completely, in QS.
Mike Sherwin wrote: ↑Sat Apr 03, 2021 2:31 pm
But did you notice total number of nodes searched?
12878122 nodes (6.0 Mnps)
14354167 nodes (2.3 Mnps)
There is no point making it faster if it searches more nodes.
what if it also becomes much stronger
It is only a move ordering experiment. At one extreme we can do zero work ahead of time and just do a full move generation to list of moves and do a one pass sort to hopefully get the best move the first time. However, if the tt move or the pv move if there is no tt is played and causes a cutoff then zero pre work is done for all those moves not searched. At the other extreme a boatload of work can be done to incrementally keep attack tables up to date only for all that work to prove useless if the tt/pv move proves to be sufficient. For what Harm is trying to demonstrate to prove favorable it must result in an overall reduction in the number of nodes searched to reach depth in a time per node that does not cancel the benefits of searching fewer nodes. So on average per root position it must search fewer nodes and substantially so. So in this endeavor there must be a gain in search depth in a given time to be able to show a gain in strength. It will prove futile if time per node searched is 'doubled' if the tree size is not more than halved.
INHO this might work for regular search if it were possible that the number of Q-searches is greatly reduced. On the other hand if all that work is also done in Qsearch() then fewer Q-searches may not make a favorable trade off. Then there is the problem of the endgame that has far fewer pieces on the board and yet the amount of pre work stays the same, i.e. pre work / many pieces compared to pre work / few pieces is not going to be easy to justify.
Edit: The above logic supposes that there is some cheap moves like tt/pv or killer moves.
Or as Bob would say, do not do any work until it is necessary because it may not be necessary!
hgm wrote: ↑Sat Apr 03, 2021 3:11 pm
Yes, I did notice that. But I am not sure this is a systematic effect. Tree size can be extremely sensitive to move ordering, in a very noisy way. Swapping the order of two PxN captures in some position can easily double the tree size. To judge whether one move ordering is better than an other you would really have to average the tree size for a thousand positions or so. On a single position the effect is almost entirely decided by luck.
I tried to keep the move ordering a similar as possible, by using a sort key like 16*victimValue - attackerNumber rather than 16*victimValue - attackerValue. So even for equal attackers it prefers to try the one with the lowest piece number first. But there was no predictable order between equivalent victims, if they could be captured by the same piece. It would just depend on the order in which the by-piece generation discovered them. I could try to modify the old version to also work the piece number of the victim in the sort key.
But the main difference in move ordering is because the Pawns were split into two groups, For the PxP captures this could still be mimicked by different MVV keys for the various Pawns. But for the non-P x P captures the order is not strictly LVA anymore. I don't expect this to have a systematic effect, though. You would have to order by SEE to do the best thing here, and be not such much dependent on luck for picking the capture of the unprotected Pawn rather than the protected Pawn. This would have to be tested. It would not be that costly in terms of execution time to force strict LVA order on the non-P x P captures (just extracting from the two merged groups of 4 Pawns simultaneously, and always trying the one with the LVA first). But even if there would be an improvement by this on average, I don't expect that improvement to be nearly as much as you would get by sorting on SEE. Or at least postponing those on protected victims, and then doing the multiply-attacked victims that were not protected by Pawns (could still have SEE > 0 if the protector is more valuable than the attacker), and then those Protected by Pawns (best result 2 Pawns for a piece), and finally those not multiply attacked (one Pawn for a piece). Or perhaps the last two categories should be pruned completely, in QS.
I did not sleep well last night so I probably should not be posting on such a complicated subject. Anyway though the technique I use for move ordering that no one seems to want to try is if depth > 3 (> 2 may be better) is to do a reduced search with a widened window on each of the remaining moves. The formula (depth > 3) is depth/4. All I can do is say that it works gaining on average ~1.5 ply deeper searches in a given time.