Move ordering contest

Don · Post by **Don** » Mon May 27, 2013 6:15 pm

hgm wrote:I never was able to make any sense of move-ordering statistics until I started taking them by depth remaining. At d=2 you will see completely different stats than at d=8, for instance.

I am not sure why the ordering would be more important at higher depth. In the end the number of leaves is just the product of the branching factor at each level, so it seems unimportant where you achieve the reduction. Or is it that at large depth a better order also causes more depth reduction?

I think that is an important point and I have seen this too.

michiguel · Post by **michiguel** » Mon May 27, 2013 6:39 pm

hgm wrote:I never was able to make any sense of move-ordering statistics until I started taking them by depth remaining. At d=2 you will see completely different stats than at d=8, for instance.

I am not sure why the ordering would be more important at higher depth. In the end the number of leaves is just the product of the branching factor at each level, so it seems unimportant where you achieve the reduction. Or is it that at large depth a better order also causes more depth reduction?

It is more important to spend some effort to get a better ordering a higher depths just because it is possible to make it better (some techniques are not even possible near the leaves). Near the leaves the ordering is going to be worse in a better engine. So, screwing the ordering near the leaves has a less impact than screwing it far from the leaves (in current engines).

Better engines will shine at move ordering far from the leaves, and that will not be noticed too much in the ratios we are calculating.

Miguel

Don · Post by **Don** » Mon May 27, 2013 6:40 pm

michiguel wrote:
hgm wrote:I see no reason to count the null move as zero. It is a move that you search. Whether it is legal or not doesn't affect the tree size.

The count it as one, but it should be counted. That is my point. If you have a program without nullmove, it would appear that the move ordering is better since most likely any move that is tried first will get you a cut off. Not taking into account these nodes, penalizes null move programs.

Miguel

There is also no requirement to search the null move first. That is also a programmers decision.

If you can order the real moves such that you can get more cutoffs from the null move, it will shrink the tree much more than when you order the real moves to get an equal increase of the number of cutoffs of the first real move.

Miguel,

When the discussion of move ordering comes up it's all about how the moves are sorted, not some pre-test for pruning. If there is an attack, then the null move is going to fail and we are not judging the quality of the artificially produced "null move", so it's not an issue and not tied to the move generation. That would make the hash table move, if it exists the SECOND move which gives this very odd situation that even if you have PERFECT move ordering, you would still be getting cutoffs on the SECOND move. This is a big part of the reason that the NULL move is NOT a real move. In fact, as I have stated in other posts, it's only a side-effect of some methods used for determining if there is an attack. Null move is really threat move detection and is a black box.

If we extended this to quies, where we sometimes take a "stand pat" cutoff based on the score, do we count the standpat cutoff as a "null" move? The only difference in null move and stand pat in quies is that we are more fussy and that most program implementations don't actually play a "null move" in quies. I would hate to make this semantic and arbitrary distinction.

Some programs don't use null move pruning and use multi-cut instead. Even though the moves in multi-cut really are "real moves" I would not count them either because they inside the blackbox of threat detection. For example the very best move might not give you the cut-off but then when it's played in the main search it could. So if you use multi-cut with 3 moves would you honestly count the first move in the main search as the 4th move?

We are getting hung up on the unimportant detail of whether the null move is a real move or not instead of respecting the fact that it's a threat detection device with many possible implementations. Note that null move pruning is not even sound from a computer science point of view. It's practical and it boost ELO in some games but it's provably not correct.

For a practical consideration one would have to determine if this implementation actually gives you superior data. I know that with perfect move ordering the score of my test (the average move number where a cutoff occurs IF a cutoff occurs) will always comes to exactly 1. If we decide to count the first real move as 2 if the null move fails, I don't think it improves the actual measurement. Do you actually believe this gives a better overall picture? Please note that regardless of how good the first move is, you WILL not get a cutoff with null move if there is large enough threat against you but you will if this is a cut node on the first move. So do you still think this is a superior way to measure the quality of the move ordering by counting the first move as the second move in cases where there is a threat?

hgm · Post by **hgm** » Mon May 27, 2013 7:12 pm

Don wrote:That would make the hash table move, if it exists the SECOND move which gives this very odd situation that even if you have PERFECT move ordering, you would still be getting cutoffs on the SECOND move.

Doesn't sound very perfect to me, if you waste time on the null move. The PERFECT move ordering would search the hash move first, in cases where the null move would fail low.

In practice searching the hash move before the null move is not as bad as it might seem. Because as long as there were no earlier visits, or in all earlier visits the null move did cause the cutoff, there would not be any hash move. So you would search null move until it turns out that it is no no good, and then another move might safe the day, and will become hash move. It doesn't sound crazy to start with that move next time, in stead of with a null move that already failed you before.

I think the only valid point you have is that it might not be fair to count heavily reduced moves the same as unreduced moves. Whether they are part of a threat-detection device or of the legal-move tree is immaterial. With multi-probe cut you can also waste a lot of time on doing probes with poor move ordering.

Don · Post by **Don** » Mon May 27, 2013 7:15 pm

hgm wrote:
Don wrote:That would make the hash table move, if it exists the SECOND move which gives this very odd situation that even if you have PERFECT move ordering, you would still be getting cutoffs on the SECOND move.
Doesn't sound very perfect to me, if you waste time on the null move. The PERFECT move ordering would search the hash move first, in cases where the null move would fail low.

We might take a cutoff on the hash table score before doing the null move, but searching the hash table move is a lot more expensive than doing the threat detection. So I doubt that searching the hash move first is going to be very good but I cannot say I have tried this.

In practice searching the hash move before the null move is not as bad as it might seem. Because as long as there were no earlier visits, or in all earlier visits the null move did cause the cutoff, there would not be any hash move. So you would search null move until it turns out that it is no no good, and then another move might safe the day, and will become hash move. It doesn't sound crazy to start with that move next time, in stead of with a null move that already failed you before.

hgm · Post by **hgm** » Mon May 27, 2013 7:20 pm

Don wrote:We might take a cutoff on the hash table score before doing the null move, but searching the hash table move is a lot more expensive than doing the threat detection. So I doubt that searching the hash move first is going to be very good but I cannot say I have tried this.

Whether it would be very good or not is no reason to exclude it from counting. I am sure you will agree it would also not be very good to not search the hash move before all other real moves. Yet you objected against not counting the hash move...

Don · Post by **Don** » Mon May 27, 2013 7:47 pm

hgm wrote:
Don wrote:We might take a cutoff on the hash table score before doing the null move, but searching the hash table move is a lot more expensive than doing the threat detection. So I doubt that searching the hash move first is going to be very good but I cannot say I have tried this.
Whether it would be very good or not is no reason to exclude it from counting. I am sure you will agree it would also not be very good to not search the hash move before all other real moves. Yet you objected against not counting the hash move...

You seem more interested in mincing words or arguing over meaningless semantic distinctions than you helping find an accurate method for measuring move ordering efficiency. Anyone interest in this at all is probably mostly concerned with the history heuristic and the sorting of capture moves and such.

So what we are discussing here is how to get accurate instrumentation of move ordering efficiency, not debating our personal views of whether the "null" move should rightfully be considered a real move or not or how it should be ordered in the move list.

By considering it part of measurement you have to deal with nasty ambiguities such as how to deal with those who do something else in place of null move for threat detection. That is automatically and artificially going to make null move look like horrible move ordering - because it will fail a lot even at cut nodes and add 1 to the move count for all the rest of the moves even when a cutoff is possible.

99% of the time the null move is the WORST move you can play, assuming you consider it a real move as you do. So how stupid to play the worst move first? I don't want to get ugly here but it has no part in this type of instrumentation - it's just not relevant and in fact it's misleading. It generally has a non-admissible reduced depth search behind it, it's based on position that does not even naturally arise from the base position and as such it has no place here.

The hash move has little in common with the null move. Why don't you argue that the killers should not be counted as a move ordering device?

Rein Halbersma · Post by **Rein Halbersma** » Mon May 27, 2013 7:49 pm

hgm wrote:I never was able to make any sense of move-ordering statistics until I started taking them by depth remaining. At d=2 you will see completely different stats than at d=8, for instance.

I am not sure why the ordering would be more important at higher depth. In the end the number of leaves is just the product of the branching factor at each level, so it seems unimportant where you achieve the reduction. Or is it that at large depth a better order also causes more depth reduction?

Perhaps it has something to do with the percentage of nodes that need internal iterative deepening to find a hash move? Depending on your hash replacement, shallow remaining depth nodes might need it more often than nodes higher up in the tree.

bob · Post by **bob** » Mon May 27, 2013 7:50 pm

Don wrote:
michiguel wrote:
hgm wrote:I see no reason to count the null move as zero. It is a move that you search. Whether it is legal or not doesn't affect the tree size.

The count it as one, but it should be counted. That is my point. If you have a program without nullmove, it would appear that the move ordering is better since most likely any move that is tried first will get you a cut off. Not taking into account these nodes, penalizes null move programs.

Miguel

There is also no requirement to search the null move first. That is also a programmers decision.

If you can order the real moves such that you can get more cutoffs from the null move, it will shrink the tree much more than when you order the real moves to get an equal increase of the number of cutoffs of the first real move.
Miguel,

When the discussion of move ordering comes up it's all about how the moves are sorted, not some pre-test for pruning. If there is an attack, then the null move is going to fail and we are not judging the quality of the artificially produced "null move", so it's not an issue and not tied to the move generation. That would make the hash table move, if it exists the SECOND move which gives this very odd situation that even if you have PERFECT move ordering, you would still be getting cutoffs on the SECOND move. This is a big part of the reason that the NULL move is NOT a real move. In fact, as I have stated in other posts, it's only a side-effect of some methods used for determining if there is an attack. Null move is really threat move detection and is a black box.

If we extended this to quies, where we sometimes take a "stand pat" cutoff based on the score, do we count the standpat cutoff as a "null" move? The only difference in null move and stand pat in quies is that we are more fussy and that most program implementations don't actually play a "null move" in quies. I would hate to make this semantic and arbitrary distinction.

Some programs don't use null move pruning and use multi-cut instead. Even though the moves in multi-cut really are "real moves" I would not count them either because they inside the blackbox of threat detection. For example the very best move might not give you the cut-off but then when it's played in the main search it could. So if you use multi-cut with 3 moves would you honestly count the first move in the main search as the 4th move?

We are getting hung up on the unimportant detail of whether the null move is a real move or not instead of respecting the fact that it's a threat detection device with many possible implementations. Note that null move pruning is not even sound from a computer science point of view. It's practical and it boost ELO in some games but it's provably not correct.

For a practical consideration one would have to determine if this implementation actually gives you superior data. I know that with perfect move ordering the score of my test (the average move number where a cutoff occurs IF a cutoff occurs) will always comes to exactly 1. If we decide to count the first real move as 2 if the null move fails, I don't think it improves the actual measurement. Do you actually believe this gives a better overall picture? Please note that regardless of how good the first move is, you WILL not get a cutoff with null move if there is large enough threat against you but you will if this is a cut node on the first move. So do you still think this is a superior way to measure the quality of the move ordering by counting the first move as the second move in cases where there is a threat?

My take on this is we are having a tempest in a tea pot. Move ordering is about one thing, and one thing only, searching a move first that will produce a cutoff when a cutoff is going to happen. At ALL nodes, ordering is irrelevant. As are all the pruning decisions, null-move, reductions, extensions and such. Those are all attempts to shape the tree in a favorable way so that you see deeper when it is important, less deeply when it is not. But move ordering is STILL just simple move ordering.

NULL move really can't be searched anywhere but first. That's the point. You expect to fail high, and you use a null-search to reduce the effort to produce the cutoff, as opposed to searching the first move normally (higher effort).

What one does to reduce the size of the tree is a completely separate issue from move ordering. Move ordering ought to consider nothing other than actual moves searched, real chess moves only. Getting the right move first is not the same as getting the best move first. This was pointed out when ETC was first used. You want a cutoff with minimal effort. Not necessarily the best move. But all we can measure is "does the first move cause a cutoff when it should?"

Adding all the other stuff just clouds the basic issue. I agree that comparisons between programs is difficult. Just as comparing depth makes no sense nowadays.

However, it is still an important measure to determine the effectiveness of move ordering once one gets past the tricks and stuff and actually has to start searching real moves in a normal way.

petero2 · Post by **petero2** » Wed May 29, 2013 9:18 pm

petero2 wrote:
Don wrote:
petero2 wrote:
Don wrote:So I propose we try just averaging the number of moves we had to play to get a cutoff - if no cutoff we don't average anything.

Anyone game?
I get this:
Code: Select all
pos  cutoffs     tries     bf  move  bm
1   10233881  12335221  1.205  exd4  Rxb4
2    7713263   9606398  1.245  O-O   O-O
3    7540416   8878716  1.177  Be2   Bxe4
4    7178019   9486906  1.322  Bxg6  Bxg6
5    8093288   9510076  1.175  b4    b4
6   10000027  11317711  1.132  f3    f3
7   13452869  15645366  1.163  Kg3   Kg3
8   13682823  15647401  1.144  Kg3   Kg3
9    8559713   9923587  1.159  c5    c5
10   8891653  10324336  1.161  Be2   Nxe6
I hope this means that Komodo's move ordering could be improved. This would be really good news for me. It looks like your numbers are better than mine. But somehow I expect that it will not turn out that way. Probably some detail of how pruning or LMR is done will make this invalid for comparison from one program to another.
It's hard to tell what is causing this. One possible explanation (just a guess) is that my program, being much weaker than your program, spends more time at uninteresting parts of the search tree, where good move ordering might be easier to achieve.

I disabled razoring, futility pruning and reverse futility pruning in my program and repeated the test:

Code: Select all

pos  cutoffs firstMove    fh%  move  bm
1   12870889  12038268  93.53  Rxb4  Rxb4
2   11787485  11353698  96.32  O-O   O-O 
3   12082633  11727666  97.06  Bxe4  Bxe4
4   11200972  10538521  94.09  Bxg6  Bxg6
5   11282575  10794627  95.68  b4    b4	 
6   12586754  12090930  96.06  f3    f3	 
7   15946205  14852428  93.14  Kg3   Kg3 
8   15371962  14490620  94.27  Kg3   Kg3 
9   11617640  11104048  95.58  c5    c5	 
10  12167017  11742638  96.51  Nxe6  Nxe6

This supports the theory that examining uninteresting nodes improves the "fail high on first move" percentage, which could also explain why stronger programs had lower percentages in this test: http://www.top-5000.nl/moresu.htm

Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest

Re: Move ordering contest