Interesting null-move test

hgm · Post by **hgm** » Fri Mar 26, 2010 1:51 pm

metax wrote: edit: In KRPKR the opponent would, of course, never sac his rook for the pawn, so that example is unrealistic.

Of course he would, when the Pawn promotes...

metax · Post by **metax** » Fri Mar 26, 2010 2:09 pm

hgm wrote:
metax wrote: edit: In KRPKR the opponent would, of course, never sac his rook for the pawn, so that example is unrealistic.
Of course he would, when the Pawn promotes...

OK, when the pawn promotes. But as I pointed out before, KRK is also winnable with null move, even though it takes a few moves longer to mate.

hgm · Post by **hgm** » Fri Mar 26, 2010 2:34 pm

This is really amazing. Perhaps hash grafting saves the day. I don't see how a pure search could ever come up with anything.

Desperado · Post by **Desperado** » Fri Mar 26, 2010 2:42 pm

hgm wrote: No way you would gain much from this. So close to the leaves the null move cannot buy you much reduction anymore. And even if it would significantly reduce the number of nodes you need, you get that at the expense of the score being unreliable, because you would have missed all zugzwangs. If that helps, you might as well return a random score without any search, if you get into a position where you would normally switch off null move.

OK, I can follow you.

However...

1. Assuming we can reduce the number of nodes significantly
(because I did not mention how close to the leaves we are,
and further assuming we have a high _drop in_ rate).

2. The evaluation will never act like a complete random generator
(so evaluation can assure at least a _minimal_ progress).

3. Other abilities of the search, like hashing and repetition detection,
may help out of the worst-case scenarios.

So for now, I would like to keep believing that plain null move can gain
plies in middlegame-to-endgame transitions.
And even once a late endgame like KRK is reached, because of points 2 and 3
I don't see _forced_ problems (I think it only _can_ be a problem).
Maybe Crafty's search/evaluation is smart enough to handle this.

(That would be at least one (possible?) explanation besides a truly flawed test.)

Michael

bob · Post by **bob** » Fri Mar 26, 2010 8:17 pm

hgm wrote: Let me get this straight:

You use plain null move, no verification? So if you end up in KRK, it is now a draw. And if you end up in things like KRPKR it becomes almost impossible to win, as the opponent would simply sac his Rook for the remaining Pawn, and you would not know how to checkmate him with a Rook up.

Why would that be? It finds KR vs K checkmates trivially. I just tried it in a simple position with kings and rook in the center of the board. The evaluation knows to drive the losing king to the edge of the board, where it finds mate quickly enough...

Most won Pawn endings would also be bungled; KPK is no longer a win if the passer is not out of reach.

You are focusing on positions, I am focusing on entire games. All I can say at the moment is that null-everywhere is no worse than restricting it to only being used when the side on move has a piece or more. I can run the match to 1/4 million games to get a more accurate Elo measurement, which I will start right now. But it certainly is within a couple of Elo either way.

Seems to me that this should cost you quite a bit more than 2 Elo. Rook endings are very common (> 10% ?), and not being able to win any of those should at least cost you a few percent in score, and each % corresponds to 7 Elo. So a 20-50 Elo drop would have to be expected.

The test must be flawed somehow...
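The "each % corresponds to 7 Elo" rule of thumb quoted above can be checked with a small sketch. It assumes the usual logistic Elo model, in which the expected score is p = 1 / (1 + 10^(-d/400)); near a 50% score the curve is close to linear with slope 1600 / ln(10) Elo per unit of score. The function name `elo_from_percent` is hypothetical and only serves this illustration.

```c
/* Natural logarithm of 10, written out to avoid depending on <math.h>. */
#define LN10 2.302585092994046

/* Linearised Elo difference for a match score given in percent
 * (50 = even match). Assumes the logistic Elo model; the slope at
 * p = 0.5 is 1600 / ln(10), roughly 695 Elo per unit of score,
 * i.e. roughly 6.95 Elo per percentage point. */
static double elo_from_percent(double score_percent)
{
    double elo_per_percent = 1600.0 / LN10 / 100.0;
    return (score_percent - 50.0) * elo_per_percent;
}
```

Under this approximation a loss of 3 to 7 percentage points maps to roughly 21 to 49 Elo, which is where an estimate of a 20-50 Elo drop would come from. The linearisation is only reasonable close to 50%.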

The rook issue is not an issue apparently. Anyone can take current Crafty source, go to search.c, remove that single restriction on when null-move search is done (here is the code before/after):

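    /* before: null move only when the side on move still has a piece */
    if (do_null && alpha == beta - 1 && depth > 1 && !tree->inchk[ply] &&
        TotalPieces(wtm, occupied)) {

    /* after: material restriction removed */
    if (do_null && alpha == beta - 1 && depth > 1 && !tree->inchk[ply]) {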

Pretty obvious what was removed. Nothing else was changed. Same positions, same hardware, etc. I ran the above test twice and got within 2 Elo the second time around (second time was 1 Elo better than the first run, not significant)... Can't say much more other than this appears to be another episode of "myth-busters"...
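For readers without the Crafty source at hand, the two conditions can be compared in a standalone sketch. Everything here is a hypothetical stand-in: `struct node_state` and `try_null` are not Crafty identifiers, and the `pieces` field is only assumed to play the role of `TotalPieces(wtm, occupied)` in the original test.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the state the original condition reads. */
struct node_state {
    bool do_null;     /* null move has not been disabled for this node */
    int  alpha, beta; /* current search window */
    int  depth;       /* remaining depth in plies */
    bool in_check;    /* side on move is in check */
    int  pieces;      /* assumed: piece material of the side on move */
};

/* Returns true when a null-move search would be tried.
 * require_piece = true  mirrors the "before" condition,
 * require_piece = false mirrors the "after" (null-everywhere) one. */
static bool try_null(const struct node_state *s, bool require_piece)
{
    bool base = s->do_null && s->alpha == s->beta - 1 &&
                s->depth > 1 && !s->in_check;
    return require_piece ? (base && s->pieces > 0) : base;
}
```

In this sketch the two variants differ only at null-window nodes where the side on move has no piece left, which is the case the thread is arguing about.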

bob · Post by **bob** » Fri Mar 26, 2010 8:18 pm

hgm wrote: In middle-game positions all the versions use null move, there is no difference there. So how could you gain anything back?

What about when you are trading down, so that out in the tree you use null move more often without the restriction? In the early stages of the game it is not so important, but for every piece that comes off, you get closer to the point where one stops doing nulls while the other does not. Clearly there is something to offset the complete inability to deal with zugzwang.

bob · Post by **bob** » Fri Mar 26, 2010 8:34 pm

hgm wrote: No way you would gain much from this. So close to the leaves the null move cannot buy you much reduction anymore. And even if it would significantly reduce the number of nodes you need, you get that at the expense of the score being unreliable, because you would have missed all zugzwangs. If that helps, you might as well return a random score without any search, if you get into a position where you would normally switch off null move.

Things are not always what they seem.

A few examples (searched to fixed depth varying by position to produce comparable search times from position to position).


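| position | normal null time | null everywhere time |
|----------|------------------|----------------------|
| kopec 23 | 29.81 | 13.67 |
| kopec 24 | 28.68 | 38.94 |
| Belle.81 | 55.00 | 9.85 |
| wac2 | 20.67 | 9.65 |
| wac230 | 16.97 | 2.81 |
| ACC4 | 37.58 | 1.81 |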

I assume everyone can find the kopec and wac positions I tried. ACC4 is a mate in 10 from the book Advances in Computer Chess 4, which starts around move 14 in the game (human GM vs computer, 1960-era game I believe). Belle.81 is a position from Cray Blitz vs Belle, ACM tournament, 1981, that still has queens on, but the game can easily reach an endgame. It is a forced draw if White sacs a bishop.

There is a pretty significant gain. I just picked the above at random. One of them was worse with the null-everywhere, the rest were better.
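The size of the gain per position can be read off by dividing the two columns. The sketch below does that with the figures from the table; the struct and function names are hypothetical, and the first null-everywhere value is taken to be 13.67.

```c
#include <stddef.h>

/* One row of the timing table: search time with the normal null-move
 * restriction and with null move allowed everywhere. */
struct timing {
    const char *name;
    double normal;
    double everywhere;
};

static const struct timing timings[] = {
    {"kopec 23", 29.81, 13.67},   /* decimal point assumed */
    {"kopec 24", 28.68, 38.94},
    {"Belle.81", 55.00,  9.85},
    {"wac2",     20.67,  9.65},
    {"wac230",   16.97,  2.81},
    {"ACC4",     37.58,  1.81},
};

#define N_TIMINGS (sizeof timings / sizeof timings[0])

/* Ratio > 1 means null-everywhere reached the fixed depth faster. */
static double speedup(const struct timing *t)
{
    return t->normal / t->everywhere;
}

/* Number of positions where null-everywhere was the faster variant. */
static size_t count_faster(void)
{
    size_t n = 0;
    for (size_t i = 0; i < N_TIMINGS; i++)
        if (speedup(&timings[i]) > 1.0)
            n++;
    return n;
}
```

With these numbers the ratio is above 1 for five of the six positions and below 1 only for kopec 24. Six hand-picked positions are of course far too few to say anything about playing strength.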

Not that any of this matters at all when one is testing with 30,000 games and relying on that result. Positions don't matter if they don't come up often, while gaining a ply here or there can more than offset zugzwang stupidities...
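How much a 30,000-game match can resolve may be estimated with a rough sketch. It assumes independent games, a score near 50%, and the worst case of no draws (per-game standard deviation 0.5; draws make the margin smaller), and it converts score to Elo with the linearised logistic model. The names `sqrt_newton` and `elo_margin_95` are hypothetical helpers, not taken from any testing tool.

```c
#define LN10 2.302585092994046

/* Square root by Newton iteration, to avoid linking against libm.
 * Intended for x > 0 only. */
static double sqrt_newton(double x)
{
    double r = x > 1.0 ? x : 1.0;
    for (int i = 0; i < 60; i++)
        r = 0.5 * (r + x / r);
    return r;
}

/* Approximate 95% error margin, in Elo, of a match result over
 * `games` games. Worst case: every game is a win or a loss. */
static double elo_margin_95(double games)
{
    double se_score = 0.5 / sqrt_newton(games);  /* std. error of the score */
    double elo_per_unit_score = 1600.0 / LN10;   /* slope at a 50% score */
    return 1.96 * se_score * elo_per_unit_score;
}
```

Under these assumptions the margin is a little under 4 Elo for 30,000 games and a little under 1.4 Elo for a quarter of a million, so two runs that differ by 1 or 2 Elo are indistinguishable at the smaller sample size.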

bob · Post by **bob** » Fri Mar 26, 2010 8:40 pm

Uri Blass wrote:
hgm wrote: Let me get this straight:

You use plain null move, no verification? So if you end up in KRK, it is now a draw. And if you end up in things like KRPKR it becomes almost impossible to win, as the opponent would simply sac his Rook for the remaining Pawn, and you would not know how to checkmate him with a Rook up.

Most won Pawn endings would also be bungled; KPK is no longer a win if the passer is not out of reach.

Seems to me that this should cost you quite a bit more than 2 Elo. Rook endings are very common (> 10% ?), and not being able to win any of those should at least cost you a few percent in score, and each % corresponds to 7 Elo. So a 20-50 Elo drop would have to be expected.

The test must be flawed somehow...
I think that it is not clear that Crafty cannot win KRK with null move turned on always.

I can imagine the following possibilities:
1) Tablebases are used, so Crafty wins easily.

2) Even without tablebases, the correct moves carry a threat (whether they do depends on the evaluation function), so they are not pruned.

I tried it, without tablebases, and it wins trivially. KP vs K is more of a problem, however.

But the point here is that while you lose in zugzwang-heavy positions sometimes (that is, you draw rather than win), you do gain elsewhere. It appears from testing so far that the gains and losses are almost perfectly equivalent. Surprised me. So I reported the results. Very much like the discussion where, if you play two games and let each program search one more node on each move in the second game, the games will not be identical. Hard to believe. Many scoffed. Until more than one brave soul tried it and said "I'll be damned..."

I was not expecting this. I had a more restrictive limit on nulls in Crafty versions through 23.2. I posted the code from current 23.3 above (a one-line change) so you can see that I relaxed the null-move restriction quite a bit. And it was a few Elo better. I then decided that "since the curve is headed up, why not eliminate that restriction completely and see what happens?" The answer was, unfortunately, "no change". But it suggests that there might be a better way to deal with this stuff.

I have tested verification several times. Always slightly worse. But I have not tested it with this approach, which might produce something. More when that test is done.

bob · Post by **bob** » Fri Mar 26, 2010 8:42 pm

hgm wrote:
Uri Blass wrote: I think that it is not clear that Crafty cannot win KRK with null move turned on always.

I can imagine the following possibilities:
1) Tablebases are used, so Crafty wins easily.
Well, that is what I define as 'flawed'. If you want to study the effect of not switching off null move, and then bias the sampling to exclude most cases where you would have to switch off null move, by looking them up in a tablebase rather than searching them with null move on, the test was meaningless from the beginning.

The same would hold if the opponents would resign in KRK, not knowing that Crafty isn't able to win that anymore.

2) Even without tablebases, the correct moves carry a threat (whether they do depends on the evaluation function), so they are not pruned.
Unless white finds a quick mate by sheer accident in one of the first lines it searches, it will never be able to improve on these lines. It is almost always possible for black to do better (i.e. not being driven towards the edge/corner so fast) by injecting some null moves in zugzwang positions. So no matter how hard white tries, he will never be able to find an improvement on a line he already has. Even when he tries the correct moves, black will refute them by cheating.
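The "refute them by cheating" argument can be reproduced in a toy setting. The sketch below is not chess and says nothing about how Crafty behaves: it assumes a subtraction game (remove 1 or 2 counters from a pile; the player who cannot move loses), in which every pile that is a multiple of 3 is a zugzwang-like loss for the side to move. The search is a plain fail-hard negamax with an optional null move; `search`, `WIN`, `R` and the parameter names are all hypothetical.

```c
#define WIN 1000  /* score of a won game */
#define R   2     /* null-move depth reduction (assumed value) */

/* Fail-hard negamax on the toy subtraction game.
 * n        : counters left; the side to move must take 1 or 2
 * can_null : 0 directly after a null move (no two passes in a row)
 * use_null : 1 enables null-move pruning everywhere, 0 disables it */
static int search(int n, int depth, int alpha, int beta,
                  int can_null, int use_null)
{
    if (n == 0)
        return -WIN;            /* no legal move: side to move has lost */
    if (depth <= 0)
        return 0;               /* horizon reached: score unknown */

    /* Null move: pass, search shallower with a null window, and trust
     * a fail-high. This is exactly what goes wrong in zugzwang. */
    if (use_null && can_null && depth > R) {
        int s = -search(n, depth - 1 - R, -beta, -beta + 1, 0, use_null);
        if (s >= beta)
            return beta;
    }

    for (int take = 1; take <= 2 && take <= n; take++) {
        int s = -search(n - take, depth - 1, -beta, -alpha, 1, use_null);
        if (s > alpha)
            alpha = s;
        if (alpha >= beta)
            return beta;
    }
    return alpha;
}
```

Asking "can the side to move win?" with the window (0, 1) and depth 20, a pile of 3 gives 0 without null move and 1 with it, so the player in zugzwang believes it is winning. A pile of 4 searched without a null move at the root gives 1 without null move but 0 with it: the winning move to a pile of 3 is "refuted" because the opponent passes there, which is the mechanism described in the quote.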

In my cluster testing, as always, nobody resigns, and nobody uses endgame tables except for the programs that insist on KPK (and Crafty does not use even that).

jwes · Post by **jwes** » Fri Mar 26, 2010 9:28 pm

bob wrote:
hgm wrote: Let me get this straight:

You use plain null move, no verification? So if you end up in KRK, it is now a draw. And if you end up in things like KRPKR it becomes almost impossible to win, as the opponent would simply sac his Rook for the remaining Pawn, and you would not know how to checkmate him with a Rook up.
Why would that be? It finds KR vs K checkmates trivially. I just tried it in a simple position with kings and rook in the center of the board. The evaluation knows to drive the losing king to the edge of the board, where it finds mate quickly enough...

Since the standard method of checkmating with KR v K involves repeated zugzwangs, it is reasonable (but apparently wrong) to assume that a search with null moves would not find these mates.

bob wrote:

Most won Pawn endings would also be bungled; KPK is no longer a win if the passer is not out of reach.
You are focusing on positions, I am focusing on entire games. All I can say at the moment is that null-everywhere is no worse than restricting it to only being used when the side on move has a piece or more. I can run the match to 1/4 million games to get a more accurate Elo measurement, which I will start right now. But it certainly is within a couple of Elo either way.

Seems to me that this should cost you quite a bit more than 2 Elo. Rook endings are very common (> 10% ?), and not being able to win any of those should at least cost you a few percent in score, and each % corresponds to 7 Elo. So a 20-50 Elo drop would have to be expected.

The test must be flawed somehow...
The rook issue is not an issue apparently. Anyone can take current Crafty source, go to search.c, remove that single restriction on when null-move search is done (here is the code before/after):

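    /* before: null move only when the side on move still has a piece */
    if (do_null && alpha == beta - 1 && depth > 1 && !tree->inchk[ply] &&
        TotalPieces(wtm, occupied)) {

    /* after: material restriction removed */
    if (do_null && alpha == beta - 1 && depth > 1 && !tree->inchk[ply]) {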

Pretty obvious what was removed. Nothing else was changed. Same positions, same hardware, etc. I ran the above test twice and got within 2 Elo the second time around (second time was 1 Elo better than the first run, not significant)... Can't say much more other than this appears to be another episode of "myth-busters"...

It certainly could be that gains in the middlegame balance losses in the endgame. It would be interesting to run a test starting with endgame positions to see if the results are different.
