Interesting null-move test

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Interesting null-move test

Post by bob »

I recently tweaked around a bit with null-move in Crafty. Been on my to-do list for a while. Version 23.2R01 is the best so far and does null-move everywhere except when the side on move has no pieces at all. Even with a single knight, it is currently doing null-move.
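The rule described above can be sketched as a small gating predicate. This is only an illustration, not Crafty's actual code: the `Position` summary, the field names, and the `null_everywhere` flag are hypothetical, and "pieces" is taken in the usual engine sense of non-pawn, non-king material. The in-check test is the standard restriction, since passing while in check is not legal.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """Hypothetical position summary; not Crafty's data structure."""
    stm_pieces: int   # non-pawn, non-king pieces for the side to move
    in_check: bool    # True if the side to move is in check


def null_move_allowed(pos: Position, null_everywhere: bool = False) -> bool:
    """Decide whether to try a null move at this node.

    null_everywhere=False: try null move whenever the side to move
    has at least one piece (even a lone knight).
    null_everywhere=True: drop the material restriction entirely.
    """
    if pos.in_check:
        # Passing while in check would leave the king en prise.
        return False
    if null_everywhere:
        return True
    return pos.stm_pieces >= 1


if __name__ == "__main__":
    lone_knight = Position(stm_pieces=1, in_check=False)
    pawn_ending = Position(stm_pieces=0, in_check=False)
    print(null_move_allowed(lone_knight))                         # piece present
    print(null_move_allowed(pawn_ending))                         # pawns only
    print(null_move_allowed(pawn_ending, null_everywhere=True))   # restriction lifted
```

The only difference between the two policies is the last test, so any measured Elo difference comes from nodes where the side to move has pawns and king only.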

Just for fun, I decided to try null-move _everywhere_, pieces or not. So far, this is the result (23.2R05 is the null-everywhere version, R01 is the null whenever stm has a piece, others are a couple of other tests that were no real change).

Name Elo + - games score oppo. draws
Crafty-23.2R05 2646 17 17 1251 63% 2543 20%
Crafty-23.2R04-3 2633 4 4 30000 61% 2550 22%
Crafty-23.2R04-5 2633 4 4 30000 61% 2550 23%
Crafty-23.2R01-1 2632 4 4 30000 61% 2550 23%

Hard to imagine this actually being better, but after 1251 games it looks to be, although the error bar is large...

current results:

Crafty-23.2R05 2637 12 12 2510 61% 2550 22%
Crafty-23.2R04-3 2634 3 3 30000 61% 2551 22%
Crafty-23.2R04-5 2633 3 3 30000 61% 2551 23%
Crafty-23.2R01-1 2633 3 3 30000 61% 2551 23%
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Interesting null-move test

Post by bob »

Crafty-23.2R04-3 2634 3 3 30000 61% 2551 22%
Crafty-23.2R04-5 2633 3 3 30000 61% 2551 23%
Crafty-23.2R01-1 2633 3 3 30000 61% 2551 23%
Crafty-23.2R05 2633 9 9 3871 61% 2550 22%
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Interesting null-move test

Post by bob »

Crafty-23.2R05 2635 7 7 6657 61% 2551 22%
Crafty-23.2R04-3 2634 3 3 30000 61% 2551 22%
Crafty-23.2R04-5 2633 3 3 30000 61% 2551 23%
Crafty-23.2R01-1 2633 3 3 30000 61% 2551 23%
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Interesting null-move test

Post by bob »

Crafty-23.2R04-3 2634 3 3 30000 61% 2551 22%
Crafty-23.2R04-5 2633 3 3 30000 61% 2551 23%
Crafty-23.2R01-1 2633 3 3 30000 61% 2551 23%
Crafty-23.2R05 2632 6 6 9922 61% 2551 22%
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Interesting null-move test

Post by bob »

Crafty-23.2R04-3 2634 3 3 30000 61% 2551 22%
Crafty-23.2R05 2634 4 4 20652 61% 2551 23%
Crafty-23.2R04-5 2633 3 3 30000 61% 2551 23%
Crafty-23.2R01-1 2633 3 3 30000 61% 2551 23%
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Interesting null-move test

Post by jwes »

Are you going to try verified or double null move when there are very few pieces on the board?
Also, how often do positions with 0 or 1 pieces occur in your testing?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Interesting null-move test

Post by bob »

I have tried verified null-move, although not with this "do it anywhere" option. Results with verified null-move are, and always have been, slightly worse than without it, so I have not used it in any released version. Nor have I tried double-null.

As far as "how many times?" goes, I have no idea. I'm only interested in the results, and so far it appears to be immaterial whether I restrict null-move to positions with at least one piece or do it no matter what. I have seen lots of pawn endings in looking at specific games during testing, but have absolutely no idea how frequent they are. They are probably about as rare in my testing as they are in real tournament games between computers, which is to say fairly uncommon. Or at least the ones where zugzwang is an issue seem to be uncommon, based on the nearly identical test results between the two versions.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: Interesting null-move test

Post by Daniel Shawul »

bob wrote: As far as "how many times?" I have no idea. I'm only interested in the results, and so far it appears to be immaterial as to whether I restrict null to positions with at least one piece or more, or do them no matter what.
Well, R05's result was on the decline and it will probably end up weakest, no matter how small the margin is. Also, since I know pawn endings (KN* and KB* also) are indeed screwed up by null move, I would gladly still use that condition. Even if it ended up 2 or 3 Elo better I would avoid R05, because it really burns me when I lose games in CCT due to bad null moving in the endgame :)
I myself have tried a few similar subtle changes in the hope of finding some hidden Elo. However, the reality so far is that I have to make big changes to see any improvement at that magnitude of games, like playing with the R value, which indeed got me measurable Elo.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Interesting null-move test

Post by bob »

Wrong way to think about it. If you play 30,000 games and it plays _better_, why on earth would you toss an idea out just because there is an occasional game it might lose due to the change? Do you want to win just a few specific games, or win the largest number possible???

That's one reason testing is critical, and then using the results is even more critical. :)

Name Elo + - games score oppo. draws
Crafty-23.2R04-3 2632 3 3 30000 61% 2549 22%
Crafty-23.2R04-5 2632 3 3 30000 61% 2549 23%
Crafty-23.2R01-1 2631 3 3 30000 61% 2549 23%
Crafty-23.2R04-0 2631 3 3 30000 61% 2549 23%
Crafty-23.2R05-1 2630 3 3 30000 60% 2549 23%

So it ended up two Elo down, with an error bar of +/-3, which says that the old idea of not using null in pawn-only endgames is not nearly as important as we have always believed when using test positions to evaluate such changes...

In real games, and a _lot_ of 'em, the effect is almost too small to measure. I need to run this test with about 1/4 million games to get down to the +/- 1 Elo error range.
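The "about 1/4 million games" figure can be checked with a rough scaling rule. The sketch below assumes the error bar shrinks in proportion to 1/sqrt(games), which is a common approximation and not the rating tool's exact computation; the function names are made up for this example.

```python
import math


def games_needed(games_now: int, err_now: float, err_target: float) -> int:
    """Games required to reach err_target, assuming error ~ 1/sqrt(games)."""
    return math.ceil(games_now * (err_now / err_target) ** 2)


def projected_error(games_now: int, err_now: float, games_target: int) -> float:
    """Error bar expected at games_target under the same assumption."""
    return err_now * math.sqrt(games_now / games_target)


if __name__ == "__main__":
    # +/-3 Elo at 30000 games -> games for +/-1 Elo
    print(games_needed(30000, 3, 1))  # 270000
    # Back-project the error bar to the early 1251-game sample
    print(round(projected_error(30000, 3, 1251), 1))
```

Shrinking the error bar by a factor of three costs nine times as many games, so 30000 games at +/-3 becomes 270000 games at +/-1, in line with the estimate above.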
Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 6:44 pm
Location: Bulgaria

Re: Interesting null-move test

Post by Mincho Georgiev »

I've always wanted to do such a test, but never did for lack of hardware resources. Thank you for doing it. It is probably worth reconsidering conditions like this one, and some others, that we have always believed in blindly.