I might be missing something, but isn't depth in the program really fractional plies (so one regular ply is 2 half plies)? So you would get depth=10 when there are 5 plies left in the "full width" part of the search. Also, you multiply your calculated reduction by 2, forcing increments of one full ply. I would assume that if this idea works, using half-ply values would be even better.
Dann Corbit wrote:Consider the following:
#ifdef SMOOTH_REDUCTION
double delta = approximateEval - beta;
delta = max(delta, 1.0);
double ddepth = double(depth);
double r = 0.18 * ddepth + 3.1 + log(delta)/5.0;
r = r > ddepth ? ddepth : r;
int R = int(r);
#else
// Null move dynamic reduction based on depth
int R = (depth >= 5 * OnePly ? 4 : 3);
// Null move dynamic reduction based on value
if (approximateEval - beta > PawnValueMidgame)
R++;
#endif
If depth = 5, then 0.18 * 5 = 0.9
So 0.9 + 3.1 gives 4.0 which is the same as the original formula.
In value.h, we find this:
const Value PawnValueMidgame = Value(0x0C6);
In decimal, C6 is 198.
log (base e) of 198 is 5.288
dividing 5.288 by 5 gives 1 (approximately).
So, for the points at the boundary conditions of the old formula, my curve gives the same answers (or answers pretty nearly). When you fit this idea to your code, I suggest that you calibrate the curve accordingly, especially if you have carefully weighed out your null move reduction constants by experiment.
The difference is that when we bend away from these answers the solution I propose slowly changes so that instead of slamming the door, we are slowly closing it.
The reason I came up with smooth scaling is that null move's abrupt nature really bothered me. Why do we reduce the same in a 30 ply search as we do in a 12 ply search? That does not make sense. Why do we reduce the same with a full queen advantage as with a knight advantage? That also seemed very strange. So I just did a simple extrapolation to smooth out the reductions in order that they made sense to me.
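To make the "slowly closing the door" point concrete, here is a small standalone sketch (not part of the patch) that tabulates the smooth reduction next to the original stepped one for a few depths and eval margins. Depth is taken in whole plies here, matching the arithmetic above, and PawnValueMidgame is assumed to be 198 centipawns as quoted from value.h.

#include <algorithm>
#include <cmath>
#include <cstdio>

// Standalone illustration only: compare the smooth reduction with the original
// stepped one. Depth is in whole plies (as in the arithmetic above) and the
// margin (approximateEval - beta) is in centipawns.
int smooth_R(int depth, int margin)
{
    double delta = std::max(double(margin), 1.0);
    double r = 0.18 * depth + 3.1 + std::log(delta) / 5.0;
    return int(std::min(r, double(depth)));
}

int stepped_R(int depth, int margin)
{
    int R = (depth >= 5 ? 4 : 3);   // original depth-based step
    if (margin > 198)               // PawnValueMidgame
        R++;
    return R;
}

int main()
{
    for (int depth : {4, 5, 8, 12, 20, 30})
        for (int margin : {0, 100, 198, 400, 900})
            std::printf("depth=%2d margin=%3d  smooth R=%d  stepped R=%d\n",
                        depth, margin, smooth_R(depth, margin), stepped_R(depth, margin));
    return 0;
}

Near the calibration points worked out above the two formulas agree; away from them the smooth value keeps growing with depth and margin instead of stopping at 4 or 5.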
Smooth scaling -- an explanation
-
mjlef
- Posts: 1494
- Joined: Thu Mar 30, 2006 2:08 pm
Re: Smooth scaling -- an explanation
-
Dann Corbit
- Posts: 12482
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Smooth scaling -- an explanation
I tried half plies. Some people get better results, some worse.
mjlef wrote:I might be missing something, but isn't depth in the program really fractional plies (so one regular ply is 2 half plies)? So you would get depth=10 when there are 5 plies left in the "full width" part of the search. Also, you multiply your calculated reduction by 2, forcing increments of one full ply. I would assume that if this idea works, using half-ply values would be even better.
Dann Corbit wrote:Consider the following:
#ifdef SMOOTH_REDUCTION
double delta = approximateEval - beta;
delta = max(delta, 1.0);
double ddepth = double(depth);
double r = 0.18 * ddepth + 3.1 + log(delta)/5.0;
r = r > ddepth ? ddepth : r;
int R = int(r);
#else
// Null move dynamic reduction based on depth
int R = (depth >= 5 * OnePly ? 4 : 3);
// Null move dynamic reduction based on value
if (approximateEval - beta > PawnValueMidgame)
R++;
#endif
If depth = 5, then 0.18 * 5 = 0.9
So 0.9 + 3.1 gives 4.0 which is the same as the original formula.
In value.h, we find this:
const Value PawnValueMidgame = Value(0x0C6);
In decimal, C6 is 198.
log (base e) of 198 is 5.288
dividing 5.288 by 5 gives 1 (approximately).
So, for the points at the boundary conditions of the old formula, my curve gives the same answers (or answers pretty nearly). When you fit this idea to your code, I suggest that you calibrate the curve accordingly, especially if you have carefully weighed out your null move reduction constants by experiment.
The difference is that when we bend away from these answers the solution I propose slowly changes so that instead of slamming the door, we are slowly closing it.
The reason I came up with smooth scaling is that null move's abrupt nature really bothered me. Why do we reduce the same in a 30 ply search as we do in a 12 ply search? That does not make sense. Why do we reduce the same with a full queen advantage as with a knight advantage? That also seemed very strange. So I just did a simple extrapolation to smooth out the reductions in order that they made sense to me.
I got slightly worse results, but I did not run enough games to really know whether it was better or worse. The version I called "Smoother Scaling Stockfish" (as opposed to "Smooth Scaling") used half plies.
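For reference, a minimal sketch of the half-ply variant being discussed, assuming Stockfish-style fractional depth where OnePly equals 2 half-ply units; the rounding and the cap are my own illustration, not code from the "Smoother Scaling" version.

#include <algorithm>
#include <cmath>

// Hypothetical half-ply variant (illustration only, not the actual patch).
// depth is in Stockfish's fractional units (OnePly == 2 half plies per ply);
// approximateEval and beta are in centipawns.
int smooth_reduction_halfply(int approximateEval, int beta, int depth, int OnePly = 2)
{
    double delta = std::max(double(approximateEval - beta), 1.0);
    double plies = double(depth) / OnePly;                      // remaining depth in whole plies
    double r     = 0.18 * plies + 3.1 + std::log(delta) / 5.0;  // reduction, measured in plies
    int    R     = int(r * OnePly + 0.5);                       // round to the nearest half ply
    return std::min(R, depth);                                  // never reduce past the remaining depth
}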
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Smooth scaling -- an explanation
There I agree, but we are talking about an algorithm that adjusts "R" in null-move, which is most definitely a speculative pruning idea that gains plies _and_ introduces errors.
michiguel wrote:I am not wrong, and this is why: I am saying exactly the same thing as you say.
bob wrote:That's simply wrong, and here's why.
michiguel wrote:This is for the non-believers and deniers.
Dann Corbit wrote:Consider the following:
#ifdef SMOOTH_REDUCTION
double delta = approximateEval - beta;
delta = max(delta, 1.0);
double ddepth = double(depth);
double r = 0.18 * ddepth + 3.1 + log(delta)/5.0;
r = r > ddepth ? ddepth : r;
int R = int(r);
#else
// Null move dynamic reduction based on depth
int R = (depth >= 5 * OnePly ? 4 : 3);
// Null move dynamic reduction based on value
if (approximateEval - beta > PawnValueMidgame)
R++;
#endif
If depth = 5, then 0.18 * 5 = 0.9
So 0.9 + 3.1 gives 4.0 which is the same as the original formula.
In value.h, we find this:
const Value PawnValueMidgame = Value(0x0C6);
In decimal, C6 is 198.
log (base e) of 198 is 5.288
dividing 5.288 by 5 gives 1 (approximately).
So, for the points at the boundary conditions of the old formula, my curve gives the same answers (or answers pretty nearly). When you fit this idea to your code, I suggest that you calibrate the curve accordingly, especially if you have carefully weighed out your null move reduction constants by experiment.
The difference is that when we bend away from these answers the solution I propose slowly changes so that instead of slamming the door, we are slowly closing it.
The reason I came up with smooth scaling is that null move's abrupt nature really bothered me. Why do we reduce the same in a 30 ply search as we do in a 12 ply search? That does not make sense. Why do we reduce the same with a full queen advantage as with a knight advantage? That also seemed very strange. So I just did a simple extrapolation to smooth out the reductions in order that they made sense to me.
First of all, it needs to be tested, but some have indicated that it should not work (based on intuition and/or theoretical issues). I disagree. It is wrong to say that if null move alone gives you 80 points, tweaking null move cannot give you more than, say, 20 points. It is possible to tweak null move and obtain a bigger boost than the addition of the technique itself... at least in theory...
Let's say that null move allows you to search 3 plies deeper. 3 plies deeper may mean 300 ELO points. Why do you see an increase of only 100 points? Because the search is full of holes. So, in this case, there are 200 points to be gained by avoiding the holes. It is as important to know WHEN to cut off with null move as it is to know how much to reduce. It is possible that Dann's adjustment may lead to more careful decision-making at the null-move level for certain engines. I could see this being completely dependent on the engine.
However, if you take your comments "in context", where the context is about adjusting the null-move R value, then tuning this has no possible way to get that +300 or whatever. Which was my point. If you tune the R value to gain 3 plies, those are not the type of "plies" that get you 70-100 Elo each... Those are the types of plies that introduce additional pruning errors and therefore return _much_ less than that "300".
Just report your depth as 2 plies deeper. Are you any stronger? Null-move easily increased the typical search depth by more than 3 plies, but gets _nowhere_ near +300. I ran the test and reported these results earlier this year. NM is worth about +120 if you do not also do LMR. If you are doing LMR, NM will only add +80. Even with those extra plies.
Exactly my point. Why do you get 120 and not 300? You should get 300 (ish, let's not fight over accuracy) if you magically reached 3 more plies WITHOUT null move, just by magically increasing your nps or whatever. That means you have 300 for reaching deeper minus 180 for all the holes in the search. We can debate whether the 180 I estimate here is really 180, but you may agree that it is a big number.
Let me rephrase: it is not the classical Heinz implementation.
You are treating all plies as "equal", but they are not when there is selective pruning involved, and with NM that is exactly what is happening.
No ego here. You can look back to when we were working on 23.0 and find posts where I tried a ton of different approaches to the adaptive null-move idea. And this _is_ about adaptive null-move; I am not sure why you think it is not. The idea of a non-constant R value is, by definition, adaptive null-move. How the value is altered is wide open to variations, of course, but it is still the same _idea_.
And BTW, this is not adaptive null move; it has nothing to do with the snippet from Robbert Litto, and who the heck cares whether someone whispered a similar thought about this 5 years ago. Drop your EGO points and test to see whether it gives ELO points.
Miguel
Couldn't agree more. That's exactly why I test _every_ idea I come up with, and most that others suggest as well... My answers are almost always based on test results, not guesswork. If I give an opinion, I make it clear it is an opinion. If I give results, they are what they are.
In experimental science you learn that ideas are meaningless. My former PhD advisor said to me: everybody has plenty, and most likely, yours is not original. Someone else is thinking about it or already did. What matters is the execution and who finally did it. What my PhD advisor was trying to say (he was too polite to say it directly) was, "Get your ass to the bench and start doing the experiment!"
Miguel
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Smooth scaling -- an explanation
Based on what I have seen of his code, it does appear to be backward, as HGM said, which means it _lowers_ the R value in the positions where lowering the R value makes sense, even though the intended result is to actually raise the R value. However, I suspect +150 is simply wrong. But I can't confirm until we get our cluster back up.
Michael Sherwin wrote:If in Stockfish the Elo gain is 150 points, or even half of that, then your formula may be covering up a flaw in Stockfish. I, however, think that an aggressive reduction that is intelligent enough can lead to a big improvement. Adding delta is increasing the intelligence of Null Move.
Dann Corbit wrote:And I agree fully with both of you.
I think that the idea should be investigated heuristically. The proof of the pudding is in the eating.
I think that the constants I chose should be ignored and replaced by sensible values.
I think that the natural logarithm function should be tossed and other distance compression ideas tried. Maybe we should even expand it (just kidding).
At any rate, I think that the right way to explore the idea is to test it out and also to systematically go through different combinations of coefficients and algorithms until a superior one is found or the entire idea is tossed into the trash bin.
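One way to act on the suggestion above to try other coefficients and "distance compression" functions is to parameterize the formula so candidates can be swapped in and tuned; the compressors and constants below are placeholders for experiment, not recommended values.

#include <algorithm>
#include <cmath>
#include <functional>

// Sketch only: the reduction formula with the coefficients and the compression
// function exposed for systematic testing. a, b, c and the compressors are
// illustrative placeholders to be calibrated by experiment.
int smooth_reduction(double depth, double delta,
                     const std::function<double(double)>& compress,
                     double a = 0.18, double b = 3.1, double c = 0.2)
{
    delta = std::max(delta, 1.0);
    double r = a * depth + b + c * compress(delta);
    return int(std::min(r, depth));
}

// Candidate compressors: logarithmic (the posted one), square root, saturating.
inline double log_compress(double d)  { return std::log(d); }
inline double sqrt_compress(double d) { return std::sqrt(d); }
inline double tanh_compress(double d) { return 5.0 * std::tanh(d / 300.0); }

Calling smooth_reduction(depth, delta, log_compress) with the defaults reproduces the posted formula; the other compressors are only there to be tried and measured.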
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Smooth scaling -- an explanation
If I fool with the search stuff, I generally test at 5+5 and 10+10, but only if the idea doesn't hurt at the usual fast time controls I use. If it hurts at blitz, I rarely try longer time controls unless intuition suggests that the tests might reveal something useful.
BubbaTough wrote:Just out of curiosity, what kind of time control do you do your testing of this kind of thing on? That is what has always bothered me with fooling around with these kinds of things. It's not feasible for me to really explore the effect in long games, but the effects of this genre of change might be quite different at high depths than at lower depths. Thus, I have always been a little suspicious of my results.
bob wrote: First, for the last two questions, the answer is because null-move is not doing exactly what you appear to think it is doing. It is a quick way to bail out of positions where you are so far ahead that, if you give your opponent the ability to make two moves in a row, he can't catch up. This idea is unrelated to the actual material balance. Null-move helps when you start off a queen ahead, because it will only cause cutoffs if you get way _more_ than a queen ahead, so it is a relative algorithm that is not dependent on the initial position's material balance. Most of us are not really doing null-move on a 30 ply search, as that is usually a pawn ending where null-move is not so safe unless you resort to the verification search. And testing has shown that idea to not be useful (it loses a couple of Elo, in fact).
As far as deeper reductions (beyond R=3) go, I have not had any success with that, and I tested it an absolute ton before 23.0 was released. I tried linear interpolation, and exponential in both directions (tapers off rapidly or tapers off very slowly). Once I have a chance to test the old/new Stockfish on the cluster, that will quickly show whether this idea is a good one or not. But it certainly isn't +150. And I'd have a hard time imagining it even being possible to reach +50 with this kind of change, since basic null-move doesn't give that big a boost by itself, and this is just an enhancement.
More when we get our A/C up so the cluster can wake up.
-Sam
I have even run some 60+60 tests where the games run at about 3 hours each, or about 256 games every 3 hours. It is slow.
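For illustration, here are a few possible shapes for a depth-dependent R along the lines of the step, linear, and exponential tapers mentioned in the quoted text above; the constants are invented for the sketch, not values taken from Crafty.

#include <algorithm>
#include <cmath>

// Illustrative shapes only; the constants are invented, not Crafty's.
int R_step(int depth)        { return depth >= 5 ? 4 : 3; }                        // classic adaptive null move
int R_linear(int depth)      { return std::min(3 + depth / 6, 6); }                // grows slowly with depth, capped
int R_exponential(int depth) { return 3 + int(3.0 * (1.0 - std::exp(-depth / 10.0))); } // saturates toward 6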
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Smooth scaling -- an explanation
However, for null-move that is backward. If you are way ahead, you do _not_ want to be too aggressive, but if the material is equal, you do not want to be too passive in the R value. This is about the current state vs the best state so far, and that is what null-move is all about. It isn't about doing interesting things in equal positions, it is about dismissing hopeless positions quickly so that you do end up spending more time on the good stuff, almost by accident.
Dann Corbit wrote:I am not using material balance. I am using the previous search (if any), and in those cases where no search has ever been performed, it gets estimated through evaluation. The centipawn evaluations are returned through the hash table, and I only use eval if there is nothing else available.
bob wrote:First, for the last two questions, the answer is because null-move is not doing exactly what you appear to think it is doing. It is a quick way to bail out of positions where you are so far ahead that, if you give your opponent the ability to make two moves in a row, he can't catch up. This idea is unrelated to the actual material balance. Null-move helps when you start off a queen ahead, because it will only cause cutoffs if you get way _more_ than a queen ahead, so it is a relative algorithm that is not dependent on the initial position's material balance. Most of us are not really doing null-move on a 30 ply search, as that is usually a pawn ending where null-move is not so safe unless you resort to the verification search. And testing has shown that idea to not be useful (it loses a couple of Elo, in fact).
Dann Corbit wrote:Consider the following:
#ifdef SMOOTH_REDUCTION
double delta = approximateEval - beta;
delta = max(delta, 1.0);
double ddepth = double(depth);
double r = 0.18 * ddepth + 3.1 + log(delta)/5.0;
r = r > ddepth ? ddepth : r;
int R = int(r);
#else
// Null move dynamic reduction based on depth
int R = (depth >= 5 * OnePly ? 4 : 3);
// Null move dynamic reduction based on value
if (approximateEval - beta > PawnValueMidgame)
R++;
#endif
If depth = 5, then 0.18 * 5 = 0.9
So 0.9 + 3.1 gives 4.0 which is the same as the original formula.
In value.h, we find this:
const Value PawnValueMidgame = Value(0x0C6);
In decimal, C6 is 198.
log (base e) of 198 is 5.288
dividing 5.288 by 5 gives 1 (approximately).
So, for the points at the boundary conditions of the old formula, my curve gives the same answers (or answers pretty nearly). When you fit this idea to your code, I suggest that you calibrate the curve accordingly, especially if you have carefully weighed out your null move reduction constants by experiment.
The difference is that when we bend away from these answers the solution I propose slowly changes so that instead of slamming the door, we are slowly closing it.
The reason I came up with smooth scaling is that null move's abrupt nature really bothered me. Why do we reduce the same in a 30 ply search as we do in a 12 ply search? That does not make sense. Why do we reduce the same with a full queen advantage as with a knight advantage? That also seemed very strange. So I just did a simple extrapolation to smooth out the reductions in order that they made sense to me.
As far as deeper reductions (beyond R=3) go, I have not had any success with that, and I tested it an absolute ton before 23.0 was released. I tried linear interpolation, and exponential in both directions (tapers off rapidly or tapers off very slowly). Once I have a chance to test the old/new Stockfish on the cluster, that will quickly show whether this idea is a good one or not. But it certainly isn't +150. And I'd have a hard time imagining it even being possible to reach +50 with this kind of change, since basic null-move doesn't give that big a boost by itself, and this is just an enhancement.
More when we get our A/C up so the cluster can wake up.
I do not know where the idea will lead. It would not surprise me even if the sign of one or more of my coefficients were wrong. But if that is the case, then we will see even bigger improvements when it gets corrected.
By the way, I think that it is not just huge leads that need less effort but also huge deficits.
It is the positions where things are very close that we have to analyze way out to the fingertips.
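A purely hypothetical way to express the remark just above that huge deficits, like huge leads, may deserve less effort: feed the absolute distance from beta into the compression term, so the reduction is smallest when the position is close and grows as it drifts away in either direction. This is only a sketch of the stated idea, not anything from the posted patch.

#include <algorithm>
#include <cmath>

// Hypothetical symmetric variant (not the posted patch): reduce more the
// further the approximate eval sits from beta, in either direction.
// Units: whole plies for depth, centipawns for the eval terms.
int symmetric_reduction(int approximateEval, int beta, int depth)
{
    double dist = std::max(std::fabs(double(approximateEval - beta)), 1.0);
    double r    = 0.18 * depth + 3.1 + std::log(dist) / 5.0;
    return int(std::min(r, double(depth)));
}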
-
jdart
- Posts: 4361
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Smooth scaling -- an explanation
My initial experiment with this, based on test suites, is not very encouraging:
w/adaptive null R=2/R=3 based on depth:
arasan12 125/205
ecmgcp 168/183
iq4 176/183
wac 299/300
pet 37/48
eet 59/100
with smooth scaling:
arasan12 102/205
ecmgcp 171/183
iq4 176/183
wac 298/300
pet 38/48
eet 55/100
so, it got a little worse. Note that Arasan doesn't really have an "approximate" eval, so I use the regular eval function: but that internally has lazy eval and an eval cache, so you may get an approximation used.
Note also by default Arasan does not omit null move when the static score is significantly below beta, as a number of other programs do.
I'm going to try a variant that does this too, though.
--Jon
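The variant mentioned above, omitting the null move when the static score is significantly below beta, comes down to a simple guard before the null-move search; the margin here is a made-up placeholder, and the thread shows the results of this differ from engine to engine.

// Sketch of the guard jdart mentions trying; the margin is a made-up placeholder.
bool should_try_null_move(int staticEval, int beta)
{
    const int NullMoveMargin = 200;          // centipawns, purely illustrative
    return staticEval >= beta - NullMoveMargin;
}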
-
bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Smooth scaling -- an explanation
Personally I don't "buy in" to the idea of not doing null just because the static score is significantly below beta. Ever seen a position where you are a rook down, and absolutely crushing your opponent still??? I have tried dozens of ways to not do null-move at some nodes, and not a one has ever produced anything stronger, or even as strong as just doing it everywhere, period.
jdart wrote:My initial experiment with this, based on test suites, is not very encouraging:
w/adaptive null R=2/R=3 based on depth:
arasan12 125/205
ecmgcp 168/183
iq4 176/183
wac 299/300
pet 37/48
eet 59/100
with smooth scaling:
arasan12 102/205
ecmgcp 171/183
iq4 176/183
wac 298/300
pet 38/48
eet 55/100
so, it got a little worse. Note that Arasan doesn't really have an "approximate" eval, so I use the regular eval function: but that internally has lazy eval and an eval cache, so you may get an approximation used.
Note also by default Arasan does not omit null move when the static score is significantly below beta, as a number of other programs do.
I'm going to try a variant that does this too, though.
--Jon
YMMV of course, but I beat this to death earlier this year...
-
jdart
- Posts: 4361
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Smooth scaling -- an explanation
I haven't had luck with this either. And my latest test results with this idea + smooth scaling show that it is still worse, at least for me:
bob wrote:Personally I don't "buy in" to the idea of not doing null just because the static score is significantly below beta.
arasan12 98/208
ecmgcp 163/183
iq4 174/183
wac 295/300
pet 37/48
eet 59/100
-
Don
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Smooth scaling -- an explanation
I have tried many different variations of this. Doing null move only when the static score >= beta, doing it always, doing it when it's close.
For my program it turns out that doing it only when the score is above beta works best. I don't really understand why many other programmers find it best to do it always, but it must depend on many different details about your program.
Don
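The three policies listed above can be written as one small predicate so they are easy to A/B test; the 150-centipawn margin for the "close" case is an arbitrary placeholder, and as the thread shows, which policy wins appears to be engine-dependent.

// Sketch of the three null-move admission policies discussed in the thread.
enum class NullPolicy { Always, EvalAtLeastBeta, EvalNearBeta };

bool allow_null_move(NullPolicy policy, int staticEval, int beta, int margin = 150)
{
    switch (policy) {
    case NullPolicy::Always:          return true;                        // try it everywhere
    case NullPolicy::EvalAtLeastBeta: return staticEval >= beta;          // only when already at or above beta
    case NullPolicy::EvalNearBeta:    return staticEval >= beta - margin; // only when reasonably close
    }
    return true;
}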
bob wrote:Personally I don't "buy in" to the idea of not doing null just because the static score is significantly below beta. Ever seen a position where you are a rook down, and absolutely crushing your opponent still??? I have tried dozens of ways to not do null-move at some nodes, and not a one has ever produced anything stronger, or even as strong as just doing it everywhere, period.
jdart wrote:My initial experiment with this, based on test suites, is not very encouraging:
w/adaptive null R=2/R=3 based on depth:
arasan12 125/205
ecmgcp 168/183
iq4 176/183
wac 299/300
pet 37/48
eet 59/100
with smooth scaling:
arasan12 102/205
ecmgcp 171/183
iq4 176/183
wac 298/300
pet 38/48
eet 55/100
so, it got a little worse. Note that Arasan doesn't really have an "approximate" eval, so I use the regular eval function: but that internally has lazy eval and an eval cache, so you may get an approximation used.
Note also by default Arasan does not omit null move when the static score is significantly below beta, as a number of other programs do.
I'm going to try a variant that does this too, though.
--Jon
YMMV of course, but I beat this to death earlier this year...