Smooth scaling stockfish


jdart
Posts: 4366
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: Smooth scaling stockfish

Post by jdart »

As far as I can tell, the standard (non-smooth-scaling) Stockfish already did "adaptive null pruning" as described by Heinz, i.e. variable reduction based on depth, as also done in RomiChess, OliThink and other engines. The smooth-scaling change adds an eval-dependent factor and fractional depth reduction; those are the new parts. Certainly worth a try. Personally I've tried R=2.5 (fixed), R=3 (fixed) and adaptive R=2/R=3 so far, and presently use the third of these.
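To make the contrast concrete, here is a minimal side-by-side sketch (the smooth-scaling coefficients are the ones posted in this thread; the adaptive threshold and the reading of delta as the quick eval's margin over beta are my assumptions, not anyone's actual engine code):

Code:

#include <algorithm>
#include <cmath>
#include <cstdio>

// Classic adaptive null-move reduction (Heinz): R depends only on depth.
// The depth threshold is an assumption, for illustration.
int adaptive_R(int depth) {
    return depth > 6 ? 3 : 2;
}

// Smooth scaling: fractional reduction that also grows with delta,
// assumed here to be the quick eval's margin over beta (clamped so
// that log() stays well defined).
double smooth_R(double depth, double delta) {
    return 0.18 * depth + 3.1 + std::log(std::max(delta, 1.0)) / 5.0;
}

int main() {
    const int depths[] = {7, 20};
    for (int d : depths)
        std::printf("depth %2d: adaptive R = %d, smooth r = %.2f (delta = 100)\n",
                    d, adaptive_R(d), smooth_R(d, 100.0));
}

The interesting difference is that the smooth version keeps growing with depth and eval margin instead of capping at R=3.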

--Jon
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Smooth scaling stockfish

Post by Milos »

I want to congratulate Dann. It's not 150 Elo of course, but the idea is brilliant. I just implemented it in Robbo, and it looks like an additional 5% so far.
Great job!
guillef
Posts: 12
Joined: Wed Dec 30, 2009 4:52 pm

Re: Smooth scaling stockfish

Post by guillef »

Hi!

I'm testing Dann Corbit's idea in the Chronos null-move code.

With a small and fast tourney I get the following results so far:

1: Chronos release 31.0/52
2: Chronos 1.9.7 opt 21.0/52

Time control was 1 minute per game, under Arena.

It seems this improvement should be tested seriously!
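For reference, converting that score to an Elo estimate with the standard logistic model (just a sketch; 52 games means wide error bars):

Code:

#include <cmath>
#include <cstdio>

int main() {
    // Chronos release scored 31.0 out of 52 against Chronos 1.9.7 opt.
    double score = 31.0 / 52.0;
    // Standard logistic model: elo = -400 * log10(1/score - 1).
    double elo = -400.0 * std::log10(1.0 / score - 1.0);
    std::printf("score %.1f%% => about %+.0f Elo\n", 100.0 * score, elo);
}

That comes out to roughly +68 Elo at face value, well short of +150 but still a substantial jump if it holds up over more games.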

Great job!
/gf
User avatar
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Smooth scaling stockfish

Post by Don »

bob wrote:
Uri Blass wrote:
zamar wrote: +150 Elo???

If we can get even close to this, I think this is really great news for the whole chess community!!

Code:

		double r = 0.18 * ddepth + 3.1 + log(delta)/5.0;
Balancing this equation is perhaps the next challenge? I need to think about this...
I think it may also be interesting to balance this equation by replacing approximateEval = quick_evaluate(pos) with something closer to the real evaluation, or with the real evaluation itself.

Even if evaluate(pos) is too expensive to calculate everywhere, it is not expensive to track the average difference between quick_evaluate() and evaluate() in the cases where evaluate() is calculated, and later use that average for a better approximation.

Uri
I find this _really_ hard to swallow myself. If I turn null-move completely off, it is an 80 Elo reduction over 40,000 games. I find it difficult to imagine how any scaling is going to make the idea 2x _more_ efficient.

Unfortunately, I am still waiting on the A/C repairs to happen and can't test this, but once the cluster comes up I can discover exactly what effect this has on Stockfish since I already use it in my cluster testing.
There is no chance this is even a 20 ELO improvement at reasonable time controls. I would like to know how many games were run to claim 150 ELO.
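As a minimal sketch of the averaging idea quoted above from Uri (the type and names are hypothetical, not Stockfish code): whenever the full evaluation is computed anyway, record its gap from the quick one, and add the running average back elsewhere.

Code:

// Hypothetical helper: track the mean gap between evaluate() and
// quick_evaluate(), and use it to correct approximateEval.
struct EvalCorrection {
    double sum = 0.0;
    long long n = 0;

    // Call wherever evaluate(pos) is computed anyway.
    void record(int fullEval, int quickEval) {
        sum += fullEval - quickEval;
        ++n;
    }

    // Call where only quick_evaluate(pos) is affordable.
    int corrected(int quickEval) const {
        return quickEval + static_cast<int>(n ? sum / n : 0.0);
    }
};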
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Smooth scaling stockfish

Post by mcostalba »

Don wrote: There is no chance this is even a 20 ELO improvement at reasonable time controls.
This is very possible, although I am asking myself what a null search at very high depths really means.

I mean, if I am at depth 7 and do a null search at depth 3, I can see the point; but if I am at depth 20 and do a null search at depth 16, what does it mean?

How can a missed tempo have an influence 16 plies later? From what I have understood of null move, a null search is something that can tell you whether a position is under threat, but a threat 5-10 plies away.

I don't understand how a null move can tell you a position is under threat in a 20-ply search.

At such big depths a null search becomes a kind of LMR...
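For concreteness: taking delta = 100 (an arbitrary value for illustration), the posted formula gives r = 0.18*20 + 3.1 + ln(100)/5 ≈ 7.6 at depth 20, so the null search actually runs at about depth 12, and the reduction keeps growing with depth instead of stopping at R=3 or 4.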
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: Smooth scaling stockfish

Post by Michael Sherwin »

mcostalba wrote:
[...] At such big depths a null search becomes a kind of LMR...
My thinking is this:

There are deep tactics, including mating attacks, that are not limited in any way by depth. A deeper null-move search can find deep tactics that a too-shallow one will not. An R in the range of 2 to 3 makes for a 'really fast look-see' that does not sacrifice tactical depth, as the plies gained by the search make up for it. When R grows beyond 3, the additional time saved matters less and the missed tactics matter more, since it becomes harder to squeeze extra plies out of the search.

If there is value in Dann's idea, it is most likely in the delta term. A position with a high delta will simply have more moves, on average, that defend it and fewer that bust it, making the reduction safer.
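To put a number on that delta term (assuming delta is measured in centipawns, which is my assumption about the patch):

Code:

#include <cmath>
#include <cstdio>

int main() {
    // Contribution of the eval-dependent term log(delta)/5 in the
    // posted formula, for a range of margins (units assumed centipawns).
    const double deltas[] = {10.0, 100.0, 1000.0};
    for (double delta : deltas)
        std::printf("delta %6.0f: log(delta)/5 = %.2f plies\n",
                    delta, std::log(delta) / 5.0);
}

So the term is gentle: going from a 10 to a 1000 centipawn margin adds less than one extra ply of reduction, which fits the idea that the extra reduction has to stay safe.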
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Smooth scaling stockfish

Post by Uri Blass »

Don wrote:
There is no chance this is even a 20 ELO improvement at reasonable time controls. I would like to know how many games were run to claim 150 ELO.
Dann Corbit already replied: 30 games against the previous version.

http://talkchess.com/forum/viewtopic.ph ... ht=#313161

Dann Corbit only said that he got 150 Elo in his testing, not that it is 150 Elo stronger.

He just did not write originally how many games he used in his testing, and it seems that this strategy was very effective at convincing other people to test it.

Uri
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: Smooth scaling stockfish

Post by Uri Blass »

mcostalba wrote:
[...] if I am at depth 20 and do a null search at depth 16, what does it mean?
Considering that null move is recursive, and that you also have recursive late move reductions, depth 16 does not mean 16 plies later. There are also threats that need many plies to see: for example, you may sacrifice a rook in the endgame for some passed pawns, where many plies are needed to see that they win against the rook.
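A toy calculation of the recursion (a fixed R = 3 is an assumption for illustration; real engines vary R and mix in other reductions):

Code:

#include <cstdio>

int main() {
    const int R = 3;  // assumed fixed reduction, for illustration only
    // Each nested null move costs one ply plus R of reduction, so a
    // nominal depth-16 null search can bottom out after only a few
    // real plies along a line of back-to-back null moves:
    for (int depth = 16; depth > 0; depth -= 1 + R)
        std::printf("null search at nominal depth %d\n", depth);
}

Four nested null searches already exhaust the nominal depth 16, so the horizon of such a search is nowhere near 16 real plies.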

Uri
pkappler
Posts: 38
Joined: Thu Mar 09, 2006 2:19 am

Re: Smooth scaling stockfish

Post by pkappler »

Don wrote:
There is no chance this is even a 20 ELO improvement at reasonable time controls. I would like to know how many games were run to claim 150 ELO.
I'm amazed more people haven't been asking this question. Somewhere in this thread, Dann commented that his +150 ELO estimate was based on a grand total of 30 games. :shock:
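For anyone who wants the arithmetic: a sketch of the 95% margin on a 30-game match score (binomial worst case; draws would shrink the variance somewhat, which this ignores):

Code:

#include <cmath>
#include <cstdio>

int main() {
    const int games = 30;
    // Worst-case per-game standard deviation is 0.5 (win/loss only),
    // so the standard error of the match score fraction is:
    double se = 0.5 / std::sqrt(static_cast<double>(games));
    double margin = 1.96 * se;  // 95% confidence
    std::printf("30 games: score pinned down only to +/- %.0f percentage points\n",
                100.0 * margin);
}

A +150 Elo edge corresponds to roughly a 70% score, and +/- 18 percentage points around that covers everything from nearly even to crushing superiority.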

-Peter
pkappler
Posts: 38
Joined: Thu Mar 09, 2006 2:19 am

Re: Smooth scaling stockfish

Post by pkappler »

Uri Blass wrote:
Dann Corbit already replied: 30 games against the previous version.

http://talkchess.com/forum/viewtopic.ph ... ht=#313161

Dann Corbit only said that he got 150 Elo in his testing, not that it is 150 Elo stronger.
Dann wrote this:
"I get about +150 Elo so far in my testing."

He clearly means a 150 ELO improvement.
He just did not write originally how many games he used in his testing, and it seems that this strategy was very effective at convincing other people to test it.

Uri
Indeed.

-Peter