One of the greatest disappointments in this field is that it simply is what it is. For the sake of completeness I even ran an executable 1:1 against itself once, and the most bizarre results popped up.

bob wrote:
Now down to "dead even" with last version, but after only 6000 games. That's usually the way it goes...

JVMerlino wrote:
Good to know that it isn't only my program that regularly shows something like 50+ Elo through the first 20% of the testing, only to end up with no improvement at all.
jm

bob wrote:
I don't know whether it is an artifact introduced by BayesElo, but I generally see an Elo that is too high, and then it drops back down to the previous level (or occasionally lower). I don't know why I don't see just as many runs where the Elo starts off too low and climbs. I have even randomized my starting positions a couple of times, just to make sure that favorable positions are not tested first to inflate the Elo, or that bad ones are done last... No explanation, just an observation.

Don wrote:
I don't think it's really happening; it's just that fast starts are much more memorable. We have a different standard for KEEPING a change versus stopping a test when it looks bad, because we don't mind taking a chance on throwing out a change that looks bad but might be good, but we really don't want to keep a change that looks good but might be bad! So if a test starts out badly and we have reasonable belief that it's bad, we stop it before it's complete. If it starts out well, we run it to completion regardless before accepting it.

bob wrote:
I never stop a test early unless the Elo dropped out the bottom due to a programming error, as mentioned previously. You might be right, but I have a utility that continuously displays the Elo as games complete, and my perception is that things often start off high and settle down, sort of like a one-tailed normal curve, rather than just as often starting low and climbing up to the expected value.
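[Editorial aside: to put rough numbers on how far an early Elo estimate can wander, the usual logistic model maps an observed match score s to an Elo difference of -400*log10(1/s - 1), and the standard error of s shrinks only as 1/sqrt(games). The small C program below is purely illustrative; the 52% score and 50% draw ratio are made-up values, not anyone's actual test data.]

Code:

#include <math.h>
#include <stdio.h>

/* Elo difference implied by a match score s (fraction of points won). */
static double elo_from_score(double s)
{
    return -400.0 * log10(1.0 / s - 1.0);
}

int main(void)
{
    const double s     = 0.52;   /* observed score of the new version (assumed)    */
    const double draws = 0.50;   /* assumed draw ratio; draws lower the variance    */
    const double var   = s * (1.0 - s) - draws / 4.0;   /* per-game score variance */

    for (int n = 1000; n <= 32000; n *= 2) {
        double se = sqrt(var / (double)n);               /* std. error of the score */
        printf("%6d games: %+6.1f Elo, 95%% band roughly %+6.1f .. %+6.1f\n",
               n, elo_from_score(s),
               elo_from_score(s - 2.0 * se), elo_from_score(s + 2.0 * se));
    }
    return 0;
}

[Even at several thousand games the band is still tens of Elo wide, which is why a +50 start over the first 20% of a run can end up as no measurable change.]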
Don wrote:
If losing capture:
    reduce search using normal schedule, set scout to alpha-MARGIN

zamar wrote:
SF got something like +5 Elo using this idea. The problem is that although the engine is a bit stronger, it's much weaker in tactical puzzles (which often involve a bad capture, or even two of them!).

Don wrote:
But Joona claimed earlier that it didn't work! I would kill for 5 Elo! Are you saying that you would give up 5 Elo for better tactics but a weaker program? It's possible that we have already tried this. I have probably only tried about a million things in the last 3 years and don't remember all of them.
Don

zamar wrote:
Earlier I was referring strictly to bad-capture reducing/pruning, and that didn't work. But reducing aggressively at low depths AND tweaking alpha gave us something (take a look at SF 2.1); the downside is, as I said, that the engine became much weaker in tactics. As H.G. wisely put it, if a capture is futile _enough_ it can be reduced aggressively. But I still view this as a separate thing from LMR, because we apply this technique only close to the leaves, reduce much more aggressively, and tweak alpha around 100cp.
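[Editorial aside: one plausible reading of Don's pseudocode and zamar's description is sketched below. It assumes a negamax search(), a static-exchange evaluator see(), and make/undo helpers exist elsewhere in the engine; the 100cp margin, depth limit, reduction amount, and all names are placeholders for illustration, not Stockfish's or Crafty's actual code.]

Code:

typedef struct Position Position;
typedef unsigned int Move;

/* Assumed to exist elsewhere in the engine (placeholder signatures). */
int  search(Position *pos, int alpha, int beta, int depth);
int  see(const Position *pos, Move m);          /* static exchange evaluation */
int  is_capture(const Position *pos, Move m);
void make_move(Position *pos, Move m);
void undo_move(Position *pos, Move m);

enum {
    LC_MARGIN    = 100,  /* "tweak alpha around 100cp"                     */
    LC_MAX_DEPTH = 4,    /* apply only close to the leaves (made-up limit) */
    LC_REDUCTION = 2     /* more aggressive than the usual LMR step        */
};

/* Search one move; losing captures near the leaves are reduced aggressively
   and scouted against a bound lowered by LC_MARGIN, so that anything coming
   near alpha still triggers a normal full-depth, full-window re-search. */
int search_one_move(Position *pos, Move m, int alpha, int beta, int depth)
{
    int new_depth = depth - 1;
    int scout     = alpha;

    if (depth <= LC_MAX_DEPTH && is_capture(pos, m) && see(pos, m) < 0) {
        new_depth -= LC_REDUCTION;       /* reduce the losing capture...           */
        scout      = alpha - LC_MARGIN;  /* ...but verify against a lowered bound  */
    }

    make_move(pos, m);
    /* zero-width scout search at the (possibly lowered) bound */
    int score = -search(pos, -(scout + 1), -scout, new_depth > 0 ? new_depth : 0);

    /* if the reduced search did not fail low against the lowered bound,
       re-search at normal depth with the full window */
    if (score > scout)
        score = -search(pos, -beta, -alpha, depth - 1);
    undo_move(pos, m);

    return score;
}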
bob wrote:
Ok, reducing captures by 1 vs 2 made no significant change. (This is captures where SEE < 0; captures with SEE >= 0 are never reduced.)
However, at 63, I am not going to bet that my perception is always "right". But I can say it is strange that, when playing so many games, the results start off in never-never land and then come back.
In any case, the first test completed with no Elo change that can be measured with just 30K games. I've got a second run going just to see whether the two disagree by much...
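[Editorial aside: restated schematically, the experiment toggles a single reduction knob; BAD_CAPTURE_REDUCTION and the helper name below are placeholders for illustration, not Crafty's actual code.]

Code:

/* Schematic restatement of the experiment above; not Crafty's implementation. */
#define BAD_CAPTURE_REDUCTION 1   /* the A/B test compared 1 vs 2 plies */

int capture_reduction(int see_value)
{
    if (see_value >= 0)
        return 0;                     /* captures with SEE >= 0 are never reduced */
    return BAD_CAPTURE_REDUCTION;     /* losing captures get the fixed reduction  */
}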
JVMerlino wrote:
You mentioned that you are not reducing these moves "near the tips". What is your minimum depth for this reduction, and did you change it for this test?
jm
bob wrote:
I didn't change anything except that the reduction for losing captures is just one ply rather than two. Here are the ideas I am currently using...
Code:

if (moveIsTactical(move))
    /* tactical moves: reduce only when the normal LMR reduction would be at
       least 3 plies, and then by two plies less than the table value */
    newdepth -= (LMRTable[depth][movesplayed] >= 3) ? (LMRTable[depth][movesplayed] - 2) : 0;
else
    /* all other moves: the full LMR reduction from the table */
    newdepth -= LMRTable[depth][movesplayed];
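[Editorial aside: the post does not show how LMRTable[depth][movesplayed] is filled in. A common way to build such a depth/move-count indexed table is a log-based formula; the sketch below is only a generic illustration of that idea with an arbitrary scaling constant, not Crafty's actual table.]

Code:

#include <math.h>
#include <stdio.h>

#define MAX_DEPTH 64
#define MAX_MOVES 64

static int LMRTable[MAX_DEPTH][MAX_MOVES];   /* [remaining depth][moves searched] */

static void init_lmr_table(void)
{
    for (int depth = 1; depth < MAX_DEPTH; depth++)
        for (int moves = 1; moves < MAX_MOVES; moves++)
            /* reduction grows slowly with both remaining depth and move number;
               2.25 is an arbitrary scaling constant chosen for illustration */
            LMRTable[depth][moves] = (int)(0.5 + log(depth) * log(moves) / 2.25);
    /* row 0 and column 0 stay zero: no reduction at depth 0 or for the first move */
}

int main(void)
{
    init_lmr_table();
    for (int depth = 2; depth <= 16; depth *= 2)
        printf("depth %2d, move 10 -> reduce by %d plies\n", depth, LMRTable[depth][10]);
    return 0;
}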