Reducing/Pruning Bad Captures (SEE < 0)

Edsel Apostol · Post by **Edsel Apostol** » Sat Aug 20, 2011 1:53 am

bhlangonijr wrote:
Edsel Apostol wrote: I've tried reducing/pruning bad captures in the latest Hannibal. I reduce a bad capture just as we do with LMR, and prune it just as we do with futility pruning, except that I also add the value of the captured piece.
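
The rule described above could be sketched roughly as follows. This is only an illustration, not Hannibal's code: the piece values, the margin, the depth limits and the name `bad_capture_decision` are all assumptions chosen for the example.

```python
from dataclasses import dataclass

# Hypothetical centipawn values and thresholds, chosen only for illustration.
PIECE_VALUE = {"P": 100, "N": 325, "B": 325, "R": 500, "Q": 975}
FUTILITY_MARGIN = 150      # assumed margin per ply of remaining depth
MAX_FUTILITY_DEPTH = 3     # assumed: only prune near the leaves
MIN_REDUCTION_DEPTH = 3    # assumed: only reduce with enough depth left


@dataclass
class Capture:
    victim: str   # piece letter of the captured piece
    see: int      # static exchange evaluation of the capture, in centipawns


def bad_capture_decision(move, static_eval, alpha, depth, in_check=False):
    """Return ("prune", 0), ("reduce", plies) or ("search", 0) for a capture."""
    if in_check or move.see >= 0:
        return ("search", 0)
    # Futility-style test that also credits the value of the captured piece.
    optimistic = static_eval + PIECE_VALUE[move.victim] + FUTILITY_MARGIN * depth
    if depth <= MAX_FUTILITY_DEPTH and optimistic <= alpha:
        return ("prune", 0)
    # Otherwise treat the losing capture like a late move and reduce it.
    if depth >= MIN_REDUCTION_DEPTH:
        return ("reduce", 1)
    return ("search", 0)
```

Whether pruning and reduction should share the same depth limits is a tuning question that the post does not settle.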

I'm surprised to find the results very close between the version with the change and the base version. Has anyone tried this idea, and did you see similar test results?

Here are my test results, by the way:
Games Completed = 1200 of 1200 (Avg game length = 58.391 sec)
Settings = Gauntlet/32MB/20000ms+200ms/M 1000cp for 12 moves, D 150 moves/EPD:D:\chess\tests\little_blitzer_2.6\NoomenCombined.epd
Time = 23653 sec elapsed, 0 sec remaining
 1.  Hannibal 20110819        	540.5/1200	411-530-259  	(L: m=396 t=0 i=0 a=134)	(D: r=143 i=37 f=30 s=4 a=45)	(tpm=448.5 d=13.9 nps=1230646)
 2.  spark-1.0                	63.5/120	45-38-37  	(L: m=13 t=0 i=0 a=25)	(D: r=19 i=7 f=5 s=0 a=6)	(tpm=383.0 d=12.1 nps=1783891)
 3.  Protector 1.4.0 x64      	60.5/120	42-41-37  	(L: m=11 t=0 i=0 a=30)	(D: r=22 i=9 f=3 s=0 a=3)	(tpm=454.0 d=11.9 nps=901816)
 4.  Spike 1.4                	56.5/120	43-50-27  	(L: m=11 t=0 i=0 a=39)	(D: r=20 i=5 f=0 s=1 a=1)	(tpm=453.3 d=12.4 nps=965890)
 5.  Gull 1.1 x64             	57.0/120	41-47-32  	(L: m=21 t=0 i=0 a=26)	(D: r=18 i=3 f=3 s=1 a=7)	(tpm=421.4 d=12.0 nps=2125374)
 6.  Gull 1.2 x64             	67.5/120	51-36-33  	(L: m=5 t=0 i=0 a=31)	(D: r=19 i=0 f=8 s=1 a=5)	(tpm=420.4 d=13.6 nps=1811320)
 7.  Critter 0.90 64-bit      	68.5/120	59-42-19  	(L: m=4 t=0 i=0 a=38)	(D: r=12 i=4 f=1 s=0 a=2)	(tpm=448.1 d=14.8 nps=1610554)
 8.  Stockfish 2.1.1 JA 64bit 	52.5/120	40-55-25  	(L: m=2 t=0 i=0 a=53)	(D: r=15 i=4 f=3 s=0 a=3)	(tpm=457.6 d=14.7 nps=1094947)
 9.  Komodo64 2.03 JA         	77.5/120	67-32-21  	(L: m=10 t=0 i=0 a=22)	(D: r=7 i=1 f=3 s=1 a=9)	(tpm=431.7 d=12.6 nps=1131603)
10.  Critter 1.2 64-bit       	70.5/120	63-42-15  	(L: m=2 t=0 i=0 a=40)	(D: r=7 i=2 f=2 s=0 a=4)	(tpm=430.7 d=15.0 nps=1483919)
11.  Houdini 1.5a x64         	85.5/120	79-28-13  	(L: m=0 t=0 i=0 a=28)	(D: r=4 i=2 f=2 s=0 a=5)	(tpm=385.4 d=13.0 nps=1676299)

Games Completed = 1200 of 1200 (Avg game length = 58.395 sec)
Settings = Gauntlet/32MB/20000ms+200ms/M 1000cp for 12 moves, D 150 moves/EPD:D:\chess\tests\little_blitzer_2.6\NoomenCombined.epd
Time = 25189 sec elapsed, 0 sec remaining
 1.  Hannibal 20110819x       	533.5/1200	393-526-281  	(L: m=399 t=0 i=0 a=127)	(D: r=167 i=45 f=38 s=2 a=29)	(tpm=449.9 d=13.9 nps=1162299)
 2.  spark-1.0                	66.0/120	50-38-32  	(L: m=14 t=0 i=0 a=24)	(D: r=16 i=8 f=7 s=0 a=1)	(tpm=389.9 d=11.9 nps=1652586)
 3.  Protector 1.4.0 x64      	62.5/120	40-35-45  	(L: m=13 t=0 i=0 a=22)	(D: r=26 i=9 f=4 s=0 a=6)	(tpm=446.6 d=11.9 nps=826591)
 4.  Spike 1.4                	58.0/120	40-44-36  	(L: m=11 t=0 i=0 a=33)	(D: r=34 i=1 f=1 s=0 a=0)	(tpm=453.5 d=12.3 nps=929081)
 5.  Gull 1.1 x64             	64.0/120	49-41-30  	(L: m=20 t=0 i=0 a=21)	(D: r=22 i=3 f=2 s=0 a=3)	(tpm=428.8 d=11.9 nps=2033459)
 6.  Gull 1.2 x64             	73.5/120	58-31-31  	(L: m=10 t=0 i=0 a=21)	(D: r=16 i=4 f=6 s=0 a=5)	(tpm=430.3 d=13.5 nps=1687940)
 7.  Critter 0.90 64-bit      	65.5/120	50-39-31  	(L: m=2 t=0 i=0 a=37)	(D: r=21 i=6 f=4 s=0 a=0)	(tpm=434.6 d=15.1 nps=1509492)
 8.  Stockfish 2.1.1 JA 64bit 	55.0/120	44-54-22  	(L: m=1 t=0 i=0 a=53)	(D: r=12 i=5 f=4 s=1 a=0)	(tpm=452.5 d=14.2 nps=1043395)
 9.  Komodo64 2.03 JA         	77.0/120	65-31-24  	(L: m=7 t=0 i=0 a=24)	(D: r=11 i=5 f=5 s=0 a=3)	(tpm=446.1 d=12.9 nps=1040259)
10.  Critter 1.2 64-bit       	68.0/120	58-42-20  	(L: m=2 t=0 i=0 a=40)	(D: r=6 i=4 f=3 s=1 a=6)	(tpm=427.1 d=15.0 nps=1410248)
11.  Houdini 1.5a x64         	77.0/120	72-38-10  	(L: m=1 t=0 i=0 a=37)	(D: r=3 i=0 f=2 s=0 a=5)	(tpm=385.2 d=12.8 nps=1569842)

20110819 is the base version; 20110819x is the version with the bad-capture pruning/reduction idea.

One can notice that the two versions perform differently against the opponents. The 19x version performed better against stronger engines like Komodo, Critter 1.2 and Houdini, while performing worse against the weaker opponents. This suggests that maybe those strong engines are also using this bad-capture pruning/reduction idea, though I might be wrong.

The difference in Elo is only 4, with error bars of ±17. So which do you think is better: performing better against stronger engines or performing better against weaker engines?
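
For reference, the Elo figures can be approximated from the win-loss-draw totals in the two gauntlets. The sketch below assumes the usual logistic Elo formula and a normal approximation for the 95% margin; the function names are made up for this example and it is not necessarily how the testing tool computes its numbers.

```python
import math


def elo_from_score(p):
    """Logistic Elo difference implied by a score fraction p (0 < p < 1)."""
    return -400.0 * math.log10(1.0 / p - 1.0)


def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, margin) from a win/loss/draw record; z=1.96 assumes ~95%."""
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1 - p) ** 2 + losses * p ** 2 + draws * (0.5 - p) ** 2) / n
    se = math.sqrt(var / n)
    # Map the score interval onto the Elo scale and take half its width.
    margin = (elo_from_score(p + z * se) - elo_from_score(p - z * se)) / 2
    return elo_from_score(p), margin


base = elo_with_margin(411, 530, 259)   # Hannibal 20110819
test = elo_with_margin(393, 526, 281)   # Hannibal 20110819x
print(f"base {base[0]:.1f} +/- {base[1]:.1f}, test {test[0]:.1f} +/- {test[1]:.1f}")
print(f"difference {base[0] - test[0]:.1f} Elo")
```

Under these assumptions the gap comes out at about 4 Elo with per-version margins of about 17, in line with the figures quoted. The margin on the difference between two independent runs is larger than either per-version margin, so a 4 Elo gap is well inside the noise.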

Sometimes I keep developing versions that perform a little worse than the best version when my intuition says that the idea has potential and just needs fine-tuning.

Do you account for pinned pieces in your SEE function (especially those pinned against the king)? My intuition says that if you are using the SEE function to prune/reduce bad captures, then you would want it to be as accurate as possible.

I tried reducing/pruning bad captures based on the SEE and didn't get good results, although my SEE function doesn't account for pinned pieces. I should try that in the next few days.

Our SEE doesn't take into account pieces pinned to the king, so it is possible that we could improve the results by supporting this. Please let us know the results of your test.

Edsel Apostol · Post by **Edsel Apostol** » Sat Aug 20, 2011 2:01 am

zamar wrote:One could really expect that pruning/reducing bad captures is a great idea, but surprisingly we were never able to make this work for Stockfish.

Results seem to vary from engine to engine. It may depend on a lot of factors. If one already implements other prunings/reductions that overlap with the idea being tested, the redundancy seems to give worse results. An idea may be good when analyzed as a standalone feature, but it may not work in combination with other ideas. In computer chess programming, it is the collection of ideas that work well with each other that is important.

Ferdy · Post by **Ferdy** » Sat Aug 20, 2011 2:24 am

Evert wrote:
Ferdy wrote: Since introducing the bad capture reduction, I have also introduced a capture killer scheme: the same as non-capture killers, except that the move is a capture and its see() is bad. I hope sound capture sacrifices will be saved this way as capture killers. I also revised the move ordering for captures a bit so that the capture LMR is better defined.
I experimented with that in Jazz as well (where it's called a "good capture") and it seemed worse overall in initial testing, but I didn't try tweaking it. It's quite possible that a "bad SEE capture" is normally bad except in the one instance where it isn't, in which case it's pointless to not reduce it.
Of course there are various ways in which this can be tweaked. For instance, where in the list do you order the "good capture"? Behind the killer moves? Ahead of other bad captures but behind quiet moves? Ahead of good captures?

That capture killer should definitely be above a capture with minus SEE. What works for me may not work for you, and as you add new ideas, there comes a time when what worked in the past no longer works. The interesting part is testing. BTW, I disallow illegal pawn captures in SEE, that is, captures by pawns that are pinned to the king. It's a step forward in SEE accuracy.
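
One possible ordering that satisfies the constraint above (capture killer ahead of other minus-SEE captures) is sketched below. The bucket names and the placement of the capture killer relative to quiet moves are assumptions made for illustration; the thread leaves that placement open.

```python
# Hypothetical ordering buckets; a smaller number is searched earlier.
GOOD_CAPTURE, KILLER, CAPTURE_KILLER, QUIET, BAD_CAPTURE = range(5)


def order_bucket(is_capture, see_value, is_killer=False, is_capture_killer=False):
    """Map a move to an ordering bucket; lower buckets are tried first."""
    if is_capture:
        if see_value >= 0:
            return GOOD_CAPTURE
        # A minus-SEE capture that earlier caused a cutoff at this ply is
        # promoted ahead of the remaining minus-SEE captures.
        return CAPTURE_KILLER if is_capture_killer else BAD_CAPTURE
    return KILLER if is_killer else QUIET
```

With buckets like these, a late-move reduction keyed on the bucket would naturally skip the capture killer while still reducing the other bad captures.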

Mincho Georgiev · Post by **Mincho Georgiev** » Sun Aug 21, 2011 8:53 am

Since Ben-Hur made an absolutely valid point, I would like to address a couple of issues that I have to resolve before this test makes any sense.
First, without pin tests the results of see() are simply wrong, which is a huge issue.
I've resolved that already. Another thing I did last night was handle discovered checks, since ignoring them again leads to wrong see() results.
Regarding my particular use of see() (I use it only to evaluate the "presumably" bad captures), I have just one more issue to resolve.
That is promotions (including underpromotions) that occur on the evaluated square during the capture sequence. Since this is rare, I don't know if it's critical, but still, it produces WRONG results if not addressed. I think I will resolve that too before testing.
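
To see how much a single pinned defender can change the result, here is a minimal swap-list SEE over plain piece values. It is a simplified sketch, not Pawny's implementation: it assumes the caller supplies the pieces that can capture on the square (with pinned pieces already filtered out where they cannot legally capture), and it ignores x-ray attackers, promotions and discovered checks.

```python
def see(victim_value, attackers, defenders):
    """Static exchange evaluation on one square using the swap-list method.

    attackers: non-empty list of values of the capturing side's pieces;
               attackers[0] makes the first capture, the rest go cheapest-first.
    defenders: values of the other side's pieces that can recapture.
    """
    queues = [sorted(attackers[1:]), sorted(defenders)]
    gain = [victim_value]        # gain[d]: speculative balance after capture d
    on_square = attackers[0]     # piece that would be captured next
    side = 1                     # 1 = defenders to move, 0 = attackers
    while queues[side]:
        gain.append(on_square - gain[-1])
        on_square = queues[side].pop(0)
        side ^= 1
    # Walk back: each side may decline to continue an unfavourable exchange.
    for d in range(len(gain) - 1, 0, -1):
        gain[d - 1] = -max(-gain[d - 1], gain[d])
    return gain[0]


# Knight takes a pawn that is "defended" by a pawn.
print(see(100, [300], [100]))   # defender counted: losing capture
print(see(100, [300], []))      # defender pinned and removed: winning capture
```

The same capture flips from SEE < 0 to SEE > 0 once the pinned defender is dropped, which is exactly the kind of move a bad-capture reduction would otherwise misclassify.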

Mincho Georgiev · Post by **Mincho Georgiev** » Mon Aug 22, 2011 11:45 pm

Test 1 - completed (TC = Blitz 1m.)
-----------------------
Pawny -> the base version.
Pawny+ -> same as above with reductions for see() < 0 applied to LMR.
Pawny- -> reference version: the base version with a see() that does not handle pins and discovered checks.

Main Test:
----------
Pawny - Fruit 1.5; 1000 Games, +263 =260 -477, 39.3%, TP=-76 Elo
Pawny+ - Fruit 1.5; 1000 Games, +304 =236 -460, 42.2%, TP=-55 Elo

Elostat difference = 11 Elo in favor of Pawny+.

Reference test:
---------------
Pawny- - Fruit 1.5; 1000 Games, +275 =244 -481, 39.7%, TP=-73 Elo

Approximately the same result as the base version.

More to follow...

Don · Post by **Don** » Tue Aug 23, 2011 5:02 am

zamar wrote:One could really expect that pruning/reducing bad captures is a great idea, but surprisingly we were never able to make this work for Stockfish.

Same here. I think captures are just too volatile to take any chances with.

bob · Post by **bob** » Tue Aug 23, 2011 5:58 am

Don wrote:
zamar wrote:One could really expect that pruning/reducing bad captures is a great idea, but surprisingly we were never able to make this work for Stockfish.
Same here. I think captures are just too volatile to take any chances with.

Here's a simple question to answer...

"Why is QxPa6 bad if the a6 P is defended by a P at b7, but Qa6 is reducible if there is a pawn at b7 but there was NO pawn on a6?"

Both are losing moves, both lose almost the same material. I see nothing that says one is different from the other, unless there is some idea of exposing the king at b8 or something. But that's independent of the SEE value...

I reduce losing captures. It isn't much of a gain, but it is a gain for me...

Don · Post by **Don** » Tue Aug 23, 2011 1:25 pm

bob wrote:
Don wrote:
zamar wrote:One could really expect that pruning/reducing bad captures is a great idea, but surprisingly we were never able to make this work for Stockfish.
Same here. I think captures are just too volatile to take any chances with.
Here's a simple question to answer...

"Why is QxPa6 bad if the a6 P is defended by a P at b7, but Qa6 is reducible if there is a pawn at b7 but there was NO pawn on a6?"

Both are losing moves, both lose almost the same material. I see nothing that says one is different from the other, unless there is some idea of exposing the king at b8 or something. But that's independent of the SEE value...

I reduce losing captures. It isn't much of a gain, but it is a gain for me...

Apparently it doesn't work for Stockfish and I can vouch for the fact that it does not work for Komodo either - believe me we have thoroughly checked this out. But I have a theory:

The static see() routine is not reliable in a tactical sense - it's a good guess at best but it's pretty weak and is not globally aware of anything. So a certain percentage of bad see() moves are not blunders. But that is not the end of the story. Modern computer chess is all about taking risks and playing the odds. So why not reduce moves that seem to have a high probability of being bad?

Well, consider this. You can measure moves in terms of how much compulsion they place on the opponent. For example, a check is high compulsion: it MUST be answered. A checkmate threat is high compulsion: it doesn't have to be answered, but you lose if you don't. In a sense, even a bad capture tends to be a high compulsion move because it's a threat to win a piece - in other words, the opponent must respond to it or else you make off with the booty.

In computer chess you are usually looking to EXTEND high compulsion moves, not reduce them. I'm not advocating that we extend (seemingly) losing captures but reducing them is going in the opposite direction.

If you imagine the cases where the "losing capture" is really a clever move, my guess is that the best response in most cases is not the obvious capture reply - so there is a good chance it TOO will get reduced so you become double blind.

The condensed version of all of this is that even a losing capture is a threat and probably should not be reduced.

At least that sounds good on paper ....

Steve Maughan · Post by **Steve Maughan** » Tue Aug 23, 2011 2:32 pm

Hi Don,

Don wrote:....The static see() routine is not reliable in a tactical sense...

Of course this is true, and probably the crux of the problem.

So here's an idea - you may have already tested it but here goes. When you come across a SEE losing capture make the move and call the quiescent search. If, and only if, the score returned from the qsearch is "low" as expected, then reduce the normal search.

A couple of variations to try would be:

1. Only reduce if the qsearch result is sufficiently below alpha e.g. alpha - PAWN

2. Only do this if the remaining depth is significant (i.e. there is a reasonable likelihood of not dropping into the qsearch anyway)

This may work better for qsearches which include checks in the first ply (or two).

Cheers,

Steve
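
The proposal and its two variations could be combined along these lines. This is a sketch under assumptions: `verified_reduction`, `min_depth`, `margin` and `reduction` are hypothetical names and values, and the quiescence search is passed in as a callable so that it only runs when the cheaper tests have passed.

```python
PAWN = 100  # hypothetical centipawn value used as the verification margin


def verified_reduction(see_value, depth, alpha, qsearch_score,
                       min_depth=4, margin=PAWN, reduction=1):
    """Return how many plies to reduce a capture by (0 means no reduction).

    qsearch_score is a callable returning the quiescence score after the
    capture, from the mover's point of view; it is only invoked when needed.
    """
    if see_value >= 0:
        return 0            # not a losing capture according to SEE
    if depth < min_depth:
        return 0            # variation 2: too close to the qsearch anyway
    if qsearch_score() <= alpha - margin:
        return reduction    # variation 1: qsearch confirms the move is bad
    return 0                # qsearch disagrees with SEE: search at full depth
```

The extra qsearch call costs nodes on every losing capture at sufficient depth, so whether this pays off would have to be measured in games.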

Don · Post by **Don** » Tue Aug 23, 2011 2:45 pm

Steve Maughan wrote:Hi Don,

Don wrote:....The static see() routine is not reliable in a tactical sense...
Of course this is true, and probably the crux of the problem.

So here's an idea - you may have already tested it but here goes. When you come across a SEE losing capture make the move and call the quiescent search. If, and only if, the score returned from the qsearch is "low" as expected, then reduce the normal search.

A couple of variations to try would be:

1. Only reduce if the qsearch result is sufficiently below alpha e.g. alpha - PAWN

2. Only do this if the remaining depth is significant (i.e. there is a reasonable likelihood of not dropping into the qsearch anyway)

This may work better for qsearches which include checks in the first ply (or two).

Cheers,

Steve

That's a reasonable idea to try. However, the reduced search itself is already stronger than a quiescence search, and it doesn't salvage the idea, so I would suggest a combination of your ideas:

If losing capture:
reduce search using normal schedule, set scout to alpha-MARGIN

It's possible that we have already tried this. I have probably only tried about a million things in the last 3 years and don't remember all of them.

Don
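
One reading of the two-line suggestion above is that the reduced null-window (scout) search for a losing capture is centred on alpha - MARGIN rather than alpha, and the move is only re-searched at full depth if it beats that lowered bound. The sketch below follows that reading; the function names and the re-search rule are assumptions, not Komodo's code.

```python
def losing_capture_scout(alpha, depth, reduction, margin):
    """Return (reduced_depth, lo, hi) for the reduced null-window search.

    The window (lo, hi) sits at alpha - margin instead of alpha, so the
    reduced search only has to show the move is not far below alpha.
    """
    bound = alpha - margin
    return max(depth - 1 - reduction, 0), bound, bound + 1


def needs_full_search(score, alpha, margin):
    """True if the reduced scout beat the lowered bound and must be verified."""
    return score > alpha - margin
```

In a negamax framework the child would be searched with the negated window `(-hi, -lo)` at `reduced_depth`, and `needs_full_search` applied to the negated result.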
