Reducing/Pruning Bad Captures (SEE < 0)

bob · Post by **bob** » Wed Aug 24, 2011 4:56 pm

Edsel Apostol wrote:
bhlangonijr wrote:
Edsel Apostol wrote:I've tried reducing/pruning bad captures in the latest Hannibal. I reduce a bad capture just as we do with LMR, and prune it just as we do with futility pruning, except that I also add the value of the captured piece.
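To make the idea concrete, here is a minimal sketch of the decision described above. It is not Hannibal's code: the piece values, the margin, the depth limits and the names `Capture` and `bad_capture_decision` are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

# Illustrative centipawn values (assumed, not taken from any engine).
PIECE_VALUE = {"P": 100, "N": 325, "B": 325, "R": 500, "Q": 975}

@dataclass
class Capture:
    captured: str  # piece letter of the victim
    see: int       # static exchange evaluation of the capture, in centipawns

def bad_capture_decision(move, depth, static_eval, alpha,
                         futility_margin=150, reduction=1,
                         max_prune_depth=2, min_reduce_depth=3):
    """Return ("prune", 0), ("reduce", plies) or ("full", 0) for a capture."""
    if move.see >= 0:
        return ("full", 0)  # only SEE < 0 captures are treated specially
    if depth <= max_prune_depth:
        # Futility-style test, with the victim's value added to the estimate.
        optimistic = (static_eval + PIECE_VALUE[move.captured]
                      + futility_margin * depth)
        if optimistic <= alpha:
            return ("prune", 0)
    if depth >= min_reduce_depth:
        return ("reduce", reduction)  # LMR-style reduction
    return ("full", 0)

print(bad_capture_decision(Capture("P", -225), depth=1, static_eval=-500, alpha=0))
print(bad_capture_decision(Capture("P", -225), depth=5, static_eval=0, alpha=0))
```

A real engine would also exclude checks, PV nodes and so on; those conditions are left out here.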

I'm surprised to find the results very close between the one with the change and the base version. Has anyone tried this idea and have you noticed the same test results?

Here are my test results by the way:
Games Completed = 1200 of 1200 (Avg game length = 58.391 sec)
Settings = Gauntlet/32MB/20000ms+200ms/M 1000cp for 12 moves, D 150 moves/EPD:D:\chess\tests\little_blitzer_2.6\NoomenCombined.epd
Time = 23653 sec elapsed, 0 sec remaining
 1.  Hannibal 20110819        	540.5/1200	411-530-259  	(L: m=396 t=0 i=0 a=134)	(D: r=143 i=37 f=30 s=4 a=45)	(tpm=448.5 d=13.9 nps=1230646)
 2.  spark-1.0                	63.5/120	45-38-37  	(L: m=13 t=0 i=0 a=25)	(D: r=19 i=7 f=5 s=0 a=6)	(tpm=383.0 d=12.1 nps=1783891)
 3.  Protector 1.4.0 x64      	60.5/120	42-41-37  	(L: m=11 t=0 i=0 a=30)	(D: r=22 i=9 f=3 s=0 a=3)	(tpm=454.0 d=11.9 nps=901816)
 4.  Spike 1.4                	56.5/120	43-50-27  	(L: m=11 t=0 i=0 a=39)	(D: r=20 i=5 f=0 s=1 a=1)	(tpm=453.3 d=12.4 nps=965890)
 5.  Gull 1.1 x64             	57.0/120	41-47-32  	(L: m=21 t=0 i=0 a=26)	(D: r=18 i=3 f=3 s=1 a=7)	(tpm=421.4 d=12.0 nps=2125374)
 6.  Gull 1.2 x64             	67.5/120	51-36-33  	(L: m=5 t=0 i=0 a=31)	(D: r=19 i=0 f=8 s=1 a=5)	(tpm=420.4 d=13.6 nps=1811320)
 7.  Critter 0.90 64-bit      	68.5/120	59-42-19  	(L: m=4 t=0 i=0 a=38)	(D: r=12 i=4 f=1 s=0 a=2)	(tpm=448.1 d=14.8 nps=1610554)
 8.  Stockfish 2.1.1 JA 64bit 	52.5/120	40-55-25  	(L: m=2 t=0 i=0 a=53)	(D: r=15 i=4 f=3 s=0 a=3)	(tpm=457.6 d=14.7 nps=1094947)
 9.  Komodo64 2.03 JA         	77.5/120	67-32-21  	(L: m=10 t=0 i=0 a=22)	(D: r=7 i=1 f=3 s=1 a=9)	(tpm=431.7 d=12.6 nps=1131603)
10.  Critter 1.2 64-bit       	70.5/120	63-42-15  	(L: m=2 t=0 i=0 a=40)	(D: r=7 i=2 f=2 s=0 a=4)	(tpm=430.7 d=15.0 nps=1483919)
11.  Houdini 1.5a x64         	85.5/120	79-28-13  	(L: m=0 t=0 i=0 a=28)	(D: r=4 i=2 f=2 s=0 a=5)	(tpm=385.4 d=13.0 nps=1676299)

Games Completed = 1200 of 1200 (Avg game length = 58.395 sec)
Settings = Gauntlet/32MB/20000ms+200ms/M 1000cp for 12 moves, D 150 moves/EPD:D:\chess\tests\little_blitzer_2.6\NoomenCombined.epd
Time = 25189 sec elapsed, 0 sec remaining
 1.  Hannibal 20110819x       	533.5/1200	393-526-281  	(L: m=399 t=0 i=0 a=127)	(D: r=167 i=45 f=38 s=2 a=29)	(tpm=449.9 d=13.9 nps=1162299)
 2.  spark-1.0                	66.0/120	50-38-32  	(L: m=14 t=0 i=0 a=24)	(D: r=16 i=8 f=7 s=0 a=1)	(tpm=389.9 d=11.9 nps=1652586)
 3.  Protector 1.4.0 x64      	62.5/120	40-35-45  	(L: m=13 t=0 i=0 a=22)	(D: r=26 i=9 f=4 s=0 a=6)	(tpm=446.6 d=11.9 nps=826591)
 4.  Spike 1.4                	58.0/120	40-44-36  	(L: m=11 t=0 i=0 a=33)	(D: r=34 i=1 f=1 s=0 a=0)	(tpm=453.5 d=12.3 nps=929081)
 5.  Gull 1.1 x64             	64.0/120	49-41-30  	(L: m=20 t=0 i=0 a=21)	(D: r=22 i=3 f=2 s=0 a=3)	(tpm=428.8 d=11.9 nps=2033459)
 6.  Gull 1.2 x64             	73.5/120	58-31-31  	(L: m=10 t=0 i=0 a=21)	(D: r=16 i=4 f=6 s=0 a=5)	(tpm=430.3 d=13.5 nps=1687940)
 7.  Critter 0.90 64-bit      	65.5/120	50-39-31  	(L: m=2 t=0 i=0 a=37)	(D: r=21 i=6 f=4 s=0 a=0)	(tpm=434.6 d=15.1 nps=1509492)
 8.  Stockfish 2.1.1 JA 64bit 	55.0/120	44-54-22  	(L: m=1 t=0 i=0 a=53)	(D: r=12 i=5 f=4 s=1 a=0)	(tpm=452.5 d=14.2 nps=1043395)
 9.  Komodo64 2.03 JA         	77.0/120	65-31-24  	(L: m=7 t=0 i=0 a=24)	(D: r=11 i=5 f=5 s=0 a=3)	(tpm=446.1 d=12.9 nps=1040259)
10.  Critter 1.2 64-bit       	68.0/120	58-42-20  	(L: m=2 t=0 i=0 a=40)	(D: r=6 i=4 f=3 s=1 a=6)	(tpm=427.1 d=15.0 nps=1410248)
11.  Houdini 1.5a x64         	77.0/120	72-38-10  	(L: m=1 t=0 i=0 a=37)	(D: r=3 i=0 f=2 s=0 a=5)	(tpm=385.2 d=12.8 nps=1569842)
20110819 is the base version. 20110819x is the version with the bad captures pruning/reduction idea.

One can notice that the 2 versions perform differently against the opponents. The 19x version performed better against stronger engines like Komodo, Critter 1.2 and Houdini while performing worse against the weaker opponents. This suggests that maybe those strong engines are also using this bad-capture pruning/reduction idea, though I might be wrong.

The difference in Elo is only 4 with the error bars at +-17. So which do you think is better, performing better against stronger engines or performing better against weaker engines?
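For readers who want to check the numbers, the sketch below recomputes the Elo of each run from the win-loss-draw totals in the gauntlet summaries, using the logistic Elo formula and a normal approximation for the 95% margin. The helper names are made up for this example, and this is not necessarily how the testing tool computed its error bars.

```python
import math

def elo_from_score(p):
    """Elo difference implied by an expected score p (0 < p < 1)."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_with_margin(wins, losses, draws, z=1.96):
    """Return (elo, approximate 95% margin) from a W-L-D record."""
    n = wins + losses + draws
    p = (wins + 0.5 * draws) / n
    # Per-game variance of the score around its mean.
    var = (wins * (1 - p) ** 2 + draws * (0.5 - p) ** 2 + losses * p ** 2) / n
    se = math.sqrt(var / n)
    margin = (elo_from_score(p + z * se) - elo_from_score(p - z * se)) / 2
    return elo_from_score(p), margin

base = elo_with_margin(411, 530, 259)   # Hannibal 20110819
test = elo_with_margin(393, 526, 281)   # Hannibal 20110819x
print("base: %.1f +/- %.1f" % base)
print("test: %.1f +/- %.1f" % test)
print("difference: %.1f" % (base[0] - test[0]))
```

Under these assumptions the gap comes out at roughly 4 Elo with a margin of about 17 on each run, so the two results overlap almost completely.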

Sometimes I consider continuing to develop versions that perform a little bit worse than the best version when my intuition says that the idea has potential and just needs fine tuning.
Do you account for pinned pieces in your SEE function (especially the ones pinned against the king)? My intuition says that if you are using the SEE function to prune/reduce bad captures then you would want it to be as accurate as possible.

I tried reducing/pruning bad captures based on the SEE and didn't get good results, although my SEE function doesn't account for pinned pieces. I should try that in the next few days.
Our SEE doesn't take into account pinned pieces to the king, so it is possible that we could improve the results by supporting this. Please let us know the results of your test.

It dates back quite a while, but Joel Rivat (Chess Guru, no longer active it seems) and I had a similar discussion about SEE. He was using it for a few things I was not, and he thought that making it more accurate would be an improvement.

I spent some time adding absolute pins, and saw Crafty play a little worse, mainly because SEE is used quite a lot, and slowing it down has an effect. I think there are enough errors in the SEE concept (namely that the best move is often not a capture) that its error rate will always be higher than we'd like. But if you use it right, and it is correct more often than not, it should result in a gain. If you can squeeze more accuracy without losing speed, it should always be a gain of some sort. But if there is a speed trade-off, then it becomes very unclear and a lot of testing is needed to get down to the error bar range necessary to measure the improvement.
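As a reference point for what a basic SEE computes, here is a minimal sketch of the usual swap-list approach. It is not Crafty's implementation: it works on plain lists of piece values instead of a board, and it assumes the caller has already listed each side's capturing pieces in the order they would be used (least valuable first). Pins, x-ray attackers and promotions are ignored, which is exactly the kind of inaccuracy discussed above.

```python
def see(victim_value, attackers, defenders):
    """Static exchange evaluation on one square, in centipawns.

    attackers: values of the moving side's pieces that can capture there,
               in the order they will be used.
    defenders: the same for the opponent.
    """
    if not attackers:
        return 0
    sides = [list(attackers), list(defenders)]
    gain = [victim_value]          # gain[0]: material won by the first capture
    on_square = sides[0].pop(0)    # piece now standing on the target square
    side = 1                       # opponent to recapture
    while sides[side]:
        # Speculative balance if this side recaptures the piece on the square.
        gain.append(on_square - gain[-1])
        on_square = sides[side].pop(0)
        side ^= 1
    # Walk back: at each step the side to move may stop capturing.
    for i in range(len(gain) - 1, 0, -1):
        gain[i - 1] = min(gain[i - 1], -gain[i])
    return gain[0]

print(see(325, [100], [100]))       # pawn takes a knight defended by a pawn
print(see(100, [975], [100]))       # queen takes a pawn defended by a pawn
print(see(100, [500, 500], [325]))  # rook takes a defended pawn, second rook behind
```

Making this pin-aware means checking, for every piece in the lists, whether it is actually free to capture, which is where the extra cost comes from.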

zamar · Post by **zamar** » Wed Aug 24, 2011 6:54 pm

Don wrote:
zamar wrote:
Don wrote: If losing capture:
reduce search using normal schedule, set scout to alpha-MARGIN

It's possible that we have already tried this. I have probably only tried about a million things in the last 3 years and don't remember all of them

Don
SF got something like +5 Elo using this idea. The problem is that although the engine is a bit stronger, it's much weaker in tactical puzzles (which often involve a bad capture or even two of them!).
But Joona claimed earlier that it didn't work! I would kill for 5 ELO! Are you saying that you would give up 5 ELO for better tactics but a weaker program?

Earlier I was referring strictly to reducing/pruning bad captures, and that didn't work. But reducing aggressively at low depths AND tweaking alpha gave us something (take a look at SF 2.1), but the downside is (as I said) that the engine became much weaker in tactics. As H.G. wisely put it, if a capture is futile _enough_ it can be reduced aggressively.

But I still view this as a separate thing from LMR, because we apply this technique only close to the leaves, reduce much more aggressively and tweak alpha by around 100cp.
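One possible reading of this scheme, sketched below: for a losing capture close to the leaves, search with a large reduction and a null window placed a margin below alpha, and only go back to a normal search if that scout search does not fail low. This is not Stockfish 2.1's code; the function names, the depth limit and the reduction amount are assumptions, and only the roughly 100cp margin comes from the post.

```python
def losing_capture_scout(depth, alpha, see_value,
                         margin=100, max_depth=3, reduction=2):
    """Return (search_depth, scout_alpha, scout_beta) for a null-window search."""
    if see_value >= 0 or depth > max_depth:
        # Not a losing capture, or too far from the leaves: normal scout search.
        return depth - 1, alpha, alpha + 1
    lowered = alpha - margin
    # Aggressive reduction, with the window shifted down by the margin.
    return max(depth - 1 - reduction, 0), lowered, lowered + 1

def needs_full_search(score, scout_alpha):
    """The capture was not futile enough if the reduced search beat the lowered bound."""
    return score > scout_alpha

print(losing_capture_scout(depth=2, alpha=50, see_value=-200))
print(losing_capture_scout(depth=6, alpha=50, see_value=-200))
```

The lowered window is what separates this from plain LMR: the capture has to fail low by a full margin before it is dismissed.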

zamar · Post by **zamar** » Wed Aug 24, 2011 7:09 pm

bob wrote: However, in the case I gave, the last move could have been Rh6 or Rxh6. You can't tell the difference unless you see the position prior to that move. Given that, what would justify saying "reduce the non-capture, but not the capture," since the resulting position is identical in all other aspects?

In classical poker you can get four aces after changing all the cards, or after changing only two cards (assuming you already have three aces). The latter is just much more likely to happen.

In chess you can reach a critical position after silent non-capture or bad capture. The latter is just much more likely to happen.

bob · Post by **bob** » Wed Aug 24, 2011 7:30 pm

zamar wrote:
bob wrote: However, in the case I gave, the last move could have been Rh6 or Rxh6. You can't tell the difference unless you see the position prior to that move. Given that, what would justify saying "reduce the non-capture, but not the capture," since the resulting position is identical in all other aspects?
In classical poker you can get four aces after changing all the cards, or after changing only two cards (assuming you already have three aces). The latter is just much more likely to happen.

In chess you can reach a critical position after silent non-capture or bad capture. The latter is just much more likely to happen.

What is this based on? There are actually fewer capture moves than non-captures... And chess programs hang pieces all over the board, as I see every time I dump a trace looking for a bug...

bob · Post by **bob** » Wed Aug 24, 2011 7:38 pm

I'm actually re-testing that idea right now. I had tried it once and did not keep it, but I kept the version that tried it, so rather than tracking thru all the PGN data, I am re-running. Captures will get reduced by 1 now, rather than 2, when SEE < 0....

More later...

Mincho Georgiev · Post by **Mincho Georgiev** » Wed Aug 24, 2011 8:15 pm

I am using approximately the same ordering scheme as yours. The SEE < 0 moves are at the end of the list. "Quiet" moves are above them. I am reducing only by 1 ply for any move. That gives me a lot of room for testing and experiments. And yes, unfortunately, a more accurate SEE slows down the entire program, since it's used a lot. Maybe that eats up the benefits of its correctness (if any).
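A minimal sketch of that ordering, with moves reduced to (name, is_capture, see_value) tuples. Putting the non-losing captures first is an assumption on my part (it is the common arrangement); the post only states that quiet moves come before the SEE < 0 captures. Hash moves, killers and history scores are left out.

```python
def order_moves(moves):
    """Sort moves: captures with SEE >= 0, then quiet moves, then SEE < 0 captures."""
    def key(move):
        name, is_capture, see_value = move
        if is_capture and see_value >= 0:
            return (0, -see_value)   # best-looking captures first
        if not is_capture:
            return (1, 0)            # quiet moves keep their incoming order
        return (2, -see_value)       # losing captures last, least bad first
    return sorted(moves, key=key)    # sorted() is stable

moves = [
    ("Qxb7", True, -875),
    ("Nf3", False, 0),
    ("Bxc6", True, 0),
    ("exd5", True, 100),
    ("a3", False, 0),
]
print([name for name, _, _ in order_moves(moves)])
```

With the losing captures at the tail of the list, a move-count-based reduction reaches them automatically.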

bob · Post by **bob** » Wed Aug 24, 2011 9:26 pm

Mincho Georgiev wrote:I am using approximately the same ordering scheme as yours. The SEE < 0 moves are at the end of the list. "Quiet" moves are above them. I am reducing only by 1 ply for any move. That gives me a lot of room for testing and experiments. And yes, unfortunately, a more accurate SEE slows down the entire program, since it's used a lot. Maybe that eats up the benefits of its correctness (if any).

That was my observation when Joel and I discussed this years ago. It is easy to make it more accurate. But also slower. This is an age-old balancing act. Only have 1400 games so far, with this change at +13 Elo. Something tells me that is not going to hold as it is too simple a change. Had to restart once, as my first fix was not exactly right and it was reducing where it should not near the tips, which was killing things to the tune of -40 or so...

bob · Post by **bob** » Wed Aug 24, 2011 9:37 pm

bob wrote:
Mincho Georgiev wrote:I am using approximately the same ordering scheme as yours. The SEE < 0 moves are at the end of the list. "Quiet" moves are above them. I am reducing only by 1 ply for any move. That gives me a lot of room for testing and experiments. And yes, unfortunately, a more accurate SEE slows down the entire program, since it's used a lot. Maybe that eats up the benefits of its correctness (if any).
That was my observation when Joel and I discussed this years ago. It is easy to make it more accurate. But also slower. This is an age-old balancing act. Only have 1400 games so far, with this change at +13 Elo. Something tells me that is not going to hold as it is too simple a change. Had to restart once, as my first fix was not exactly right and it was reducing where it should not near the tips, which was killing things to the tune of -40 or so...

Now down to "dead even" with last version, but after only 6000 games. That's usually the way it goes...
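To put those game counts in perspective, here is a rough estimate of the 95% margin on a single run as a function of the number of games. It assumes a roughly even score, a 30% draw rate (a made-up figure) and independent games, and it uses the plain logistic Elo formula rather than BayesElo.

```python
import math

def elo_margin_95(games, draw_rate=0.30):
    """Approximate 95% Elo margin for a roughly even match of `games` games."""
    # Per-game score variance at a 50% score: wins and losses each deviate by 0.5.
    var_per_game = (1.0 - draw_rate) / 4.0
    se_score = math.sqrt(var_per_game / games)
    # Slope of the Elo curve at a 50% score, in Elo per unit of score.
    elo_per_score = 400.0 / (math.log(10) * 0.25)
    return 1.96 * se_score * elo_per_score

for n in (1400, 6000):
    print(n, round(elo_margin_95(n), 1))
```

Under these assumptions the margin is about 15 Elo at 1400 games and about 7 at 6000, so a +13 reading after 1400 games is still within the noise.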

JVMerlino · Post by **JVMerlino** » Wed Aug 24, 2011 9:40 pm

bob wrote:Now down to "dead even" with last version, but after only 6000 games. That's usually the way it goes...

Good to know that it isn't only my program that regularly shows something like 50+ ELO through the first 20% of the testing, only to end up with no improvement at all.

jm

bob · Post by **bob** » Wed Aug 24, 2011 9:54 pm

JVMerlino wrote:
bob wrote:Now down to "dead even" with last version, but after only 6000 games. That's usually the way it goes...
Good to know that it isn't only my program that regularly shows something like 50+ ELO through the first 20% of the testing, only to end up with no improvement at all.

jm

I don't know whether it is an artifact introduced by BayesElo, but I generally see an Elo that is too high, and then it drops back down to the previous level (or occasionally lower). I don't know why I don't see just as many where the elo starts off too low and climbs.

I have even randomized my starting positions a couple of times, just to make sure that favorable positions are not tested first to inflate the Elo, or that bad ones are done last...

No explanation, just an observation.
