Natural TB

syzygy · Post by **syzygy** » Sun Jun 12, 2016 5:08 pm

jhellis3 wrote:Here is the fix applied to vanilla SF if anyone is interested:

https://github.com/jhellis3/Stockfish/tree/tb_fix_1

[D]8/2P1P3/3k4/8/8/4K3/P2p1p2/8 b - -

512MB Hash, Single Thread:

Vanilla (with TBs) @ ~200 seconds: Depth 51, score +123.xx, short line displayed

Patch @ ~200 seconds: Depth 49, score Mate in 14, full line displayed (starts returning mate scores at d31)

Vanilla (sans TBs) @ ~80 seconds: Depth 40, score Mate in 14, full line displayed (starts returning mate scores at d35)

I just implemented my proposal:

Code: Select all

info depth 34 seldepth 36 multipv 1 score cp 12351 nodes 7894099 nps 2877907 hashfull 57 tbhits 358787 time 2743 pv d2d1q c7c8n d6c5 e3f2
info depth 35 seldepth 45 multipv 1 score mate 58 lowerbound nodes 24467190 nps 3480892 hashfull 157 tbhits 1047957 time 7029 pv d2d1q
info depth 35 seldepth 47 multipv 1 score mate 49 lowerbound nodes 28435778 nps 3592644 hashfull 164 tbhits 1206702 time 7915 pv d2d1q
info depth 35 seldepth 50 multipv 1 score mate 35 lowerbound nodes 35874043 nps 3580958 hashfull 224 tbhits 1518902 time 10018 pv d2d1q
info depth 35 seldepth 50 multipv 1 score mate 18 nodes 42810493 nps 3657141 hashfull 241 tbhits 1797828 time 11706 pv d2d1q c7c8n d6c5 e3f2 d1a4 f2e3 a4e8 c8b6 c5b6 e3d4 e8e7 d4d5 b6b5 d5d4 e7e6 a2a3 e6f5 d4e3 b5c4 a3a4 c4d5 a4a5 f5e4 e3f2 d5d4 a5a6 e4f4 f2g1 d4e3 a6a7 f4g4 g1f1 g4g3 a7a8q g3f2
...
info depth 38 seldepth 50 multipv 1 score mate 14 nodes 74681482 nps 3813197 hashfull 354 tbhits 2914349 time 19585 pv d2d1q c7c8n d6c5 e3f2 d1d7 f2e3 d7c8 a2a4 c8d7 a4a5 d7e7 e3d3 c5d5 a5a6 e7c7 d3e2 d5e4 e2d2 e4d4 d2e2 c7g3 a6a7 g3g2 e2e1 d4d3 a7a8r g2g1

My modification:

Code: Select all

diff --git a/src/search.cpp b/src/search.cpp
index 995204f..505df28 100644
--- a/src/search.cpp
+++ b/src/search.cpp
@@ -561,7 +561,7 @@ namespace &#123;
     Key posKey;
     Move ttMove, move, excludedMove, bestMove;
     Depth extension, newDepth, predictedDepth;
-    Value bestValue, value, ttValue, eval, nullValue;
+    Value bestValue, value, ttValue, eval, nullValue, maxValue;
     bool ttHit, inCheck, givesCheck, singularExtensionNode, improving;
     bool captureOrPromotion, doFullDepthSearch, moveCountPruning;
     Piece moved_piece;
@@ -572,6 +572,7 @@ namespace &#123;
     inCheck = pos.checkers&#40;);
     moveCount = quietCount =  ss->moveCount = 0;
     bestValue = -VALUE_INFINITE;
+    maxValue = VALUE_INFINITE;
     ss->ply = &#40;ss-1&#41;->ply + 1;
 
     // Check for the available remaining time
@@ -667,11 +668,29 @@ namespace &#123;
                        &#58; v >  drawScore ?  VALUE_MATE - MAX_PLY - ss->ply
                                         &#58;  VALUE_DRAW + 2 * v * drawScore;
 
-                tte->save&#40;posKey, value_to_tt&#40;value, ss->ply&#41;, BOUND_EXACT,
-                          std&#58;&#58;min&#40;DEPTH_MAX - ONE_PLY, depth + 6 * ONE_PLY&#41;,
-                          MOVE_NONE, VALUE_NONE, TT.generation&#40;));
-
-                return value;
+                if &#40;abs&#40;v&#41; <= drawScore&#41; &#123;
+                    tte->save&#40;posKey, value_to_tt&#40;value, ss->ply&#41;, BOUND_EXACT,
+                              depth, MOVE_NONE, VALUE_NONE, TT.generation&#40;));
+                    return value;
+                &#125;
+                if &#40;v < -drawScore&#41; &#123;
+                    if &#40;alpha >= value&#41; &#123;
+                        tte->save&#40;posKey, value_to_tt&#40;value, ss->ply&#41;,
+                                  BOUND_UPPER, depth, MOVE_NONE, VALUE_NONE,
+                                  TT.generation&#40;));
+                        return value;
+                    &#125;
+                    maxValue = value;
+                &#125;
+                else &#123;
+                    if &#40;beta <= value&#41; &#123;
+                        tte->save&#40;posKey, value_to_tt&#40;value, ss->ply&#41;,
+                                  BOUND_LOWER, depth, MOVE_NONE, VALUE_NONE,
+                                  TT.generation&#40;));
+                        return value;
+                    &#125;
+                    bestValue = value;
+                &#125;
             &#125;
         &#125;
     &#125;
@@ -1125,6 +1144,9 @@ moves_loop&#58; // When in check search starts from here
             &#40;ss-5&#41;->counterMoves->update&#40;pos.piece_on&#40;prevSq&#41;, prevSq, bonus&#41;;
     &#125;
 
+    if &#40;bestValue > maxValue&#41;
+        bestValue = maxValue;
+
     tte->save&#40;posKey, value_to_tt&#40;bestValue, ss->ply&#41;,
               bestValue >= beta ? BOUND_LOWER &#58;
               PvNode && bestMove ? BOUND_EXACT &#58; BOUND_UPPER,

So:
- probe as early as you can;
- if draw or outside (alpha,beta), take the TB cutoff;
- otherwise continue to search the subtree, but make sure the value returned is at least TBwin (bestValue = value) or at most -TBwin (maxValue = value) as the case may be.

It seems sufficient to test against maxValue only at the end of search().

syzygy · Post by **syzygy** » Sun Jun 12, 2016 6:00 pm

The modification above does not remove the "pos.rule50_count() == 0" probing condition.

If I remove it, the result is quite disastrous. The number of TB hits goes up considerably and total nodes kind of explodes. After 8 minutes (single-threaded, 512 MB) the search has only finished d18:

Code: Select all

info depth 18 seldepth 36 multipv 1 score cp 12351 nodes 2185124782 nps 4606362 hashfull 22 tbhits 330581211 time 474371 pv d2d1q c7c8n d6d7 e7e8b d7e8 e3f2

This is pretty useless. (And it surprises me that the effect is so huge.)

The reason for the sky-high node count is the fact that the node counter is increased in Position::do_move(). The probing code also calls do_move(), which inflates total node count and also nps. It may be more "correct" to increase node count in search() and qsearch()...

One obvious way to decrease the number of calls to probe_wdl() without maintaining the "rule50_count() == 0" condition is by making use of the fact that probe_wdl() will return a TB win if the parent node is known to be a TB loss. (A crude attempt at this indeed seems to improve matters.)

Keeping the rule50_count() == 0 condition seems a lot simpler.

Something else: It could be considered whether TB scores should be stored with increased depth and/or as something else than EXACT bounds. This depends on the TT replacement policy.

syzygy · Post by **syzygy** » Sun Jun 12, 2016 7:55 pm

My modification does well at the position from this thread (hash 512MB, 1 thread, 6-piece TBs):
[D]8/8/4p3/2bP4/8/8/p7/k2KN3 w - - 0 1

Code: Select all

info depth 27 seldepth 9 multipv 1 score cp 0 nodes 33311 nps 336474 tbhits 2311 time 99 pv d1c2 e6d5
info depth 28 seldepth 26 multipv 1 score cp 12347 nodes 224292 nps 1156144 tbhits 6833 time 194 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3g5 e1g2 g5e7 g2e3 e7f6 e3f1 e4e3 f1e3 f6g5
info depth 29 seldepth 28 multipv 1 score cp 12348 nodes 252954 nps 1144588 tbhits 9428 time 221 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3b6 e1g2 e4e3 g2e3 b6a5 c2c1 a5d2 c1d2 a1b1 d7d8q a2a1q
info depth 30 seldepth 33 multipv 1 score mate 12 nodes 566416 nps 1627632 tbhits 26980 time 348 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3b6 e1g2 b6a5 g2e3 a5b6 e3c4 e4e3 c4e3 b6a5 e3c4 a5d2 d7d8q d2e3 d8d1 e3c1 d1c1
...
info depth 37 seldepth 33 multipv 1 score mate 10 nodes 788348 nps 1922800 tbhits 37192 time 410 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3b6 e1g2 b6a5 g2e3 a5b6 e3f5 e4e3 d7d8b b6d8 f5d4 d8b6 d4b3

Without TBs the mate is found at d28, but it takes longer:

Code: Select all

info depth 27 seldepth 16 multipv 1 score cp 0 nodes 365516 nps 987881 tbhits 0 time 370 pv d1c2 e6d5 e1d3 c5a3 d3f4 d5d4 f4d3 a3f8 d3c1 d4d3 c1d3 f8a3 d3c1 a3c1 c2c1
info depth 28 seldepth 37 multipv 1 score mate 11 nodes 1970427 nps 2280586 tbhits 0 time 864 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3g5 e1g2 e4e3 g2e3 g5h4 c2c1 h4g5 d7d8q g5e3 c1c2 e3d2 d8h8 d2c3 h8c3
info depth 29 seldepth 37 multipv 1 score mate 11 nodes 1990360 nps 2293041 tbhits 0 time 868 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3g5 e1g2 e4e3 g2e3 g5h4 c2c1 h4g5 d7d8q g5e3 c1c2 e3d2 d8h8 d2c3 h8c3
info depth 30 seldepth 37 multipv 1 score mate 10 nodes 2051117 nps 2333466 tbhits 0 time 879 pv d1c1 c5e3 c1c2 e6e5 d5d6 e5e4 d6d7 e3g5 e1g2 e4e3 g2e3 g5h4 e3c4 h4e1 d7d8q e1d2 d8h8 d2c3 h8c3

Unfortunately it doesn't do as well on another position mentioned in this thread:
[D]8/4bk2/4br2/8/8/8/8/QRRK4 w - - 0 1
Without TBs SF reports a mate score from d25:

Code: Select all

info depth 25 seldepth 44 multipv 1 score mate 16 nodes 61908611 nps 3295114 hashfull 509 tbhits 0 time 18788 pv b1b7 e6g4 d1d2 f6d6 d2e3 d6e6 e3f4 g4e2 a1h8 e6f6 f4g3 f6g6 g3f2 g6e6 h8h7 f7f6 b7e7 e6e7 c1c6 e7e6 h7h6 f6e5 h6e6 e5d4 e6e2 d4d5 e2b5 d5d4 c6c4 d4d3 b5d5

With TBs a win score from depth 4 (which wins the game and so the rest is only cosmetic), but mate takes much longer:

Code: Select all

info depth 4 seldepth 6 multipv 1 score cp 12351 nodes 1042 nps 130250 tbhits 14 time 8 pv a1f6 f7e8 f6e6 e8f8 c1c8 f8g7
...
info depth 7 seldepth 10 multipv 1 score cp 12352 nodes 2376 nps 169714 tbhits 86 time 14 pv a1f6 e7f6
...
info depth 50 seldepth 53 multipv 1 score cp 12352 nodes 36908220 nps 2662546 hashfull 365 tbhits 2837752 time 13862 pv a1f6 e7f6
info depth 51 currmove a1f6 currmovenumber 1
info depth 51 seldepth 76 multipv 1 score mate 57 lowerbound nodes 1062299617 nps 3390624 hashfull 996 tbhits 48486416 time 313305 pv a1f6
info depth 51 currmove a1f6 currmovenumber 1
info depth 51 seldepth 77 multipv 1 score mate 48 lowerbound nodes 1457258916 nps 3482391 hashfull 999 tbhits 63632876 time 418465 pv a1f6
info depth 51 currmove a1f6 currmovenumber 1
info depth 51 seldepth 77 multipv 1 score mate 34 lowerbound nodes 3273503521 nps 3476001 hashfull 999 tbhits 141338407 time 941744 pv a1f6
info depth 51 currmove a1f6 currmovenumber 1

And it's still the queen sac....
Why is Rb7 or Rc7 not found earlier? Apparently they are getting reduced a lot because Qxf6+ (correcfly) is assigned a winning score?

hgm · Post by **hgm** » Sun Jun 12, 2016 8:32 pm

Because you get lots of checks before anything can be traded?

jhellis3 · Post by **jhellis3** » Sun Jun 12, 2016 8:33 pm

Keeping the rule50_count() == 0 condition seems a lot simpler.

Yes, it is the simplest, safest, and most efficient solution AFAICS.

Why is Rb7 or Rc7 not found earlier? Apparently they are getting reduced a lot because Qxf6+ (correcfly) is assigned a winning score?

Well not just that... It is because you are capping bestValue (combined with the way you score the nodes), which really isn't necessary. The TBs + rule50 guarantee you a lower-bound (for a win), and the only scores that can be higher is mate or a shorter DTZ TB win.

If you take a look at what I finally ended up doing, it is pretty similar to your approach. The main difference being I don't cap bestValue or even bother returning at all if not a draw.

https://github.com/official-stockfish/S ... 3:tb_fix_2

In practice, there is no real risk to this approach (unless you get extremely unlucky)... I suppose if one wanted to be absolutely certain, they could just always return for PvNodes.

syzygy · Post by **syzygy** » Sun Jun 12, 2016 9:07 pm

jhellis3 wrote:
Why is Rb7 or Rc7 not found earlier? Apparently they are getting reduced a lot because Qxf6+ (correcfly) is assigned a winning score?
Well not just that... It is because you are capping bestValue (combined with the way you score the nodes), which really isn't necessary. The TBs + rule50 guarantee you a lower-bound (for a win), and the only scores that can be higher is mate or a shorter DTZ TB win.

Unless I have somehow introduced something funny, it seems to me that SF apparently simply can't find a mate score for Rb7 and Rc7 (until very very late) once some other move is getting a very high score.

In this case Qxf6 is achieving a very high score from early on, because the TBs quickly show that it is winning. Somehow keeping the score for Qxf6 lower than the engine can see it should be might help SF find a mate for Rc1 or Rb1, but that is an ugly hack in my eyes...

The maxValue cap on bestValue should play a role only if capturing into the TBs results in a TB loss.
But you are probably referring to setting bestValue to TB win in case beta > TBwin. Maybe things will usually go fine without that, but I'd like to understand better why it delays finding mates in other lines.

Yay, SF finally found the Rb7 mate:

Code: Select all

info depth 51 seldepth 77 multipv 1 score mate 19 nodes 8766971843 nps 3428901 hashfull 999 tbhits 349049621 time 2556787 pv a1f6 f7f6 c1c6 f6f7 d1e2 e6d5 c6c7 f7e6 b1b6 e7d6 c7h7 d5c4 e2e3 e6d5 h7d7 d5c5 d7d6 c4b5 d6h6 b5a4 e3d3 a4e8 b6f6 c5b4 f6f4 b4a5 f4f5 a5b4 h6b6 b4a4 d3c3 a4a3 f5a5 e8a4 b6b4 a3a2 a5a4
info depth 52 seldepth 77 multipv 1 score mate 14 nodes 16408364348 nps 3397462 hashfull 999 tbhits 603973755 time 4829594 pv b1b7 e6g4 d1d2 f6d6 d2e3 d6e6 e3f4 g4h3 c1h1 e6f6 a1f6 f7f6 h1h3 e7f8 f4e4 f6e6 h3h7 e6d6 b7f7 d6c5 f7f8 c5c4 f8b8 c4c3 h7h2 c3c4 h2c2

syzygy · Post by **syzygy** » Sun Jun 12, 2016 9:22 pm

In this position, SF starts with these scores:

Code: Select all

info depth 1 seldepth 1 multipv 1 score cp 869 nodes 78 nps 26000 tbhits 0 time 3 pv b1b7
info depth 2 seldepth 3 multipv 1 score cp 891 nodes 197 nps 65666 tbhits 0 time 3 pv b1b7 f6f1 d1c2
info depth 3 seldepth 6 multipv 1 score cp 1945 nodes 794 nps 132333 tbhits 3 time 6 pv a1f6 f7e8 f6e6
info depth 4 seldepth 6 multipv 1 score cp 12351 nodes 1042 nps 130250 tbhits 14 time 8 pv a1f6 f7e8 f6e6 e8f8 c1c8 f8g7

So without any search, Rb7 already scores +8.69. The search then quickly finds that Qxf6 is even better.

Now, one might dislike a move like Qxf6, but the fact is that (thanks to TBs) it allows the engine to see very quickly that this position is a certain win.

Now I remove the "bestValue = value" line and see what we get:

Code: Select all

info depth 1 seldepth 1 multipv 1 score cp 869 nodes 78 nps 39000 tbhits 0 time 2 pv b1b7
info depth 2 seldepth 3 multipv 1 score cp 891 nodes 197 nps 65666 tbhits 0 time 3 pv b1b7 f6f1 d1c2
info depth 3 seldepth 6 multipv 1 score cp 892 nodes 743 nps 148600 tbhits 0 time 5 pv a1a7 f6f1 d1d2
info depth 4 seldepth 6 multipv 1 score cp 880 nodes 1159 nps 165571 tbhits 0 time 7 pv b1b7 e6g4 d1d2 f6f2 d2e3 f2e2 e3d3
info depth 5 seldepth 8 multipv 1 score cp 893 nodes 2748 nps 196285 tbhits 0 time 14 pv d1d2 f6f5 c1g1 e7g5 d2d3 f5f3 d3e4
info depth 6 seldepth 10 multipv 1 score cp 878 nodes 5944 nps 228615 tbhits 0 time 26 pv d1d2 f6f5 c1g1 e7g5 d2c2 f5c5 c2d1 e6d7
info depth 7 seldepth 11 multipv 1 score cp 945 nodes 16004 nps 280771 tbhits 0 time 57 pv b1b7 e6g4 d1d2 f6f2 d2e3 f2f3 e3e4 f3f6 c1f1 f6f1 a1f1 f7e8
info depth 8 seldepth 12 multipv 1 score cp 967 nodes 24520 nps 314358 tbhits 0 time 78 pv b1b7 e6g4 d1d2 f6f2 d2e3 f2f3 e3e4 f3f6 c1f1 f6f1 a1f1 f7e6

Sure, the eval Rb7 steadily rises, but now the engine is no longer aware that the engine is a certain win.

For a long time Rb7's score bounces up and down between +18 ("upperbound") and +43 ("lowerbound"). Finally at d25:

Code: Select all

info depth 25 seldepth 44 multipv 1 score mate 16 nodes 61908611 nps 3289686 hashfull 509 tbhits 0 time 18819 pv b1b7 e6g4 d1d2 f6d6 d2e3 d6e6 e3f4 g4e2 a1h8 e6f6 f4g3 f6g6 g3f2 g6e6 h8h7 f7f6 b7e7 e6e7 c1c6 e7e6 h7h6 f6e5 h6e6 e5d4 e6e2 d4d5 e2b5 d5d4 c6c4 d4d3 b5d5

It works (because this position happens to have a mate that SF can find relatively quickly), but I don't find this very satisfactory. And I'm not at all convinced that in other positions this does not screw up the win...

jhellis3 · Post by **jhellis3** » Sun Jun 12, 2016 10:05 pm

Sure, the eval Rb7 steadily rises, but now the engine is no longer aware that the engine is a certain win.

This is solved by always returning for PvNodes. But, as I mentioned earlier, one would have to be extremely unlucky for this to ever actually matter. Only a single TB win hit and no other non-TB winning paths... And you don't complete the iteration...

syzygy · Post by **syzygy** » Sun Jun 12, 2016 10:08 pm

jhellis3 wrote:
Sure, the eval Rb7 steadily rises, but now the engine is no longer aware that the engine is a certain win.
This is solved by always returning for PvNodes. But, as I mentioned earlier, one would have to be extremely unlucky for this to ever actually matter. Only a single TB win hit and no other non-TB winning paths...

If you're guaranteed of non-TB winning paths, there is no point in using TBs.

Anyway, there should be a better solution. It seems to me that some reduction decisions could be done a bit more cleverly and perhaps also "more correctly". Where it really matters (scores around 0) SF is probably already doing the right thing, but I don't see why one very big score should reduce away other promising paths.

jhellis3 · Post by **jhellis3** » Sun Jun 12, 2016 10:10 pm

If you're guaranteed of non-TB winning paths, there is no point in using TBs.

Mate with more than 6 pieces on the board?

I mean I guess you could shut off TB access in positions where value is greater than known win.... oh wait....

Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB

Re: Natural TB