Incremental or non-incremental PST evaluation calcs

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Incremental or non-incremental PST evaluation calcs

Post by bob »

Sven Schüle wrote:
bob wrote:
hgm wrote:
bob wrote:With today's programs, the trees are deep and narrow, which is exactly the circumstance that doesn't favor incremental calculations. The old-style wide/bushy trees were quite good for incremental update, but I don't think it is nearly as clear an advantage today...

Worth actually quantifying at some point...
They are not that narrow: a branching ratio of 2 only doubles the incremental work per end leaf from 1 to 2 updates, as all earlier updates are shared between 2, 4, 8... nodes. And that is only infinitesimally smaller than double for a tree that is not infinitely deep, so depth has certainly nothing to do with it.

Only when the EBF is smaller than 1.125 would you get a substantial difference in the workload of incremental updating in an 8-ply tree...
Right. But I am talking about 24-30 ply searches in typical CCT-like tournaments...
HGM's point was this:

- You are at a node X where you call eval() (which is somewhere within QS most of the time). The effort for the incremental PST score update done during the previous move (the move made from the node 1 ply above X) counts 100% towards that eval call.

- Assuming EBF=2 (actually it is smaller in QS, which makes HGM's point somewhat weaker), the PST update done during the next-to-previous move (2 plies above X) counts only 50% towards eval() at X, since on average that update is shared with a sibling of X.

- The PST update done 3 plies above X counts 25%, and so on.

In total this is ~200% of a single update for EBF=2 (300% for EBF=1.5, 500% for EBF=1.25; is such an EBF possible in the region of only QS and "low-depth" nodes?), and it is independent of the actual path length from root to X because of the "sharing".

Sven
Here is what my conjecture is based on. In a "wide" search you do an incremental update, and at the leaves you use that value a bunch of times, as you successively make each move and then do a static eval (assuming no captures, for simplicity). In a deeper/narrower search you use that last incremental value fewer times. But you are also searching deeper, so you did more of those updates to get to that final eval call.

Whether this is significant or not, I don't know. Another consideration is the usual "procrastination" idea, where you want to wait until the last minute to do something, because thanks to pruning you might not have to do it at all...

I did a lot of that stuff in Cray Blitz. And I did it a lot in early Crafty versions, but it is much cleaner to have the eval stuff in Evaluate() rather than scattered around in Make/Unmake and such... The only thing I do incrementally is material, because I use that inside the search tree to make decisions.
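To put a number on the "sharing" argument above, here is a minimal sketch (hypothetical code, not from Crafty or any other engine mentioned here) that simply sums Sven's geometric series: each eval() is charged one full update for the move just made, 1/EBF of the update two plies up, 1/EBF^2 of the one above that, and so on.

Code:

#include <stdio.h>

/* Amortized incremental-update cost per eval() call: the update made k plies
 * above the leaf is shared by roughly EBF^(k-1) leaves, so each leaf is
 * charged 1/EBF^(k-1) of it.  The series 1 + 1/EBF + 1/EBF^2 + ... sums to
 * EBF/(EBF-1) for an infinitely deep tree. */
static double updates_per_eval(double ebf) {
    return ebf / (ebf - 1.0);
}

int main(void) {
    const double ebf[] = { 2.0, 1.5, 1.25, 1.125 };
    for (int i = 0; i < 4; i++)
        printf("EBF %.3f -> %.1f updates charged per eval\n",
               ebf[i], updates_per_eval(ebf[i]));
    return 0;   /* prints 2.0, 3.0, 5.0 and 9.0: the 200%/300%/500% figures above */
}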
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Incremental or non-incremental PST evaluation calcs

Post by Don »

bob wrote:
Sven Schüle wrote:
bob wrote:
hgm wrote:
bob wrote:With today's programs, the trees are deep and narrow, which is exactly the circumstance that doesn't favor incremental calculations. The old-style wide/bushy trees were quite good for incremental update, but I don't think it is nearly as clear an advantage today...

Worth actually quantifying at some point...
They are not that narrow: a branching ratio of 2 only doubles the incremental work per end leaf from 1 to 2 updates, as all earlier updates are shared between 2, 4, 8... nodes. And that is only infinitesimally smaller than double for a tree that is not infinitely deep, so depth has certainly nothing to do with it.

Only when the EBF is smaller than 1.125 would you get a substantial difference in the workload of incremental updating in an 8-ply tree...
Right. But I am talking about 24-30 ply searches in typical CCT-like tournaments...
HGM's point was this:

- You are at a node X where you call eval() (which is somewhere within QS most of the time). The effort for the incremental PST score update done during the previous move (the move made from the node 1 ply above X) counts 100% towards that eval call.

- Assuming EBF=2 (actually it is smaller in QS, which makes HGM's point somewhat weaker), the PST update done during the next-to-previous move (2 plies above X) counts only 50% towards eval() at X, since on average that update is shared with a sibling of X.

- The PST update done 3 plies above X counts 25%, and so on.

In total this is ~200% of a single update for EBF=2 (300% for EBF=1.5, 500% for EBF=1.25; is such an EBF possible in the region of only QS and "low-depth" nodes?), and it is independent of the actual path length from root to X because of the "sharing".

Sven
Here is what my conjecture is based on. In a "wide" search you do an incremental update, and at the leaves you use that value a bunch of times, as you successively make each move and then do a static eval (assuming no captures, for simplicity). In a deeper/narrower search you use that last incremental value fewer times. But you are also searching deeper, so you did more of those updates to get to that final eval call.

Whether this is significant or not, I don't know. Another consideration is the usual "procrastination" idea, where you want to wait until the last minute to do something, because thanks to pruning you might not have to do it at all...

I did a lot of that stuff in Cray Blitz. And I did it a lot in early Crafty versions, but it is much cleaner to have the eval stuff in Evaluate() rather than scattered around in Make/Unmake and such... The only thing I do incrementally is material, because I use that inside the search tree to make decisions.
Komodo's evaluation is nearly completely isolated. I could basically plug in a different eval.c module and have a completely different evaluation. The only thing that breaks this nice separation is the piece-square tables, but even those are populated by an initialization routine inside eval.c.

Do the incrementally updated piece-square tables buy us anything other than some tiny bit of speed? The intent was that it would be quite nice to always have a score estimate ready, even if quite crude, but the reality is that we have found no good use for it.
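As an illustration of that kind of separation (a hypothetical sketch only; the names are made up and this is not Komodo's actual source), the piece-square tables can live as static data inside the eval module and be filled by its own init routine, so the search never sees them:

Code:

/* eval.c (excerpt) -- hypothetical layout.  The only symbols the search links
 * against are eval_init() and evaluate(); the piece-square tables themselves
 * never leave this file. */
enum { COLORS = 2, PIECES = 6, SQUARES = 64 };

static int pst[COLORS][PIECES][SQUARES];      /* private to the eval module */

void eval_init(void)
{
    for (int c = 0; c < COLORS; c++)
        for (int p = 0; p < PIECES; p++)
            for (int s = 0; s < SQUARES; s++)
                pst[c][p][s] = 0;             /* real code would add base values,
                                                 centralization bonuses, etc. */
}

Swapping in a different eval.c (with its own eval_init() and evaluate()) would then swap the entire evaluation, which is exactly the property described above.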
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Incremental or non-incremental PST evaluation calcs

Post by jwes »

bob wrote:
Sven Schüle wrote:
bob wrote:
hgm wrote:
bob wrote:With today's programs, the trees are deep and narrow, which is exactly the circumstance that doesn't favor incremental calculations. The old-style wide/bushy trees were quite good for incremental update, but I don't think it is nearly as clear an advantage today...

Worth actually quantifying at some point...
They are not that narrow: a branching ratio of 2 only doubles the incremental work per end leaf from 1 to 2 updates, as all earlier updates are shared between 2, 4, 8... nodes. And that is only infinitesimally smaller than double for a tree that is not infinitely deep, so depth has certainly nothing to do with it.

Only when the EBF is smaller than 1.125 would you get a substantial difference in the workload of incremental updating in an 8-ply tree...
Right. But I am talking about 24-30 ply searches in typical CCT-like tournaments...
HGM's point was this:

- You are at a node X where you call eval() (which is somewhere within QS most of the time). The effort for the incremental PST score update done during the previous move (the move made from the node 1 ply above X) counts 100% towards that eval call.

- Assuming EBF=2 (actually it is smaller in QS, which makes HGM's point somewhat weaker), the PST update done during the next-to-previous move (2 plies above X) counts only 50% towards eval() at X, since on average that update is shared with a sibling of X.

- The PST update done 3 plies above X counts 25%, and so on.

In total this is ~200% of a single update for EBF=2 (300% for EBF=1.5, 500% for EBF=1.25; is such an EBF possible in the region of only QS and "low-depth" nodes?), and it is independent of the actual path length from root to X because of the "sharing".

Sven
Here is what my conjecture is based on. In a "wide" search you do an incremental update, and at the leaves you use that value a bunch of times, as you successively make each move and then do a static eval (assuming no captures, for simplicity). In a deeper/narrower search you use that last incremental value fewer times. But you are also searching deeper, so you did more of those updates to get to that final eval call.

Whether this is significant or not, I don't know. Another consideration is the usual "procrastination" idea, where you want to wait until the last minute to do something, because thanks to pruning you might not have to do it at all...

I did a lot of that stuff in Cray Blitz. And I did it a lot in early Crafty versions, but it is much cleaner to have the eval stuff in Evaluate() rather than scattered around in Make/Unmake and such... The only thing I do incrementally is material, because I use that inside the search tree to make decisions.
I ran a quick test where I added a counter to Crafty which I incremented in MakeMove() and reset to 0 in Evaluate(). I used it to calculate the average number of times MakeMove() is called between calls to Evaluate(). For searches of 0.1 sec it was 1.826; for searches of 30 sec it was 2.003.
This suggests that incremental PST updating may well be faster than non-incremental.
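A sketch of that instrumentation (hypothetical helper names; the real change was a counter added to Crafty's own MakeMove()/Evaluate()):

Code:

/* Count how many MakeMove() calls happen, on average, between calls to
 * Evaluate().  Hypothetical names, shown for illustration only. */
static long moves_since_eval = 0;
static long total_moves = 0, total_evals = 0;

void count_make_move(void)        /* call at the top of MakeMove() */
{
    moves_since_eval++;
}

void count_evaluate(void)         /* call at the top of Evaluate() */
{
    total_moves += moves_since_eval;
    total_evals++;
    moves_since_eval = 0;
}

double moves_per_eval(void)       /* ~1.83 at 0.1 s and ~2.00 at 30 s in the test above */
{
    return total_evals ? (double)total_moves / (double)total_evals : 0.0;
}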
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Incremental or non-incremental PST evaluation calcs

Post by bob »

jwes wrote:
bob wrote:
Sven Schüle wrote:
bob wrote:
hgm wrote:
bob wrote:With today's programs, the trees are deep and narrow, which is exactly the circumstance that doesn't favor incremental calculations. The old-style wide/bushy trees were quite good for incremental update, but I don't think it is nearly as clear an advantage today...

Worth actually quantifying at some point...
They are not that narrow: a branching ratio of 2 only doubles the incremental work per end leaf from 1 to 2 updates, as all earlier updates are shared between 2, 4, 8... nodes. And that is only infinitesimally smaller than double for a tree that is not infinitely deep, so depth has certainly nothing to do with it.

Only when the EBF is smaller than 1.125 would you get a substantial difference in the workload of incremental updating in an 8-ply tree...
Right. But I am talking about 24-30 ply searches in typical CCT-like tournaments...
HGM's point was this:

- You are at a node X where you call eval() (which is somewhere within QS most of the time). The effort for the incremental PST score update done during the previous move (the move made from the node 1 ply above X) counts 100% towards that eval call.

- Assuming EBF=2 (actually it is smaller in QS, which makes HGM's point somewhat weaker), the PST update done during the next-to-previous move (2 plies above X) counts only 50% towards eval() at X, since on average that update is shared with a sibling of X.

- The PST update done 3 plies above X counts 25%, and so on.

In total this is ~200% of a single update for EBF=2 (300% for EBF=1.5, 500% for EBF=1.25; is such an EBF possible in the region of only QS and "low-depth" nodes?), and it is independent of the actual path length from root to X because of the "sharing".

Sven
Here is what my conjecture is based on. In a "wide" search you do an incremental update, and at the leaves you use that value a bunch of times, as you successively make each move and then do a static eval (assuming no captures, for simplicity). In a deeper/narrower search you use that last incremental value fewer times. But you are also searching deeper, so you did more of those updates to get to that final eval call.

Whether this is significant or not, I don't know. Another consideration is the usual "procrastination" idea, where you want to wait until the last minute to do something, because thanks to pruning you might not have to do it at all...

I did a lot of that stuff in Cray Blitz. And I did it a lot in early Crafty versions, but it is much cleaner to have the eval stuff in Evaluate() rather than scattered around in Make/Unmake and such... The only thing I do incrementally is material, because I use that inside the search tree to make decisions.
I ran a quick test where I added a counter to crafty which I incremented in makemove and reset to 0 in evaluate. I used this to calculate the average number of times makemove is called between calls to evaluate. For searches of .1 sec, it was 1.826, for searches of 30 sec, it was 2.003.
This suggests that incremental PSTs may well be faster than non-incremental.
That's good data, although it would STILL depend on what you do incrementally. For example, Slate's incremental updates to the attacks_from bitboards are murderously expensive; I found it faster to compute them from scratch when they are needed. In MakeMove() you end up updating everything every time you call it, but eval takes a lot of lazy (early) exits that might not use that information. So it is still not completely clear to me that incremental updating will win.
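The lazy-exit point is easy to picture: if the cheap terms already put the score far outside the window, the expensive terms (the ones an eager update in MakeMove() would have kept current) are never even looked at. A rough sketch, with hypothetical names and an illustrative margin:

Code:

/* Lazy-evaluation sketch (hypothetical names and margin).  Whatever the
 * make/unmake code computed eagerly for the expensive part below is wasted
 * whenever one of the early exits fires. */
#define LAZY_MARGIN 300   /* centipawns, illustrative only */

struct Position;                                                 /* defined elsewhere */
extern int material_score(const struct Position *p);             /* kept incrementally */
extern int pst_score(const struct Position *p);                  /* cheap */
extern int king_safety_and_mobility(const struct Position *p);   /* expensive */

int evaluate(const struct Position *p, int alpha, int beta)
{
    int score = material_score(p) + pst_score(p);

    /* Early (lazy) exits: the expensive terms cannot bring the score back
     * inside the window, so skip them entirely. */
    if (score - LAZY_MARGIN >= beta || score + LAZY_MARGIN <= alpha)
        return score;

    return score + king_safety_and_mobility(p);
}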
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Incremental or non-incremental PST evaluation calcs

Post by sje »

bob wrote:That's good data, although it would STILL depend on what you do incrementally. For example, Slate's incremental updates to the attacks_from bitboards are murderously expensive; I found it faster to compute them from scratch when they are needed.
Perhaps you meant to say attacks_to instead of attacks_from. The latter is of course required for fast move generation, while the former can be computed as needed if the need is not too frequent; otherwise an incremental update is faster.
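Computing attacks_to on demand is usually done with the "super-piece" trick: place each piece type on the target square and intersect its attack set with the pieces of that type. A sketch, where the attack helpers stand in for whatever magic/rotated-bitboard routines the engine already has for its attacks_from move generation (all names hypothetical):

Code:

typedef unsigned long long Bitboard;

enum { WHITE, BLACK };
enum { PAWN, KNIGHT, BISHOP, ROOK, QUEEN, KING };

/* Attack sets assumed to already exist for move generation. */
extern Bitboard pawn_attacks[2][64];     /* squares a pawn of [color] on [sq] attacks */
extern Bitboard knight_attacks[64], king_attacks[64];
extern Bitboard bishop_attacks(int sq, Bitboard occupied);
extern Bitboard rook_attacks(int sq, Bitboard occupied);

/* All pieces of either color attacking 'sq', computed from scratch. */
Bitboard attacks_to(int sq, Bitboard piece[2][6], Bitboard occupied)
{
    Bitboard diag = piece[WHITE][BISHOP] | piece[BLACK][BISHOP]
                  | piece[WHITE][QUEEN]  | piece[BLACK][QUEEN];
    Bitboard orth = piece[WHITE][ROOK]   | piece[BLACK][ROOK]
                  | piece[WHITE][QUEEN]  | piece[BLACK][QUEEN];

    return (pawn_attacks[WHITE][sq] & piece[BLACK][PAWN])   /* black pawns hitting sq */
         | (pawn_attacks[BLACK][sq] & piece[WHITE][PAWN])   /* white pawns hitting sq */
         | (knight_attacks[sq] & (piece[WHITE][KNIGHT] | piece[BLACK][KNIGHT]))
         | (king_attacks[sq]   & (piece[WHITE][KING]   | piece[BLACK][KING]))
         | (bishop_attacks(sq, occupied) & diag)
         | (rook_attacks(sq, occupied)   & orth);
}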
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Incremental or non-incremental PST evaluation calcs

Post by Evert »

When I added incremental attack maps to Sjaak (mainly for more efficient attack tests), it came out equally fast as calculating them when needed.
So I took them out again, because that simplifies the code...
hgm
Posts: 27811
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Incremental or non-incremental PST evaluation calcs

Post by hgm »

bob wrote:That's good data, although it would STILL depend on what you do incrementally. For example, Slate's incremental updates to the attacks_from bitboards are murderously expensive; I found it faster to compute them from scratch when they are needed. In MakeMove() you end up updating everything every time you call it, but eval takes a lot of lazy (early) exits that might not use that information. So it is still not completely clear to me that incremental updating will win.
Indeed, some data is more difficult to track incrementally than other data. But PST scores (as well as the material index and the hash keys) are amongst the trivial ones, or at least some terms in them are. Castling rights are already too complex to make incremental update of the hash key for them worthwhile.

Attack maps are amongst the worst. The only way to make incremental updating of those competitive is to take real care to postpone the update until you are sure you need it. In Spartacus I have used attack maps for the generation of captures in QS (staged in MVV order). For that you only need to update the opponent's attacks in MakeMove, and those are a lot less affected by the move than your own attacks (especially if the move is a capture, which it commonly is). So you can use the partially updated map to quickly test whether there are non-futile captures that are going to be searched. (You won't get a stand-pat cutoff from lazy eval, or you would not even have made the last move.) Only if there are non-futile captures (meaning you will search on rather than take the immediate fail low) is the attack map updated for the moves of the side that made the last move.

I have currently disabled this again, because the existing code could not handle mixed slider / leaper compounds.
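A rough structural sketch of that staging (hypothetical names, loosely after the description above, not Spartacus's actual code): MakeMove() refreshes only the map of the side now to move, and the mover's own map is refreshed only when the node is not an immediate fail low.

Code:

struct Position;    /* engine-specific; all functions below are hypothetical */

extern void update_attacks_of_side_to_move(struct Position *p, int move); /* cheap */
extern void update_attacks_of_mover(struct Position *p, int move);        /* expensive */
extern int  has_non_futile_capture(const struct Position *p, int alpha);
extern int  search_captures_mvv(struct Position *p, int alpha, int beta);

int qsearch_after_make(struct Position *p, int move, int alpha, int beta)
{
    /* The previous move changes the other side's attacks relatively little,
     * so bringing that map up to date is the cheap part. */
    update_attacks_of_side_to_move(p, move);

    /* If no capture can lift the (lazy) score above alpha, fail low here;
     * the expensive refresh of the mover's own map is never paid for. */
    if (!has_non_futile_capture(p, alpha))
        return alpha;                      /* fail-hard sketch */

    update_attacks_of_mover(p, move);      /* only now pay the full price */
    return search_captures_mvv(p, alpha, beta);
}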
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Incremental or non-incremental PST evaluation calcs

Post by bob »

sje wrote:
bob wrote:That's good data, although it would STILL depend on what you do incrementally. For example, Slate's incremental updates to the attacks_from bitboards are murderously expensive; I found it faster to compute them from scratch when they are needed.
Perhaps you meant to say attacks_to instead of attacks_from. The latter is of course required for fast move generation, while the former can be computed as needed if the need is not too frequent; otherwise an incremental update is faster.
Actually I said what I meant. :) When I wrote the first versions, I used the opposite terminology: attacks_from was actually the set of attacks from the current square, not Slate's "where is this square attacked from?"

But you got the idea, anyway...
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: Incremental or non-incremental PST evaluation calcs

Post by sje »

atkfs = attacks from a square
atkts = attacks to a square

Code:

[] dfen
FEN: rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
[] dbbdb
merge     [a1 b1 c1 d1 e1 f1 g1 h1 a2 b2 c2 d2 e2 f2 g2 h2 a7 b7 c7 d7 e7 f7 g7 h7 a8 b8 c8 d8 e8 f8 g8 h8]
sweep     [a1 c1 d1 f1 h1 a8 c8 d8 f8 h8]
locbc[w]  [a1 b1 c1 d1 e1 f1 g1 h1 a2 b2 c2 d2 e2 f2 g2 h2]
locbc[b]  [a7 b7 c7 d7 e7 f7 g7 h7 a8 b8 c8 d8 e8 f8 g8 h8]
locbm[wP] [a2 b2 c2 d2 e2 f2 g2 h2]
locbm[wN] [b1 g1]
locbm[wB] [c1 f1]
locbm[wR] [a1 h1]
locbm[wQ] [d1]
locbm[wK] [e1]
locbm[bP] [a7 b7 c7 d7 e7 f7 g7 h7]
locbm[bN] [b8 g8]
locbm[bB] [c8 f8]
locbm[bR] [a8 h8]
locbm[bQ] [d8]
locbm[bK] [e8]
atkbc[w]  [b1 c1 d1 e1 f1 g1 a2 b2 c2 d2 e2 f2 g2 h2 a3 b3 c3 d3 e3 f3 g3 h3]
atkbc[b]  [a6 b6 c6 d6 e6 f6 g6 h6 a7 b7 c7 d7 e7 f7 g7 h7 b8 c8 d8 e8 f8 g8]
atkfs[a1] [b1 a2]
atkfs[b1] [d2 a3 c3]
atkfs[c1] [b2 d2]
atkfs[d1] [c1 e1 c2 d2 e2]
atkfs[e1] [d1 f1 d2 e2 f2]
atkfs[f1] [e2 g2]
atkfs[g1] [e2 f3 h3]
atkfs[h1] [g1 h2]
atkfs[a2] [b3]
atkfs[b2] [a3 c3]
atkfs[c2] [b3 d3]
atkfs[d2] [c3 e3]
atkfs[e2] [d3 f3]
atkfs[f2] [e3 g3]
atkfs[g2] [f3 h3]
atkfs[h2] [g3]
atkfs[a3] []
atkfs[b3] []
atkfs[c3] []
atkfs[d3] []
atkfs[e3] []
atkfs[f3] []
atkfs[g3] []
atkfs[h3] []
atkfs[a4] []
atkfs[b4] []
atkfs[c4] []
atkfs[d4] []
atkfs[e4] []
atkfs[f4] []
atkfs[g4] []
atkfs[h4] []
atkfs[a5] []
atkfs[b5] []
atkfs[c5] []
atkfs[d5] []
atkfs[e5] []
atkfs[f5] []
atkfs[g5] []
atkfs[h5] []
atkfs[a6] []
atkfs[b6] []
atkfs[c6] []
atkfs[d6] []
atkfs[e6] []
atkfs[f6] []
atkfs[g6] []
atkfs[h6] []
atkfs[a7] [b6]
atkfs[b7] [a6 c6]
atkfs[c7] [b6 d6]
atkfs[d7] [c6 e6]
atkfs[e7] [d6 f6]
atkfs[f7] [e6 g6]
atkfs[g7] [f6 h6]
atkfs[h7] [g6]
atkfs[a8] [a7 b8]
atkfs[b8] [a6 c6 d7]
atkfs[c8] [b7 d7]
atkfs[d8] [c7 d7 e7 c8 e8]
atkfs[e8] [d7 e7 f7 d8 f8]
atkfs[f8] [e7 g7]
atkfs[g8] [f6 h6 e7]
atkfs[h8] [h7 g8]
atkts[a1] []
atkts[b1] [a1]
atkts[c1] [d1]
atkts[d1] [e1]
atkts[e1] [d1]
atkts[f1] [e1]
atkts[g1] [h1]
atkts[h1] []
atkts[a2] [a1]
atkts[b2] [c1]
atkts[c2] [d1]
atkts[d2] [b1 c1 d1 e1]
atkts[e2] [d1 e1 f1 g1]
atkts[f2] [e1]
atkts[g2] [f1]
atkts[h2] [h1]
atkts[a3] [b1 b2]
atkts[b3] [a2 c2]
atkts[c3] [b1 b2 d2]
atkts[d3] [c2 e2]
atkts[e3] [d2 f2]
atkts[f3] [g1 e2 g2]
atkts[g3] [f2 h2]
atkts[h3] [g1 g2]
atkts[a4] []
atkts[b4] []
atkts[c4] []
atkts[d4] []
atkts[e4] []
atkts[f4] []
atkts[g4] []
atkts[h4] []
atkts[a5] []
atkts[b5] []
atkts[c5] []
atkts[d5] []
atkts[e5] []
atkts[f5] []
atkts[g5] []
atkts[h5] []
atkts[a6] [b7 b8]
atkts[b6] [a7 c7]
atkts[c6] [b7 d7 b8]
atkts[d6] [c7 e7]
atkts[e6] [d7 f7]
atkts[f6] [e7 g7 g8]
atkts[g6] [f7 h7]
atkts[h6] [g7 g8]
atkts[a7] [a8]
atkts[b7] [c8]
atkts[c7] [d8]
atkts[d7] [b8 c8 d8 e8]
atkts[e7] [d8 e8 f8 g8]
atkts[f7] [e8]
atkts[g7] [f8]
atkts[h7] [h8]
atkts[a8] []
atkts[b8] [a8]
atkts[c8] [d8]
atkts[d8] [e8]
atkts[e8] [d8]
atkts[f8] [e8]
atkts[g8] [h8]
atkts[h8] []
Aleks Peshkov
Posts: 892
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia

Re: Incremental or non-incremental PST evaluation calcs

Post by Aleks Peshkov »

Incremental PST updating is the cleaner and easier solution, because you have to do a similar incremental update of the Zobrist keys on every move anyway. It is even possible to do the PST and hash key updates together inside the same SSE register (practically for free).
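For illustration, the bookkeeping might look like the plain-C sketch below (hypothetical names; the SSE variant would keep the key and the packed score side by side in one 128-bit register, and the usual mg/eg packing needs some care with sign borrow). The same (color, piece, square) indices drive both updates, so they naturally sit next to each other in MakeMove():

Code:

#include <stdint.h>

/* Incremental Zobrist + PST update sketch (hypothetical names).  pst_packed[]
 * holds mg/eg values packed into one 64-bit word, with black entries stored
 * negated so the running total is always white minus black. */
extern uint64_t zobrist[2][6][64];      /* random keys, filled at startup */
extern int64_t  pst_packed[2][6][64];   /* packed, signed PST contributions */

struct EvalState {
    uint64_t key;   /* Zobrist hash of the position */
    int64_t  pst;   /* packed running PST total */
};

static void move_piece(struct EvalState *st, int color, int piece, int from, int to)
{
    st->key ^= zobrist[color][piece][from] ^ zobrist[color][piece][to];
    st->pst += pst_packed[color][piece][to] - pst_packed[color][piece][from];
}

static void capture_piece(struct EvalState *st, int color, int piece, int sq)
{
    st->key ^= zobrist[color][piece][sq];
    st->pst -= pst_packed[color][piece][sq];
}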