Hashing and mate scores

hgm · Post by **hgm** » Mon Oct 01, 2007 11:25 pm

Gerd Isenberg wrote:HG - probably using one recursive search-routine for both search and qsearch - needs the additional conditions always - even if cheap and almost 100% predictable none taken.

Indeed, I use the same routine for QS and main search in both uMax and Joker. They would be so much alike, that the few places in which they differ are most efficiently handled by a few if-statements (like when deciding to generate all moves or just captures).

Such ifs actually save time, as 'unrolling' the code to eliminate them, duplicating the 99.9% that the two routines have in common, puts an enormous extra burden on the instruction cache. The resulting L1 misses take orders of magnitude more time than the 3 or 4 ifs in strategic places.

Of course if I use that first cutoff test also in QS for stand-pat cutoff, it no longer will be 100% predictable.

Gerd Isenberg · Post by **Gerd Isenberg** » Tue Oct 02, 2007 1:20 am

hgm wrote: Indeed, I use the same routine for QS and main search in both uMax and Joker. They would be so much alike, that the few places in which they differ are most efficiently handled by a few if-statements (like when deciding to generate all moves or just captures).

Such ifs actually save time, as 'unrolling' the code to eliminate them, duplicating the 99.9% that the two routines have in common, puts an enormous extra burden on the instruction cache. The resulting L1 misses take orders of magnitude more time than the 3 or 4 ifs in strategic places.

Ok, that might be all true for uMax with an "inlined" evaluation. Anyway your 99% figure seems a bit optimistic to me. Other programs with more code may differ - depending on their conditional stuff in mainsearch if depth == 1,2,3,4..., like null-move, smp-tree-splitting stuff, extension and pruning-decisions, re-searches, IID, multi-cut etc. I guess in most programs search-code needed for depth <= 0 is more in a 50 percent code size range of the main search. It might therfor have a better chance to "survive" in L1 after calling huge eval. Qsearch is still likely > 50% of _all_ nodes, in some positions up to 70% or even more considering not probing hash. It has likely less local variables as well as actual parameters, less stack traffic which may pay off in deep qsearches.

I mean if you can make a separate qsearch 10% faster on the cost of making main search 5% slower, what do you choose?

Chan Rasjid · Post by **Chan Rasjid** » Tue Oct 02, 2007 2:54 am

OK,

The "silly" 10000 - D stuff works and seems _not_most_silly. I started with TSCP and so inherited returning -inf + ply on checkmate and never think twice. Probably when I implemented hashing(almost all by trial and error) and found checkmate bugs, I might have just added the _distance_to_mate stuff that made it ok.

I have yet to confirm that Muller's codes work (all analysis provisional for now) as when someone posts pseudo codes, there is bound to have bugs that readers are suppose to easily detect! Muller's codes "muck around" with alpha / beta and that makes it very tricky. Pradu had to ask "should it not be Alpha = Alpha + 1?". I sense that his pseudo codes have bugs that need to be fixed.

Because Gerd confirmed that Rebel does a ply-independent mate scoring, I tested a real simple such version with TSCP and it seems to work. Very likely it should cause no hashing problems by simply hashing bestscore as usual. The result of some games "seems" ok, no strange behavior.

tscp - black, umax4- 8 white:-
twice, result 0-0, 0-0
tscp - white, umax4- 8 black:-
twice result 1-0, 1-0 tscp won.
tscp - white, warrior_xxx black:-
result 1-0, tscp won.
tscp - white, nano black:-
result 0-1, tscp loss.

Pseudo code changes to TSCP search(). Without need to shift (alpha, beta) left as in Muller's if all we want is only about mate scores :-

Code: Select all

#define MAX_PLY  32
#define MATESCORES  10000 - 15

int search(depth, alpha, beta){
 //etc ...
 BOOL f = False; 
 for (moves){
  if (!makemove()) continue;
  f = True;
  score = -search(depth - 1, -beta, -alpha);
  takeback();
  if (score <= MATESCORES){
    score = score + 1;
  }
  if (score > alpha){
    if (score >= beta ) return beta;
    alpha = score;
  } 	
 } 
  if (f == False and in_check) return -10000;
  //etc ...  

  return alpha;
}

Rasjid

Chan Rasjid · Post by **Chan Rasjid** » Tue Oct 02, 2007 3:57 am

A slight bug, should be :-
if (score <= -MATESCORES){
score = score + 1;
}
and NOT score <= MATESCORES

Code: Select all

#define MAX_PLY  32
#define MATESCORES  10000 - 15

int search(depth, alpha, beta){
 //etc ...
 BOOL f = False;
 for (moves){
  if (!makemove()) continue;
  f = True;
  score = -search(depth - 1, -beta, -alpha);
  takeback();
  /*edited to score <= -MATESCORES */
  if (score <= -MATESCORES){
      score = score + 1;
  }
  if (score > alpha){
    if (score >= beta ) return beta;
    alpha = score;
  }    
 }
  if (f == False and in_check) return -10000;
  //etc ... 

  return alpha;
}

Pradu · Post by **Pradu** » Tue Oct 02, 2007 7:20 am

Chan Rasjid wrote:Pradu had to ask "should it not be Alpha = Alpha + 1?". I sense that his pseudo codes have bugs that need to be fixed.

Indeed I think there are some bugs if my understanding of HG's technique (as I explain here) is not mistaken (however I think it could mistaken

):

Basically HG's idea is to delay a loss. In our case a player is lost if the score to return is a negative mate. I guess we can just the define our guy loosing as #define IS_LOST(score) (score<=-MATE_MIN) where MATE_MIN could be something like MATE-1000. To delay the loss, whenever you return a lost score you increase it's value by 1 before you return it. This can work because if the loss is further away from the root, the lost score will be greater because you will have more returns where you add 1 to the lost score. Say when you have a mate, you do return -MATE;. Here's an example of a mate score propagating backwards:

Code: Select all

When mated, your original score is -MATE.  Then when you add one to it it becomes -MATE+1 or -(MATE-1)
returns -MATE
gets MATE returns MATE
gets -MATE returns -(MATE-1)
gets MATE-1 returns MATE-1
gets -(MATE-1) returns -(MATE-2)
gets MATE-2 returns MATE-2
gets -(MATE-2) returns -(MATE-3)
gets MATE-3 nothing to return <root>

So if from the root you had searched 7 ply and found had a mate in 4, the "mucked" score back to the root will be MATE-3. And if a quicker mate exists, say a mate in 2, then the score "mucked" scare back to the root will be MATE-1. A mate in 1 would of course be -MATE. If you get mated in 2 plys (get mated in 1) then the score you get at the root would be -MATE. The method should work fine as it is so far; even without modifying any code for your hashtable storing algorithm. However where you store and what you store will be tricky. You have to keep track of whether you store the score you got or the score you muck (the score you will return) and be consistent throughout. Ok that's my understanding of this delayed loss stuff, but what is the purpose of the strange code

Code: Select all

if(alpha <= -MATE_MIN) alpha-=1;
if(beta < -MATE_MIN) beta-=1;

I assumed HG was trying to do something similar to mate-distance pruning here because there seems to be no other reasonable explanation I can think of. However, the bounds should get smaller, not bigger if you are trying to do this, so just the code alpha-=1 doesn't make sense to me at the moment. Hopefully HG can clear up some things for us.

hgm · Post by **hgm** » Tue Oct 02, 2007 12:23 pm

One has to be very careful when adjusting scores returned from an alpha-beta search, or there is no limit to how stupid moves the program will play. (I have seen it play moves that got it mated-in-1, with positive score, as the mating move was unjustly pruned...)

In particular, what should never be able to happen is that a move that was returned by the daughter as out-of-window (i.e. out of the window specified to the daughter) and is thus an uUPPER or LOWER bound, would get back inside the window (of the node doing the adjustment). For in that case the node, and any nodes below it, would treat it like an EXACT score, with disastrous errors as a consequence.

So if I change a returned score by adding +1, I must shift the window for searching that score in the _opposite_ direction, by subtracting 1. That way, if the search returns Alpha (=OriginalAlpha-1), i.e. just failed low, it would be adjusted to Alpha+1 = OriginalAlpha, so that we still just fail low. If the post-search adjustment is made conditionally (which is the reason for doing it post-search), the adjustment of Alpha can be made conditionally as well.

Some examples (MATINGSCORES=9900):

{Alpha,Beta} = {-9998,12}
The following adjustments must be anticipated:
-9999 -> -9998 (fail low)
-9998 -> -9997 (exact)
11 -> 11 (exact)
12 -> 12 (fail high)
We must make sure that all (pre-adjusted) scores from -9998 to 11 are exact. So the move originally has to be searched with window {-9999,12}. I.e. Alpha has to be _decreased_ by 1.

{Alpha,Beta} = {-9998,-9995}
-9999 -> -9998 (fail low)
-9998 -> -9997 (exact)
-9997 -> -9996 (exact)
-9996 -> -9995 (fail high)
So pre-adjusted scores from -9998 and -9997 should be exact, i.e. the window for obtaining those scores should be {-9999,-9996}

Now the precise boundary between mating scores and Eval scores is chosen such that it should never be closely approached. But if you are doing this not for mating scores only, but for any delayed loss, you can get close to the boundary. e.g.
{Alpha,Beta}={4,5}, CurEval=5
3 -> 4 (fail low)
4 -> 5 (fail high)
5 -> 5 (fail high)
The null window for the daughter should thus be set to {3,4}. The pseudo-code contained indeed a (non-fatal) error: Beta can be pre-adjusted even if it is equal to CurEval.

{Alpha,Beta}={4,5}, CurEval=4
3 -> 4 (fail low)
4 -> 4 (fail low)
5 -> 5 (fail high)
The window for the daughters should be pre-adjusted to {4,5}, i.e. not adjusted at all. Alpha should not be adjusted if it is equal to CurEval. I accidentally swapped the < and <= tests in the pseudo-code. It should have read:

Code: Select all

if(Alpha < CurEval) Alpha--;
if(Beta <= CurEval) Beta--;

In general, if at the end of Search() we apply a function f to the BestScore, i.e. we return f(BestScore), we should apply f-inverse to the window bounds beforehand. This applies when the score is a continuous variable. If there is some discretization that forces us to round f-inverse(WindowBound) to expand the window. E.g. if the post-search adjustment would multiply negative scores by 31/32, we would have to divide Alpha and Beta by 31/32, i.e. multiply it by 32/31. Now for integers this is never exact, so we should do it in such a way that Alpha is rounded down, and Beta rounded up. If we will ever round the wrong way, or think that adding 1/31 is about the same as adding 1/32 to Beta, fatal errors will result.

Chan Rasjid · Post by **Chan Rasjid** » Tue Oct 02, 2007 5:25 pm

Hello Pradu,

... The bounds updating for pruning out lines that cannot possibly lead to shorter mates can be found in Fruit where Fabien calls it "mate-distance pruning".

I think I now know what it means and probably what Gerd mentioned about "avoids unnecessary searches with mate bounds ". I dont mean I tried to understand Fruits code, too tough yet. Maybe Muller made things more complicated!

I think it is like this. In my usual full_search(), there is never a need for mate-distance pruning as I have nothing to prune -just return whenever a side has a winning mate score! Better then shifting (alpha, beta) left and getting confused!Hashed as lower, but sometimes exact. When root finally has a winning mate score it calls search_better_mate(). It uses simple alphabeta with full window (alpha, infi) and iterates by steps of 2, no pvs, null window, researches, null move, eval(). Done once an never ever need to debug.

I think mate-distance pruning means simply not to search a move if it will never propagate to at least a next better mate at root. It should work also if we use ply-independent mate scoring, but I'll explain with the -infi + ply method :-
When root has a winning mate it is of the form
infi - 1, infi - 3,infi - 5,infi - 7,....
we iterate with double stepping with (infi - odd, infi) and for nodes up the tree, it is easy to know if we need to make() and search the move.
if the PLY is such that if:-
1)the move gives check - when the best possible mate score will propagate to at least a next better mate at root - we search and the next node need only to know if it has a valid move. If WITH[edited] a valid it simply fails high.
2) the move does not give check - then it is pruned and we look for the next move that checks.etc...

I "guess" this may be like Fruit's with a flag search-shortest-mate = False at root but set to True
when winning mate score found at root.

Best Regards,
Rasjid

Pradu · Post by **Pradu** » Tue Oct 02, 2007 6:09 pm

Chan Rasjid wrote:I think I now know what it means and probably what Gerd mentioned about "avoids unnecessary searches with mate bounds ". I dont mean I tried to understand Fruits code, too tough yet. Maybe Muller made things more complicated!

I think I understand HG's intent with the bounds updating now. It is not about mate-distance pruning (my mistake) but instead about adjusting the bounds to counter the effect of "mucking" the mate scores backwards: Say you are at the root and search one move and get back a score -(MATE-n). Then say you search the second move, but all replies to that move return (MATE+n+1). A score -(MATE-n) at ply 0 is not the same as -(MATE-n) at ply 2. Infact -(MATE-n) at ply 2 is better because it will get "mucked" back up to -(MATE-n-1). But if you don't do bounds updating the -(MATE-n) will fail low at ply 2 and fail high at ply 1 which you don't want it to do. So, you have to increase the bounds to counteract the effect of "mucking" and that's what that portion of code HG has written does.

hgm · Post by **hgm** » Tue Oct 02, 2007 6:34 pm

Indeed. One should always remember that the window bounds alpha and beta are scores of earlier moves that were branching off from the current branch. What alpha-beta pruning does is compare the score at the current node deeper in the branch to the score of such an earlier move, to get a prognosis for which of the two will eventually be chosen when the current score has propagated back to this side branch.

So the beta-cutoff

Code: Select all

if(BestSCore' >= Beta') goto cutoff;

is in fact a replica if the

Code: Select all

if(Score > BestScore) BestScore = Score;

in some node closer to the root. The Beta' is the BestScore passed up the tree, while the Score will be the BestScore' passed down the tree.

You must make sure that these comparisons always give the same result. So the mappings

BestScore -> Beta'
Score -> BestScore'

should be identical. In practice the flow of information goes from

BestScore' -> Score,

though, as the score propagates towards the root. The Score must therefore undergo the inverse transformation propagating towards the root as the Beta undergoes propagating up the tree. Only then the cutoff test is equivalent to the later move picking. And if it is not equivalent, you run the risk of missing a cutoff that could have been done, or, much worse, cut off a move that should have been searched.

Chan Rasjid · Post by **Chan Rasjid** » Wed Oct 03, 2007 12:47 pm

Pradu wrote:-

I think I understand HG's intent with the bounds updating now. It is not about mate-distance pruning (my mistake) but instead about adjusting the bounds to counter the effect of "mucking" the mate scores backwards: Say you are at the root and search one move and get back a score -(MATE-n). Then say you search the second move, but all replies to that move return (MATE+n+1). A score -(MATE-n) at ply 0 is not the same as -(MATE-n) at ply 2. Infact -(MATE-n) at ply 2 is better because it will get "mucked" back up to -(MATE-n-1). But if you don't do bounds updating the -(MATE-n) will fail low at ply 2 and fail high at ply 1 which you don't want it to do. So, you have to increase the bounds to counteract the effect of "mucking" and that's what that portion of code HG has written does.

Since you mentioned you now understand, then Muller is probably not practicing voodoo alpha-beta.
I cannot yet follow your argument as I am fasting, but I vaguely see something interesting.

Alpha-beta searches from a node and whenever a better score is found, we search subsequent moves by passing a bound (alpha, beta) with an updated alpha. Finally we finish and say "...we have found a best-score = alpha...". But this may be after few changes in the bounds passed backed up the tree. Now we do strange things, we create a function fn(integer) and passed up (fn(alpha), beta) instead of (alpha, beta) and if we don't inform the nodes up the tree about it that "there is a change of axis", we don't know if when they work and pass you back information, they compare numbers from a space that includes all numbers they had used previously without adjusting for a "change of axis"... I think this is what Muller mention about the need to know of an inv-fn() = inverse of fn() and for it to be applied to alpha when necessary. My model from instinct :-

Code: Select all

for relevant ranges of the integer domain:-
fn(x) = x +1;
inv-fn(x) = x -1;

int search(alpha, beta){
 alpha = inv-fn(alpha);
 beta = inv-fn(beta);

  for (moves){
   alpha = fn(alpha);
   beta = fn(beta);
   score = -search(-beta, -alpha);
   //etc...
  }
  
  return whatever;
}

Is it the same as Muller's?

And Muller recommends this to a novice new to alpha-beta! Just propose the silly 10000 - D stuff which cannot have problems. Also is it certain those who use the ply-independent mate scoring do it correctly?

Rasjid

Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores

Re: Hashing and mate scores