Help please with alpha-beta

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

Henk
Posts: 7210
Joined: Mon May 27, 2013 10:31 am

Re: Help please with alpha-beta

Post by Henk »

Henk wrote:
Alexandr Novikov wrote:
Henk wrote:See second part of previous pos. I think you can find the difference yourself.
ok, but where alpha beta in your code? sorry, I don't understand
search(lb, ub) is my interpretation of search(alpha, beta)
It is a bad interpretation. Ask somebody else I have no patience. I have other things to do.
Last edited by Henk on Wed Sep 16, 2015 9:24 pm, edited 1 time in total.
User avatar
hgm
Posts: 27703
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help please with alpha-beta

Post by hgm »

Basically alpha-beta pruning is that you don't try to impove best_score2 (say) anymore (where 'improve' means making it lower) when it is already smaller than best_score1, because it can only get lower, and it is already low enough not to matter on the level below.
//--------------------------------------------------------------------------------------
procedure search;
var .....
black_move_generator;
for i:=0 to bcnt - 1; //all possible black moves
make_move;
first_ply; //call next iteration
if i = 0 then best_score := best_score1 else
if best_score1 < best_score then best_score := best_score1; //min
unmake_move;
//--------------------------------------------------------------------------------------
procedure two_ply;
var .....
white_move_generator;
for i:=0 to wcnt - 1; //all possible white moves
make_move;
tree_ply; //call next iteration
if i = 0 then best_score1 := best_score2 else
if best_score2 > best_score1 then best_score1 := best_score2; //max
unmake_move;
if best_score1 > best_score then i := wcnt; // trick to terminate for loop
//-------------------------------------------------------------------------------------- procedure tree_ply;
var .....
black_move_generator;
for i:=0 to bcnt - 1; //all possible black moves
make_move;
four_ply; //call next iteration
if i = 0 then best_score2 := best_score3 else
if best_score3 < best_score2 then best_score2 := best_score3; // min
unmake_move;
if best_score2 < best_score1 then i := bcnt; // trick to terminate for loop
//--------------------------------------------------------------------------------------procedure four_ply;
var .....
white_move_generator;
for i:=0 to wcnt - 1; //all possible white moves
make_move;
five_ply; //call next iteration
if i = 0 then best_score3 := best_score4 else
if best_score4 > best_score3 then best_score3 := best_score4; //max
unmake_move;
if best_score3 > best_score2 then i := wcnt; // trick to terminate for loop
//--------------------------------------------------------------------------------------procedure five_ply;
var .....
black_move_generator;
for i:=0 to bcnt - 1; //all possible black moves
make_move;
evaluate; //call evaluate function
if i = 0 then best_score4 := eval else
if eval < best_score4 then best_score4 := eval; // min
unmake_move;
if best_score4 < best_score3 then i := bcnt; // trick to terminate for loop
//--------------------------------------------------------------------------------------

But the first thing I would worry about if I were you is to condense all the different-but-nearly-the-same code into one. With a different routine for every ply level the code will grow unmaintainable pretty quickly. You have five ply now, but with alpha-beta working that could increase to 10 ply. And then every improvement you want to make to the routine would have to be done in 10-fold. That is asking for bugs.

One bad aspect of your code is that you use global variables best_scoreN, for which you then also need to make 5 (or 10) versions. You probably do this because you want to communicate the final value of that variable to the procedure one ply below it. But Pascal has a much better way to communicate a value to the procedure that called the current one: make it a function instead of a procedure.

So a first step would be to make all the best_scoreN variables local variables of the procedures, and then return their value as a function:

Code: Select all

function two_ply&#58; integer;
var score, best_score&#58; integer; .....
white_move_generator;
for i&#58;=0 to wcnt - 1; //all possible white moves
make_move;
score &#58;= three_ply; //call next iteration
if i = 0 then best_score &#58;= score else
if score > best_score then best_score &#58;= score; //max
unmake_move;
two_ply &#58;= best_score;
This would still have all the levels handled by separate functions, but the functions would already look a lot more like each other. The main remaining difference is the white/black issue, between the even and odd levels.

One of the more annoying difference is in the comparison of score and best_score, which on white (even) levels is >, but on black levels is <. This difference can be erased by not using absolute scores (from white's point of view), but relative scores, from the side to move's point of view. Because when a > b, then -a < -b, so the comparison operator flips too. When transferring scores between levels you should then flip the sign. This would make all levels use >, and makes them look like

Code: Select all

function two_ply&#58; integer;
var score, best_score&#58; integer; .....
white_move_generator;
for i&#58;=0 to wcnt - 1; //all possible white moves
make_move;
score &#58;= -three_ply; //call next iteration *** note the sign flip! ***
if i = 0 then best_score &#58;= score else
if score > best_score then best_score &#58;= score; //max
unmake_move;
two_ply &#58;= best_score;
The only difference now is wheher you call the white or black move generator, and whether the loop uses wcnt or bcnt (which in your code were presumably global variables, set by the move generator). You could remove the need for global variables by making the move generator a function that returns the number of generated moves, instead of leaving that in a global variable:

Code: Select all

function two_ply&#58; integer;
var score, best_score, cnt&#58; integer; .....
cnt &#58;= white_move_generator;
for i&#58;=0 to cnt - 1; //all possible white moves
make_move;
score &#58;= -three_ply; //call next iteration *** note the sign flip! ***
if i = 0 then best_score &#58;= score else
if score > best_score then best_score &#58;= score; //max
unmake_move;
two_ply &#58;= best_score;
Finally you could make the function capable of calling both the white and black move generator, and introduce a variable stm ('side to move') that tells it which one to use:

Code: Select all

function two_ply&#58; integer;
var score, best_score, cnt&#58; integer; .....
if stm = 1 then
cnt &#58;= white_move_generator
else // stm must be -1
cnt &#58;= black_move_generator;
for i&#58;=0 to cnt - 1; //all possible stm moves
make_move; stm &#58;= -stm; // flip side to move
score &#58;= -three_ply; //call next iteration *** note the sign flip! ***
if i = 0 then best_score &#58;= score else
if score > best_score then best_score &#58;= score; //max
unmake_move; stm &#58;= -stm; // flip side to move
two_ply &#58;= best_score;
That makes all routines the same, except for the last one, where you do evaluate; Because they all are the same, you could make it call itself, rather than to have multiple counters. But that would go on forever. So you haveto tell them how often they have to go on calling a next level before doing evaluate. You can tell that to them in a parameter:

Code: Select all

function any_ply&#40;var depth&#58; integer&#41;&#58; integer;
var score, best_score, cnt&#58; integer; .....
if stm = 1 then
cnt &#58;= white_move_generator
else // stm must be -1
cnt &#58;= black_move_generator;
for i&#58;=0 to cnt - 1; //all possible stm moves
  make_move; stm &#58;= -stm; // flip side to move
  if depth = 0 then 
    eval &#58;= evaluate
  else
    score &#58;= -any_ply&#40;depth-1&#41;; //call next iteration *** note the sign flip! ***
  if i = 0 then best_score &#58;= start_score else
  if score > best_score then best_score &#58;= score; //max
  unmake_move; stm &#58;= -stm; // flip side to move
any_ply &#58;= best_score;
You can then call any_ply(5) if you want to search 5 ply deep, but easily change that to any other number of ply without having to clone any code.
Alexandr Novikov
Posts: 22
Joined: Tue Sep 15, 2015 8:21 pm

Re: Help please with alpha-beta

Post by Alexandr Novikov »

H.G.Muller, many thanks
in principle, I understand the whole, but one question for you.
What you mean in next string:
if best_score1 > best_score then i := wcnt; // trick to terminate for loop
I want to ask you where here the condition to stop search
User avatar
hgm
Posts: 27703
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help please with alpha-beta

Post by hgm »

Well, if i is set to the wcnt to for-loop that loops over the moves will have reached the end., as it would only run to i=wcnt-1, and stop as soon as i gets larger than that. So setting i (the move number) to wnct after you found a move that is good enough to refute the opponent's preceding move is a trick to search no more moves in that node.
Henk
Posts: 7210
Joined: Mon May 27, 2013 10:31 am

Re: Help please with alpha-beta

Post by Henk »

bob wrote:
Henk wrote:First do you understand this ?

upperbound(white to move ) = -lowerbound(black to move)
value( white to move) = - value (black to move )


I use something like this:

Code: Select all

search &#40;lb, ub&#41;
&#123;
   val = -infinite;
    for each possible move
    &#123;
       make move -> changes also side to move
        val = max &#40;val, -search&#40;-ub, -lb&#41;);
       undo  move -> changes also side to move
    &#125;
     return val
&#125;
So NO alpha/beta at all? Pure minimax???

Code: Select all

search &#40;d, lb, ub&#41;
&#123;
   if &#40;d = 0&#41; return qSearch&#40;lb, ub&#41; or stop if eval&#40;) >= ub
   val = -infinite;
    for each possible move
    &#123;
       make move -> changes also side to move
        val = max &#40;val, -search&#40;d-1, -ub, -lb&#41;); 
       undo  move -> changes also side to move
       if &#40;val >= ub&#41; break; // stop it; why search for a better lower bound ?
       lb = max&#40;lb, val&#41;; // skip nodes with  value <= val. Why search for them?
    &#125;
     return val
&#125;
Last edited by Henk on Thu Sep 17, 2015 10:34 am, edited 2 times in total.
Alexandr Novikov
Posts: 22
Joined: Tue Sep 15, 2015 8:21 pm

Re: Help please with alpha-beta

Post by Alexandr Novikov »

if you are not hard, bring a sample of code how to do it.
I tried to do the following:
if (i > 0) and (best_score4 < best_score3) then exit; (for five_ply)
but it not work correctly)
Dann Corbit
Posts: 12482
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Help please with alpha-beta

Post by Dann Corbit »

Can you post what you wrote here?

The simplest way to program it is using recursion.

It looks to me like you are trying to telescope expand it.
That would be hard to get right.
Henk
Posts: 7210
Joined: Mon May 27, 2013 10:31 am

Re: Help please with alpha-beta

Post by Henk »

Yes maybe first question " do you understand recursive programming" otherwise use a stack.
Henk
Posts: 7210
Joined: Mon May 27, 2013 10:31 am

Re: Help please with alpha-beta

Post by Henk »

[If you do no recursion and use a stack: add something like if d <= 3 and copy and paste all code within de loop three times. One for d = 3, d = 2, depth = 1. ]

Don't see any added value (asset) for code will be badly maintainable. Maybe (small) speed improvement.
User avatar
hgm
Posts: 27703
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help please with alpha-beta

Post by hgm »

So what you have now is this?

Code: Select all

procedure five_ply;
var .....
black_move_generator;
for i&#58;=0 to bcnt - 1; //all possible black moves
  make_move;
  evaluate; //call evaluate function
  if i = 0 then best_score4 &#58;= eval else
  if eval < best_score4 then best_score4 &#58;= eval; // min
  unmake_move;
  if &#40;i > 0&#41; and &#40;best_score4 < best_score3&#41; then exit; // best_score4 is low enough
I think that should work: in the parent node (four_ply) there is the test if best_score4 > best_score3, and once best_score4 is already smaller, you are sure this test is going to fail, as the value of best_score4 can only go down further. The i>0 is not even needed

I do now see a problem, however, which is in the five_ply reply to the first move of four_ply. best_score3 has not received a value then in four_ply. So it would be better to always set a value to best_scoreN before you start searching any moves, so that the daughter node can always rely on the value being available. Like:

Code: Select all

procedure four_ply;
var .....
black_move_generator;
best_score3 = -infinite;
for i&#58;=0 to bcnt - 1; //all possible black moves
  make_move;
  five_ply;
  if best_score4 > best_score3 then best_score3 &#58;= best_score4; // min
  unmake_move;
  if best_score3 >= best_score2 then exit; // best_score3 is low high enough
Here 'infinite' is a number so large that you would never need it as an evaluation score, e.g. 8000 if you evaluate in centi-Pawns. Note that the

Code: Select all

if i = 0 then best_score3 &#58;= best_score4 else
then has become superfluous, as the else-part would automatically do this on the first move, as when best_score3 = -infinite, best_score4 would always be larger than it.

In fact the code I gave here does not fully implement alpha-beta pruning, although it should already give an enormous speedup. Best would be to set best_score3 initially not to -infinite, but to best_score1. At first glance that seems very wrong: how can we claim the best_score is anything better than infinitely bad if we have not even searched a single move? All moves could easily be worse than best_score1.

The reason we can do this is that when best_score3 would in reality be lower than (or equal to) best_score1, it is not really important how much lower. All it means is that in the level below the test if best_score3 < best_score2 then ... will succeed, and will set best_score2 to best_score1. Even if this really should have been something lower, it is already (just) enough to trigger the test if best_score2 <= best_score1 then exit; there. So that you immediately will terminate that node too, and end up 2 levels lower (in two_ply), where the test if best_score2 > best_score1 then ... will certainly fail, as the move there will be considered refuted for any score <= best_score1.

Another way of saying this is that best_score at all odd levels will never be lower than the score of the best move you have found at all the odd levels below it. It will be the score of your best variation found so far. And if you cannot score above that in the current node, you know you are going to deviate from the variation leading to the current node by playing that better move. It does not matter how much worse the current node is, just that it is bad enough that you don't want to go there.