OK, I am using Bruce Moreland's alpha-beta page as a guide, and I am trying to get a clear and stable picture in my mind of how my program might implement the algorithm he presents. Here's the link again...
http://web.archive.org/web/200404200223 ... habeta.htm
ok first I will describe the scenario, and clarify some semantics...
There is a game between human and engine, and it is the engine's turn to move.
level0 describes the position after the human's move; it consists of a single node (position).
level1 consists of n nodes, which are the positions that arise after each possible legal move the engine might make "on the board".
I also use the terms search and alphaBeta somewhat interchangeably.
Now at this point I suppose we can consider the problem in two ways...
1. Only one "external call" to search, in which we pass the level0 position to the search, and the search determines the level1 positions and somehow chooses from them the final optimal move to be made on the board.
2. Outside of search we have some kind of non-recursive makeBestMove() method. It determines the n legal moves (level1), then iterates through them and makes n external calls to search. Each call returns an eval of the position after the move, and on the basis of these n evals the makeBestMove() method chooses the best move.
OK, here's the rub...
Moreland's alphabeta is an integer-valued function: it returns an evaluation, as does every other pseudo-code alphabeta I have ever seen. That suggests to me method 2: n external calls to search, n evals get returned, then some other method chooses the level1 move with the maximum eval.
However, Moreland also describes an initial external call to search as val = AlphaBeta(5, -INFINITY, INFINITY);
Now that suggests to me method 1. It has to be, or else the fact that we are passing two constants (-INFINITY, INFINITY) as alpha and beta would mean we have lost out on a big chunk of pruning. But if it is method 1, then surely the final optimal move is being determined by the alphaBeta() function, in which case it would be nice to see that represented in the pseudo-code, and it would also be nice to see alphaBeta() return a move and not an eval.
Now there are a couple of ways I might resolve this discrepancy, but I feel that somewhere along the way I am unnecessarily complicating things. So how do people typically deal with this: method 1 or method 2?
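[For reference, a toy sketch of what method 2's root driver could look like. All names here are hypothetical, not from Moreland's page; positions are nested lists whose leaves are ints scored for the side to move, as a stand-in for a real board. Note the pruning worry above goes away if each of the n external calls passes the running best score in as alpha:]

```python
INF = float("inf")

def alphabeta(node, depth, alpha, beta):
    """Toy fail-hard negamax alpha-beta; leaves are ints scored for the side to move."""
    if depth == 0 or isinstance(node, int):
        return node  # stand-in for static evaluation / quiescence
    for child in node:
        score = -alphabeta(child, depth - 1, -beta, -alpha)
        if score >= beta:
            return beta          # beta cutoff
        if score > alpha:
            alpha = score
    return alpha

def make_best_move(root_moves, depth):
    """Method 2: iterate over the level1 positions outside the recursive search."""
    best_score, best_move = -INF, None
    for i, pos in enumerate(root_moves):
        # window (-INF, -best_score): the running best acts as alpha for the
        # child search, so no pruning is lost versus a single root call
        score = -alphabeta(pos, depth - 1, -INF, -best_score)
        if score > best_score:
            best_score, best_move = score, i
    return best_move, best_score
```

[Any legal first move beats -INF, so best_move is always set when the move list is non-empty.]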
Interpreting Alpha-Beta pseudo-code
Moderators: hgm, Rebel, chrisw
-
- Fguy64
- Posts: 814
- Joined: Sat May 09, 2009 4:51 pm
- Location: Toronto
-
- bob
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Interpreting Alpha-Beta pseudo-code
Fguy64 wrote: [quoted in full above]
First, AlphaBeta() is a recursive function. Think of it like this:
It is called with a position to search, and a depth to search it to. From the root, we call alpha/beta with the root position and the nominal search depth. Once we get into AlphaBeta(), if depth is > 0, we make a move, and then call AlphaBeta() with the opposite side on move, the new position, and a depth that has been reduced by 1 ply. Whenever we start to call AlphaBeta() and we notice that the current depth is 1, we now call Quiesce() instead of AlphaBeta() (which really does the same thing but just generates fewer moves, namely just captures).
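[Bob's description can be sketched roughly as below. This is toy code with hypothetical names, not any engine's actual implementation; positions are nested lists with int leaves, and Quiesce() is reduced to a stub that just returns the static evaluation, whereas a real quiescence search would also search captures:]

```python
INF = float("inf")

def evaluate(node):
    # toy static eval: leaves are ints scored for the side to move
    return node

def quiesce(node, alpha, beta):
    # stand-in for Quiesce(): a real one generates and searches captures only;
    # here it just returns the static evaluation
    return evaluate(node)

def alphabeta(node, depth, alpha, beta):
    if isinstance(node, int) or depth == 0:
        return quiesce(node, alpha, beta)   # depth exhausted: drop into quiescence
    for child in node:
        # "make a move": descend to a child position, opposite side to move,
        # depth reduced by 1 ply
        score = -alphabeta(child, depth - 1, -beta, -alpha)
        if score >= beta:
            return beta        # fail-hard cutoff: opponent won't allow this line
        if score > alpha:
            alpha = score
    return alpha

# one external call from the root, as in Moreland's
# val = AlphaBeta(5, -INFINITY, INFINITY):
# val = alphabeta(root_position, 5, -INF, INF)
```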
-
- Fguy64
Re: Interpreting Alpha-Beta pseudo-code
bob wrote: [quoted in full above]
OK, that seems clear enough, thanks. It tells me there is one and only one "external call" to the recursive search function. It also tells me that just because alphaBeta() is depicted as returning an integer (alpha), it does not mean that the original "root" call to alphaBeta will make any meaningful use of the returned integer.
-
- bob
Re: Interpreting Alpha-Beta pseudo-code
Fguy64 wrote: [quoted in full above]
All that value is good for is that you can display it and say "this is the score for the best move, based on the search I just completed."
-
- xsadar
- Posts: 147
- Joined: Wed Jun 06, 2007 10:01 am
- Location: United States
- Full name: Mike Leany
Re: Interpreting Alpha-Beta pseudo-code
Both methods work fine. Eventually you will probably prefer method 2 because it lets you do special things in the root position, but at this point, I think it's best for you to pick whichever method you find easiest to understand and deal with.
-
- Fguy64
Re: Interpreting Alpha-Beta pseudo-code
xsadar wrote: [quoted in full above]
Well, intuitively I can't see why method 2 wouldn't work; it seems reasonable, but something was going wrong and I just couldn't see what. So I tried method 1 as Bob described, and it appears to have come together nicely. I'm working my way through the test positions, and it's getting done a lot faster than the old "semi-alphabeta" search that I started these threads with at the beginning of the week.
touch wood.
Anyways, I'll do a summary post once I get it all nailed down, so people can see the results of their assistance.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Interpreting Alpha-Beta pseudo-code
I am not yet sure what your exact problem was. Was it the question of how you get the best move returned by your search function (AlphaBeta)? In this case you have several choices:
1) Stay with only one search function (ignoring quiescence search here), and let this function return the best move via an output parameter in addition to the integer return value representing the value of the subtree. The best move might only be used by the "external" caller (at "Level 0" as you called it) and will be ignored otherwise, but avoiding different cases makes the code easier to read.
2) Same as 1, but with an additional condition "if (we are at level 0)" around the update of the best move.
3) Have two slightly different versions of the search function (I think this is what you called "method 2"), one for "Level 0" and one for all other levels. In this case, only the "Level 0" function needs to track the best move. Everything else is essentially the same in both functions, at least for your first approach. Disadvantage: some more lines of code. Advantage: you avoid executing a tiny bit of code that is not necessary below level 0 (the update of the best move when the current move is an improvement), and you save stack space by not maintaining the best move at lower levels.
4) Similar to choice 1 again, but extended to a data structure representing the whole PV (principal variation) of the subtree, instead of only the best move. Something like this is necessary if you ever want to display the current PV to the user.
If you ask me, the easiest way in the beginning would be choice 1.
Sven
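[A rough sketch of choice 1, on the same toy representation as above (nested lists, int leaves scored for the side to move; all names hypothetical). Python has no output parameters, so the best move comes back alongside the value as a pair; in C it would be an out-pointer:]

```python
INF = float("inf")

def alphabeta(node, depth, alpha, beta):
    """Returns (value, best_move). best_move is only meaningful to the
    level-0 caller; deeper levels simply ignore it."""
    if depth == 0 or isinstance(node, int):
        return node, None  # stand-in for static evaluation / quiescence
    best_move = None
    for i, child in enumerate(node):
        value, _ = alphabeta(child, depth - 1, -beta, -alpha)
        score = -value
        if score >= beta:
            return beta, i                 # cutoff: move i refutes this node
        if score > alpha:
            alpha, best_move = score, i    # new best move so far
    return alpha, best_move

# single external call; the root caller uses the move, everyone else the value:
# val, move = alphabeta(root_position, 5, -INF, INF)
```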
-
- Fguy64
Re: Interpreting Alpha-Beta pseudo-code
Sven Schüle wrote: [quoted in full above]
Indeed, that was a big part of the problem. I haven't considered your choices in detail yet, but I will in due time. What I ended up doing was maintaining a move object at the class level, outside of my method 1 search. Then, whenever alpha was updated or a beta cutoff happened, both at depth == ply, I would update that external move object.
As it stands now, I have lost one thing in going to method 1. It is entirely reasonable to expect that more than one level1 move would have the same eval, and I'd prefer to select randomly from those moves, for variety. That sort of thing seems much clearer with method 2, but I'm sure it can't be too hard to do under method 1; I just haven't tried to work it out yet.
P.S. I intend to revisit method 2 once I get method 1 nailed down, and solve the problem. I'll let you know how it goes.
-
- Sven Schüle
Re: Interpreting Alpha-Beta pseudo-code
Fguy64 wrote: [quoted in full above]
It is not a good idea to consider several (root) moves that return the same evaluation and choose randomly between them. Here is the problem.
Say you have moves A and B at the root which are searched in this order. Searching A with window (alpha, beta) returns value X which is the best value up to now. Now you search B with window (X, beta) and get X returned again, what does this mean?
The true value of B is "less than or equal to X", and it may frequently happen that it is lower than X without you ever getting to know it. As soon as the opponent has one response to B that evaluates to at least -X from his viewpoint, this causes a cutoff. The opponent might have an even better response, but that would be irrelevant; B is already refuted sufficiently, so other responses are not even tried.
For this reason it is not valid to compare the value of B, which may have been searched incompletely by intent due to a cutoff, to the value of A, which has been searched completely since it became the best move.
So you'd better save the effort and just track one best move, only overwriting it if a new move improves the best value. Everything else might be counter-productive for playing strength.
Sven
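[Sven's point can be seen numerically on a toy tree (hypothetical code; nested lists with int leaves scored for the side to move, and a plain fail-hard negamax, so a cutoff returns beta, i.e. only a bound):]

```python
INF = float("inf")

def alphabeta(node, depth, alpha, beta):
    # fail-hard negamax: on a cutoff it returns beta, only a bound
    if depth == 0 or isinstance(node, int):
        return node
    for child in node:
        score = -alphabeta(child, depth - 1, -beta, -alpha)
        if score >= beta:
            return beta
        if score > alpha:
            alpha = score
    return alpha

def true_value(node):
    # exact minimax value, no pruning, for comparison only
    if isinstance(node, int):
        return node
    return max(-true_value(c) for c in node)

moves = [[3, 5], [2, 9]]       # two root moves, A and B
alpha, reported = -INF, []
for mv in moves:
    score = -alphabeta(mv, 1, -INF, -alpha)  # root window (alpha, INF)
    reported.append(score)
    alpha = max(alpha, score)

exact = [-true_value(mv) for mv in moves]
print(reported)  # [3, 3] -- A and B look equal at the root
print(exact)     # [3, 2] -- but B is actually worse; its "3" was only a bound
```

[Picking randomly between the "equal" A and B would sometimes play B, whose true value is only 2: B's search stopped as soon as one opponent reply refuted it against A's score.]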
-
- Fguy64
Re: Interpreting Alpha-Beta pseudo-code
Sven Schüle wrote: [quoted in full above]
OK, I think I see what you mean. It sounds like the value that gets assigned after a cutoff can't be relied on? That would make sense.
It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.