I think drawing out the tree of Alpha-Beta calls will help.
Notation: Position "P" is the root. P1 is the 1st child of the root. P17 is the 7th child of the 1st-child of the root. F1 is the "future" (potentially evaluated in parallel) of Minimax(P1). Lets assume there are only 9-children (1-9) per node to simplify notation. P1754 is the 4th child of the 5th child of the 7th child of the 1st child of the root. F1754 is minimax(P1754).
Here's the example code for Negamax Alpha-Beta from Chessprogramming.org for reference. I've modified it slightly to make this post easier to follow... I don't care about depth, and Position p is explicitly added so we can talk about the various nodes.
FutureExpression is a tree of futures composed of Max() functions and Negate() functions. A FutureSymbol is a value that will be returned by a Spawn eventually.
Code: Select all
FutureExpression alphaBeta(Position p, FutureExpression alpha, FutureExpression beta) {
for ( all moves "M" of Position P) {
FutureExpression score = Spawn(-alphaBeta(P.M -beta, -alpha));
if( score >= beta )
return beta; // fail hard beta-cutoff
alpha = max(score, alpha);
}
return alpha;
}
F = AlphaBeta(P, -Inf, +Inf); // Spawns 9 children, P1 through P9.
What does AlphaBeta do inside of the root node?
Code: Select all
F1 = AlphaBeta(P1, -Inf, +Inf); // Child1 is a TypeP node.
// Beta-cutoff is impossible, because Beta = +Inf. All children can be evaluated in parallel.
F2 = AlphaBeta(P2, -Inf, [F1, Negate] ); // F1 is "lazy", do not "wait" for F1 unless absolutely necessary.
F3 = AlphaBeta(P3, -Inf, [F1, F2, Max, Negate] );
F4 = AlphaBeta(P4, -Inf, [F1, F2, Max, F3, Max, Negate] );
F5 = AlphaBeta(P5, -Inf, [F1, F2, Max, F3, Max, F4, Max, Negate] ); // This statement "lazy evaluates" -(Max(F4, Max(F3, Max(F2, F1)))).
...
F9 = AlphaBeta(P9, -Inf, [F1, F2, Max, F3, Max, F4, Max, ... F8, Max, Negate]);
Note that F itself = [F1, F2, Max, F3, Max, F4, Max, ... F8, Max, F9, Max]; You can evaluate a FutureExpression by imagining a stack, pushing values onto the stack. "Max" takes the max of the previous two elements on the stack. So [F1, F2, Max] becomes Max(F1, F2). [F1, F2, Max, F3, Max] becomes Max(F3, Max(F1, F2)). Etc. etc.
Okay, so nifty. We can evaluate different parts of the tree in parallel, as long as we construct this tree of F-statements to "track" how we get back. But how does Beta-cutoff work? All we gotta do is see the next "level" of the tree. Lets take F5 and see how F51 through F59 are generated.
Code: Select all
Remember: F5 = AlphaBeta(P5, -Inf, [F1, F2, Max, F3, Max, F4, Max, Negate] );
* Alpha = -Inf.
* Beta = [F1, F2, Max, F3, Max, F4, Max, Negate]
F51 = AlphaBeta(P51, [F1, F2, Max, F3, Max, F4, Max, Negate, Negate], +Inf);
// Potential Beta Cutoff! if(F51 >= [F1, F2, Max, F3, Max, F4, Max, Negate, Negate]), then BetaCutoff.
// We must block. We must wait for F51, F1, F2, F3, F4 before we can continue this path
// If F51 >= Beta is false, then F52 is evaluated as follows:
F52 = AlphaBeta(P52, [F1, F2, Max, F3, Max, F4, Max, Negate, Negate], F51);
F52 through F59 are all potentially going to block as well. But most importantly, they won't be evaluated if F52 will not spawn until the Beta-cutoff F51 >= [F1, F2, Max, F3, Max, F4, Max, Negate, Negate] is evaluated. This clearly leads to work-efficiency.
For good measure, lets go down the list one more time, and see how F51 is evaluated.
Code: Select all
F511 = AlphaBeta(P511, -Inf, +[F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate]);
// Hey, look at that. F511 has -Inf for Beta, so all F511 through F519 can be evaluated in parallel.
F512 = AlphaBeta(P512, -Inf, +[F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate, F511, Max, Negate]);
F513 = AlphaBeta(P513, -Inf, +[F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate, F511, Max, F512, Max, Negate]);
F514 = AlphaBeta(P513, -Inf, +[F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate, F511, Max, F512, Max, F513, Max, Negate]);
...
F519 = AlphaBeta(P519, -Inf, +[F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate, F511, Max, F512, Max, F513, Max, ... F518, Max, Negate]);
F51 = [F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate, F511, Max, F512, Max, F513, Max, ... F518, Max, F519, Max]
We now have the FutureExpression for F51: [F1, F2, Max, F3, Max, F4, Max, Negate, Negate, Negate, F511, Max, F512, Max, F513, Max, ... F518, Max, F519, Max]. Representing Max (F519, Max(... Max(F512, Max(F511,- - -Max(F4, Max(F3, Max(F1, F2)))))
The triple-negate comes from the Negamax framework, negating each time a child is visited. We now see that the "Beta-Cutoff" blocking F52 will require F1, F2, F3, F5, F511, F512, F513... F519 to calculate. But we've generated a
LOT of nodes that can be processed in parallel still, so there's a lot of work the GPU can do.