Interpreting Alpha-Beta pseudo-code

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

plattyaj

Re: Interpreting Alpha-Beta pseudo-code

Post by plattyaj »

Fguy64 wrote:It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Obviously, from a purely technical point of view, if moves A & B are truly equal then it doesn't matter which one we play. What your code (currently) does is choose a random move. Well, you can do that by assigning a very small random element to the evaluation, and some engines do that - in fact, the Winboard protocol includes "random" as a command just for that! I wouldn't recommend it, though - you want to return the same thing each time if you can, so you can find bugs and tweak things.

Of course it's often the fact that we really don't think they are equal but that just means we have to change the evaluation so they aren't.

Andy.
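For what it's worth, the "very small random element" idea can be sketched in a few lines of Python. This is illustrative only: `evaluate_material`, `JITTER`, and the seed are all invented for the sketch, not taken from any real engine.

```python
import random

JITTER = 2  # maximum noise in centipawns; tiny compared to real eval terms


def evaluate_material(position):
    # stand-in for a real material/positional evaluation;
    # here "position" is just a pre-computed score for the sketch
    return position


def evaluate(position, rng=random.Random(42)):
    # add a small random component so moves that would otherwise
    # score identically are broken apart
    return evaluate_material(position) + rng.randint(-JITTER, JITTER)
```

Seeding the generator (as above) keeps runs reproducible, which partly answers the debugging objection; most engines simply leave the noise out.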
Fguy64
Posts: 814
Joined: Sat May 09, 2009 4:51 pm
Location: Toronto

Re: Interpreting Alpha-Beta pseudo-code

Post by Fguy64 »

plattyaj wrote:
Fguy64 wrote:It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Obviously, from a purely technical point of view, if moves A & B are truly equal then it doesn't matter which one we play.
OK, I agree, to a point. But I'll add that if they are truly equal, then why use a system that always chooses the same one?
plattyaj wrote: What your code (currently) does is choose a random move. Well, you can do that by assigning a very small random element to the evaluation, and some engines do that - in fact, the Winboard protocol includes "random" as a command just for that! I wouldn't recommend it, though - you want to return the same thing each time if you can, so you can find bugs and tweak things.
True enough, I do understand your point, but IMO your comment about finding bugs begs the question. If you were able to get an accurate eval of each level1 move (which apparently we can't), then that would resolve your point about bug tracking.
plattyaj wrote: Of course it's often the fact that we really don't think they are equal but that just means we have to change the evaluation so they aren't.

Andy.
If it were easy to do so, I'd treat them as equal and not force a decision based on move order, rather than look for (artificial?) ways to make them unequal.

Anyway, your points are well taken. As far as I can see, we are on the same page when it comes to the way things work, maybe not on the same page on the way we'd like things to work. That I can live with.
xsadar
Posts: 147
Joined: Wed Jun 06, 2007 10:01 am
Location: United States
Full name: Mike Leany

Re: Interpreting Alpha-Beta pseudo-code

Post by xsadar »

Fguy64 wrote:
Sven Schüle wrote:
Fguy64 wrote:As it stands now, I have lost one thing in going to method1. It is entirely reasonable to expect that more than one level1 move would have the same eval, and I'd prefer to select randomly from those moves, for variety. Now that sort of thing seems much clearer with method2, but I'm sure it can't be too hard to do it under method1, I just haven't tried to work it out yet.
It is not a good idea to consider several (root) moves that return the same evaluation, and choose randomly between all of them. Here is the problem.

Say you have moves A and B at the root which are searched in this order. Searching A with window (alpha, beta) returns value X which is the best value up to now. Now you search B with window (X, beta) and get X returned again, what does this mean?

The true value of B is "lower or equal than X", and it may happen frequently that it is lower than X but you will never get to know about it. As soon as the opponent has one response to B that evaluates at least to -X from his viewpoint, this causes a cutoff. The opponent might have a better response but this would be irrelevant, B is already refuted sufficiently so other responses are not even tried.

For this reason it is not suitable to compare the value of B, which may have been searched incompletely by intent due to a cutoff, to the value of A which has been searched completely since it has become the best move.

So you better save the effort and just track one best move, and only overwrite it if a new move improves the best value. Everything else might be counter-productive for playing strength.

Sven
OK I think I see what you mean. It sounds like the value that gets assigned after a cutoff can't be reliable? That would make sense.

It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Correct. If you want the strongest possible engine, you can't afford to worry about finding more than one possible best move. I wonder, is this the reason for your confusion in the other thread?
Fguy64
Posts: 814
Joined: Sat May 09, 2009 4:51 pm
Location: Toronto

Re: Interpreting Alpha-Beta pseudo-code

Post by Fguy64 »

xsadar wrote:
Fguy64 wrote:
Sven Schüle wrote:
Fguy64 wrote:As it stands now, I have lost one thing in going to method1. It is entirely reasonable to expect that more than one level1 move would have the same eval, and I'd prefer to select randomly from those moves, for variety. Now that sort of thing seems much clearer with method2, but I'm sure it can't be too hard to do it under method1, I just haven't tried to work it out yet.
It is not a good idea to consider several (root) moves that return the same evaluation, and choose randomly between all of them. Here is the problem.

Say you have moves A and B at the root which are searched in this order. Searching A with window (alpha, beta) returns value X which is the best value up to now. Now you search B with window (X, beta) and get X returned again, what does this mean?

The true value of B is "lower or equal than X", and it may happen frequently that it is lower than X but you will never get to know about it. As soon as the opponent has one response to B that evaluates at least to -X from his viewpoint, this causes a cutoff. The opponent might have a better response but this would be irrelevant, B is already refuted sufficiently so other responses are not even tried.

For this reason it is not suitable to compare the value of B, which may have been searched incompletely by intent due to a cutoff, to the value of A which has been searched completely since it has become the best move.

So you better save the effort and just track one best move, and only overwrite it if a new move improves the best value. Everything else might be counter-productive for playing strength.

Sven
OK I think I see what you mean. It sounds like the value that gets assigned after a cutoff can't be reliable? That would make sense.

It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Correct. If you want the strongest possible engine, you can't afford to worry about finding more than one possible best move. I wonder, is this the reason for your confusion in the other thread?
hmmm, I'm not sure which of several confusions you refer to. :) No, I do not think this was my main problem, although it may have contributed in the sense that it gave me a reason (invalid though that reason was) to use a more complicated method (method2).

I think my main problem in the other thread was simply an incorrect application of the algorithm. In the act of separating out the root search, perhaps somehow something wasn't being negated somewhere. So I ended up having no problem solving checkmates, but my choices based on material eval were out to lunch.

Anyway, I do intend to get both methods working. But first I need to come up with a better moveGen that incorporates promotion, so as to simplify the coding of alphaBeta. Which I guess means separate move legality code for engine and GUI. As mentioned, I really haven't made much attempt to separate the two.


p.s. on second thought you do have a point that may have caused the bad material choices. I guess I will find out later.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Interpreting Alpha-Beta pseudo-code

Post by Sven »

plattyaj wrote:
Fguy64 wrote:It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Obviously, from a purely technical point of view, if moves A & B are truly equal then it doesn't matter which one we play. What your code (currently) does is choose a random move. Well, you can do that by assigning a very small random element to the evaluation, and some engines do that - in fact, the Winboard protocol includes "random" as a command just for that! I wouldn't recommend it, though - you want to return the same thing each time if you can, so you can find bugs and tweak things.

Of course it's often the fact that we really don't think they are equal but that just means we have to change the evaluation so they aren't.

Andy.
You did not get the point yet. If move A is searched and becomes the new best move with value X, and later on move B is searched and gets the same value X using the AlphaBeta algorithm, then everything we know is:

- the "true value" of A is == X, but
- the "true value" of B is <= X (and may be much, much lower).

So it is very dangerous to handle B as "equal to A". No serious engine would do this.

Sven
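Sven's point can be made concrete with a tiny fail-hard negamax over a made-up two-ply tree (all node names and leaf scores below are invented for illustration): B's true value is well below X, yet the narrowed-window search returns X for it, so at the root it looks "equal" to A.

```python
INF = 10_000

# Toy tree: root moves "A" and "B"; their children are leaf positions
# scored from the root player's point of view (the root player is also
# the side to move at the leaves, two plies down).
TREE = {"root": ["A", "B"], "A": ["a1", "a2"], "B": ["b1", "b2"]}
LEAF = {"a1": 10, "a2": 30, "b1": 5, "b2": 100}


def negamax(node, alpha, beta):
    """Fail-hard negamax: a returned beta means only 'true value >= beta',
    and a returned alpha means only 'true value <= alpha'."""
    children = TREE.get(node)
    if children is None:
        return LEAF[node]
    best = alpha
    for child in children:
        score = -negamax(child, -beta, -best)
        if score >= beta:
            return beta  # cutoff: the remaining children are never searched
        if score > best:
            best = score
    return best


# Root search tracking a single best move, as Sven suggests.
best_move, best = None, -INF
for move in TREE["root"]:
    score = -negamax(move, -INF, -best)
    if score > best:
        best, best_move = score, move
```

Here A scores 10 and becomes best; B is then searched with window (10, INF) and also comes back as 10, because the opponent's reply b1 (worth only 5 to us) already refutes it and cuts off. Searched with the full window instead, B's true value is 5, much lower than A's 10, so treating the two "10"s as equal would sometimes play a clearly worse move.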
xsadar
Posts: 147
Joined: Wed Jun 06, 2007 10:01 am
Location: United States
Full name: Mike Leany

Re: Interpreting Alpha-Beta pseudo-code

Post by xsadar »

Fguy64 wrote:
xsadar wrote:
Fguy64 wrote: OK I think I see what you mean. It sounds like the value that gets assigned after a cutoff can't be reliable? That would make sense.

It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Correct. If you want the strongest possible engine, you can't afford to worry about finding more than one possible best move. I wonder, is this the reason for your confusion in the other thread?
hmmm, I'm not sure which of several confusions you refer to. :)
. . . [snipped] . . .
I was referring to your confusion about why we would do beta cutoffs when there were still moves (specifically underpromotions) which could potentially lead to a higher score. It's because of that cutoff that we can't randomly select from moves at the root that return the same score (because the real score for all but the first may be much lower than what alpha-beta returns).
Fguy64
Posts: 814
Joined: Sat May 09, 2009 4:51 pm
Location: Toronto

Re: Interpreting Alpha-Beta pseudo-code

Post by Fguy64 »

Sven Schüle wrote:
plattyaj wrote:
Fguy64 wrote:It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Obviously, from a purely technical point of view, if moves A & B are truly equal then it doesn't matter which one we play. What your code (currently) does is choose a random move. Well, you can do that by assigning a very small random element to the evaluation, and some engines do that - in fact, the Winboard protocol includes "random" as a command just for that! I wouldn't recommend it, though - you want to return the same thing each time if you can, so you can find bugs and tweak things.

Of course it's often the fact that we really don't think they are equal but that just means we have to change the evaluation so they aren't.

Andy.
You did not get the point yet. If move A is searched and becomes the new best move with value X, and later on move B is searched and gets the same value X using the AlphaBeta algorithm, then everything we know is:

- the "true value" of A is == X, but
- the "true value" of B is <= X (and may be much, much lower).

So it is very dangerous to handle B as "equal to A". No serious engine would do this.

Sven
I'm not sure if you are addressing me or Andy, but if I didn't get it before, I get it now.

My interpretation of Andy's equal remarks was that he was speaking in the "real chessic sense" of two moves being equal, as opposed to the "virtual sense" of the value being assigned by the algorithm. In other words, I didn't see a conflict between what Andy meant and what you meant. But I suppose we should wait and see what Andy meant.
Fguy64
Posts: 814
Joined: Sat May 09, 2009 4:51 pm
Location: Toronto

Re: Interpreting Alpha-Beta pseudo-code

Post by Fguy64 »

xsadar wrote:
Fguy64 wrote:
xsadar wrote:
Fguy64 wrote: OK I think I see what you mean. It sounds like the value that gets assigned after a cutoff can't be reliable? That would make sense.

It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Correct. If you want the strongest possible engine, you can't afford to worry about finding more than one possible best move. I wonder, is this the reason for your confusion in the other thread?
hmmm, I'm not sure which of several confusions you refer to. :)
. . . [snipped] . . .
I was referring to your confusion about why we would do beta cutoffs when there were still moves (specifically underpromotions) which could potentially lead to a higher score. It's because of that cutoff that we can't randomly select from moves at the root that return the same score (because the real score for all but the first may be much lower than what alpha-beta returns).
Well, hmm, OK. In alphaBeta search, at each node we iterate through a list of possible moves. I am beginning to see that when you create four possible moves (one for each promotion) within the same iteration and then apply a beta cutoff test after each promotion, you are indeed running that risk. So I still say I was right about cutting off a desirable underpromotion.

The solution, which I am now working on, is to remove promotion from the search, as I describe in an earlier post in this thread. By doing so, you ensure that each iteration contains only one move, and not four in the case of promotion.

make sense?
plattyaj

Re: Interpreting Alpha-Beta pseudo-code

Post by plattyaj »

Fguy64 wrote:My interpretation of Andy's equal remarks was that he was speaking in the "real chessic sense" of two moves being equal, as opposed to the "virtual sense" of the value being assigned by the algorithm. In other words, I didn't see a conflict between what Andy meant and what you meant. But I suppose we should wait and see what Andy meant.
Yes I was. I literally meant two scores from the evaluation at the leaves. Backed up scores have the added "problem" of not necessarily being exact scores but being bounds (in most cases). There is no conflict.

Andy.
xsadar
Posts: 147
Joined: Wed Jun 06, 2007 10:01 am
Location: United States
Full name: Mike Leany

Re: Interpreting Alpha-Beta pseudo-code

Post by xsadar »

Fguy64 wrote:
xsadar wrote:
Fguy64 wrote:
xsadar wrote:
Fguy64 wrote: OK I think I see what you mean. It sounds like the value that gets assigned after a cutoff can't be reliable? That would make sense.

It is starting to sound like a bit of a tradeoff. If you want the speed benefits of full alphabeta, then you have to give something up, and that is the ability to consider two moves at the same level as equal.
Correct. If you want the strongest possible engine, you can't afford to worry about finding more than one possible best move. I wonder, is this the reason for your confusion in the other thread?
hmmm, I'm not sure which of several confusions you refer to. :)
. . . [snipped] . . .
I was referring to your confusion about why we would do beta cutoffs when there were still moves (specifically underpromotions) which could potentially lead to a higher score. It's because of that cutoff that we can't randomly select from moves at the root that return the same score (because the real score for all but the first may be much lower than what alpha-beta returns).
Well, hmm, OK. In alphaBeta search, at each node we iterate through a list of possible moves. I am beginning to see that when you create four possible moves (one for each promotion) within the same iteration and then apply a beta cutoff test after each promotion, you are indeed running that risk. So I still say I was right about cutting off a desirable underpromotion.

The solution, which I am now working on, is to remove promotion from the search, as I describe in an earlier post in this thread. By doing so, you ensure that each iteration contains only one move, and not four in the case of promotion.

make sense?
Nope, doesn't make sense. I agree that one move per iteration, and keeping promotion code out of the search function is a cleaner, better way of doing things. However, I don't know what risk you think you're running by having a beta cutoff test after each promotion. It sounds to me like the loop semantics are confusing you somehow.

If you have four moves (promotions or otherwise) within a single iteration of the loop, you have to do the same things -- including the beta cutoff test -- for each move, and not just for each iteration. Otherwise it will not do the same thing as when you have only one move per iteration. And if there were any risk you were running by having a beta cutoff test after each promotion within one iteration, you would be running the same risk when each promotion has its own iteration.
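The equivalence can be sketched in a few lines of Python (everything here is invented for illustration: `expand`, the move strings, and the `scores` lookup, which stands in for a recursive `-negamax(...)` call). Whether a promotion contributes four moves inside one loop iteration or one move per iteration, the cutoff test simply runs after every individual move, and the two loops behave identically.

```python
PROMOTIONS = ["q", "r", "b", "n"]


def expand(move):
    # a pawn move to the last rank stands for four distinct moves
    if move.endswith("8"):
        return [move + "=" + p for p in PROMOTIONS]
    return [move]


def search(moves, scores, alpha, beta):
    """Fail-hard move loop; scores[m] stands in for -negamax(child)."""
    for move in moves:
        for m in expand(move):  # 1 move, or 4 for a promotion
            score = scores[m]
            if score >= beta:
                return beta  # test after EVERY move, even mid-iteration
            if score > alpha:
                alpha = score
    return alpha


def search_flat(moves, scores, alpha, beta):
    """Same search with promotions pre-expanded to one move per iteration."""
    flat = [m for move in moves for m in expand(move)]
    return search(flat, scores, alpha, beta)
```

Both versions cut off at exactly the same point and return exactly the same value, which is xsadar's point: moving promotion expansion out of the search is a cleanliness improvement, not a correctness one.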