Collecting the principal Variation

Fguy64 · Post by **Fguy64** » Sun Oct 18, 2009 8:00 pm

OK, I am trying to do collect the PV, and it is just not happening.

Just to set the stage, I am working with just alphabeta search, no qSearch, no iterative deepening, no hash table.
I use the terms maxPly to denote the initial search depth, and the term ply to denote the decreasing search depth at each call to alphaBeta, so ply goes from maxPlay to 0, at which point eval() is called. Finally, I am working with a single external non-recursive call to alphaBeta.

The PV moves are being stored in an array PV of length maxPly. Each time alpha is updated, a copy of the move being evaluated is saved in PV[maxPly-ply].

So anyways, the first move PV[0] is always the final move chosen, after that things are pretty much garbage. I imagine I am missing the point somewhere.

Any suggestions would be welcome.

Edmund · Post by **Edmund** » Sun Oct 18, 2009 8:29 pm

take a look at Bruce Morelands pages:
http://web.archive.org/web/200404270138 ... ing/pv.htm

the way you are doing it now requires to create a separate array for each PV-node. The reason for this is that you can never say that a certain line is a PV unless you have finished searching it. Just raising alpha doesn't yet mean you have found a new PV, so you are potentially overwriting the real PV.

bob · Post by **bob** » Sun Oct 18, 2009 9:02 pm

Fguy64 wrote:OK, I am trying to do collect the PV, and it is just not happening.

Just to set the stage, I am working with just alphabeta search, no qSearch, no iterative deepening, no hash table.
I use the terms maxPly to denote the initial search depth, and the term ply to denote the decreasing search depth at each call to alphaBeta, so ply goes from maxPlay to 0, at which point eval() is called. Finally, I am working with a single external non-recursive call to alphaBeta.

The PV moves are being stored in an array PV of length maxPly. Each time alpha is updated, a copy of the move being evaluated is saved in PV[maxPly-ply].

So anyways, the first move PV[0] is always the final move chosen, after that things are pretty much garbage. I imagine I am missing the point somewhere.

Any suggestions would be welcome.

Let's suppose you do a 5 ply search. When you get to ply=5, and change alpha, you need to store the current move in pv[5][5]. When you change alpha at ply=4, you need to copy best move to pv[4][4] and copy pv[5][5] to pv[4][5]. When you change alpha at ply=3, you copy current move to pv[3][3], and then copy pv[4][4&5] to pv[3][4&5].

When you get a new best move at ply 1, you complete this by putting current move in pv[1][1] and copying pv[2][2-5] to pv[1][2-5].

If you follow that, you are backing up the PV as you back up the scores. It works and has no measurable cost.

Fguy64 · Post by **Fguy64** » Sun Oct 18, 2009 9:39 pm

OK, I think I see where I was going wrong. The way I was doing it, I think the only way for me to get a proper PV is if it was the last line searched, in which case I might conclude that my move ordering needed work.

bob · Post by **bob** » Mon Oct 19, 2009 12:13 am

Fguy64 wrote:OK, I think I see where I was going wrong. The way I was doing it, I think the only way for me to get a proper PV is if it was the last line searched, in which case I might conclude that my move ordering needed work.

This is commonly called the "triangular array approach" since when you look at it, you use 1/2 of the total array. The entire first line is used. All but the first slot in the second, all but the first 2 in the third, etc...

Fguy64 · Post by **Fguy64** » Wed Oct 21, 2009 2:43 am

bob wrote:
Fguy64 wrote:OK, I am trying to do collect the PV, and it is just not happening.

Just to set the stage, I am working with just alphabeta search, no qSearch, no iterative deepening, no hash table.
I use the terms maxPly to denote the initial search depth, and the term ply to denote the decreasing search depth at each call to alphaBeta, so ply goes from maxPlay to 0, at which point eval() is called. Finally, I am working with a single external non-recursive call to alphaBeta.

The PV moves are being stored in an array PV of length maxPly. Each time alpha is updated, a copy of the move being evaluated is saved in PV[maxPly-ply].

So anyways, the first move PV[0] is always the final move chosen, after that things are pretty much garbage. I imagine I am missing the point somewhere.

Any suggestions would be welcome.
Let's suppose you do a 5 ply search. When you get to ply=5, and change alpha, you need to store the current move in pv[5][5]. When you change alpha at ply=4, you need to copy best move to pv[4][4] and copy pv[5][5] to pv[4][5]. When you change alpha at ply=3, you copy current move to pv[3][3], and then copy pv[4][4&5] to pv[3][4&5].

When you get a new best move at ply 1, you complete this by putting current move in pv[1][1] and copying pv[2][2-5] to pv[1][2-5].

If you follow that, you are backing up the PV as you back up the scores. It works and has no measurable cost.

Okay, I think it follows that if a move at any ply n < 5 results in an alpha update, then there will be PV moves to collect at each ply back to 5, but the update at ply = n does not mean there will be updates as we traverse back to ply = 1. And I think we can also say that if there is such a break, a lack of an update at any play on the way back to 1, then all the moves we collected so far on our way back will be discarded. And therefore if there is in update at ply=1, then there must be moves all the way back to 5 that need to be collected.

I think.

Fguy64 · Post by **Fguy64** » Sun Oct 25, 2009 3:54 am

Edmund wrote:take a look at Bruce Morelands pages:
http://web.archive.org/web/200404270138 ... ing/pv.htm

the way you are doing it now requires to create a separate array for each PV-node. The reason for this is that you can never say that a certain line is a PV unless you have finished searching it. Just raising alpha doesn't yet mean you have found a new PV, so you are potentially overwriting the real PV.

I'm having a hard time with this. OK I don't want to do a hash table yet, so I want to this approach. So am I right in saying that for each PV node I need an array big enough to store all the possible moves at a given node. Usually I just pick 100.

hgm · Post by **hgm** » Sun Oct 25, 2009 6:46 am

For each ply level in the tree you need such an array. Usually people define these as a global two-dimensional ("tri-diagonal") array PV[nodePly][movePly]. A node 5 ply from the root would then copy the best variation it finds to PV[5][5...N]. Move PV[5][5] would be its own best move, PV[5][6...N] would be copied from PV[6][6...N], which would be the best line from the daughter it just searched to obtain that move. So it would have to do the copying before it searches any other daughters, i.e. right at the time where it raises alpha.

The above assumes the arrays start counting at 0 for the root / root move. To not waste time on copying unused elements, N could be stored in an array N[nodePly] too, so that the PV of a node at level p (PV[p][x]) would run to N[p], and you set N[p] = N[p+1] at the time of copying. A stand-pat in QS would set N[p] = p-1.

An alternative implementation could make each row of the PV matrix PV[p][...] a local variable of the node at ply p, and would pass to the Search routine a pointer to this variable of the parent, so that the daughter (when it gets an in-window score) can copy its PV to the parent.

Fguy64 · Post by **Fguy64** » Sun Oct 25, 2009 2:42 pm

hgm wrote:For each ply level in the tree you need such an array. Usually people define these as a global two-dimensional ("tri-diagonal") array PV[nodePly][movePly]. A node 5 ply from the root would then copy the best variation it finds to PV[5][5...N]. Move PV[5][5] would be its own best move, PV[5][6...N] would be copied from PV[6][6...N], which would be the best line from the daughter it just searched to obtain that move. So it would have to do the copying before it searches any other daughters, i.e. right at the time where it raises alpha.

The above assumes the arrays start counting at 0 for the root / root move. To not waste time on copying unused elements, N could be stored in an array N[nodePly] too, so that the PV of a node at level p (PV[p][x]) would run to N[p], and you set N[p] = N[p+1] at the time of copying. A stand-pat in QS would set N[p] = p-1.

An alternative implementation could make each row of the PV matrix PV[p][...] a local variable of the node at ply p, and would pass to the Search routine a pointer to this variable of the parent, so that the daughter (when it gets an in-window score) can copy its PV to the parent.

OK, a two dimensional array of moves for each ply level then

Sven · Post by **Sven** » Sun Oct 25, 2009 7:24 pm

Fguy64 wrote:So am I right in saying that for each PV node I need an array big enough to store all the possible moves at a given node. Usually I just pick 100.

You don't store all the possible moves at a given node but the principal variation of a given node. And since you will probably never search 100 plies deep, I guess 100 is too high. Maybe you try an array of size [32][32].

This is how it looks like, with an array of size [4][4]:

Code: Select all

PV at ply 0&#58; xxx xxx xxx xxx
PV at ply 1&#58; --- xxx xxx xxx
PV at ply 2&#58; --- --- xxx xxx
PV at ply 3&#58; --- --- --- xxx

When detecting a new best move at ply N, the new PV at ply N consists of that new best move (written to entry [N][N]) and a full copy of the PV of ply N+1 (written to entries [N][N+1] .. [N][MaxN-1]). You will also need some "PV terminator" entry to indicate where a given PV ends (like a null byte for character strings), e.g. a null move.

EDIT: Please note, the meaning of my "N" is different from that in HGM's posting, his "N" is my "MaxN-1".

Sven

Collecting the principal Variation

Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation

Re: Collecting the principal Variation