Collecting Principal variation

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Collecting Principal variation

Post by bob »

Daniel Shawul wrote:
Sven Schüle wrote:
Daniel Shawul wrote:
Sven Schüle wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Yes. Furthermore, a PV is also not just one move but a sequence of moves. This already implies that two dimensions are required, regardless which search algorithm is used.
I thought you guys understood that negascout requires just pv[MAX_PLY] as the OP stated, when we had this discussion some years ago http://talkchess.com/forum/viewtopic.ph ... triangular. The only thing problematic were search instabilties as HG pointed out. Try the tscp-negascout program i provided with both the pv[MAX_PLY] and pv[MAX_PLY][MAX_PLY] approach, and you will see both gives the same result.
That discussion you link to did not have the result you make us believe today. It had the result that in a program as simple as TSCP (no TT, no search instability) it worked for you to replace the triangular PV approach by a linear approach, shown by you by playing few games. But in that same thread you also wrote:
Daniel Shawul wrote:Howver scorpio does get different pvs so there must be exceptions.
There was no agreement to your statement that a linear approach should be sufficient with negascout apart from agreeing that it seems to work in a modified TSCP. Also from a theoretical viewpoint Bob and myself disagreed to you, same as today. Therefore I see no point in asking why "we guys" did not understand that negascout would require a linear PV implementation only. There was no proof, nothing that we agreed upon.

It is obvious, and has also been explained at least by Bob back then, that a re-search of a move after failing high in a zero window can lead to a new *local* PV of the current node that has to replace the previous *local* PV. In this case you can definitely overwrite information that should not be overwritten, or get inconsistent PV results, if you do not keep one PV per ply.
It is 2015 now Sven and still counting ... Break the tscp-negascout code if you can so that pv[N] and pv[N][N] give different results.
Well I just felt compelled to reply when I see a newbie being fed wrong information...
It IS 2015. And nobody is using a simple no hash, no reduction, no pruning program today. If they start on the wrong path, they will have to backtrack. May as well do it the way that works for any approach, not just works for one simplistic approach.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Collecting Principal variation

Post by jwes »

bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Collecting Principal variation

Post by bob »

jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Collecting Principal variation

Post by jwes »

bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.
You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Collecting Principal variation

Post by Sven »

jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.
You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
I don't follow. You have searched the subtree of root move e4, so your root PV starts with e4. Now you are searching root move d4 with a zero window. Opponent has, say, two replies: d5 and Nf6. Both fail low for black, therefore d4 fails high for white. You re-search d4 with an open window. No change to the PV so far. Now d5 delivers a score between alpha and beta. So the one-dimensional PV is now changed, "expecting" that the current root move d4 followed by d5 would become the new PV. Now PV[0]=e4, PV[1]=d5 (already inconsistent). But then the second black reply, Nf6, fails high and refutes d4. Who would recreate the previous (and still desired) PV belonging to 1.e4? Nobody, the root PV is now broken.

The point is that 1...d5 is a local PV of the node after 1.d4. It is not known whether it will become part of the new root PV until the whole subtree of 1.d4 is completed and its score is backed up to the root to compare it to the current score of the root PV (1.e4). Either the whole local PV starting with 1...d5 will become new root PV, concatenated with 1...d5 as starting move, or it will be skipped as a whole. But it can't be copied over the corresponding part of the root PV until you know it is correct to do so, i.e. only the root node may modify the root PV, so you need to remember this local PV as a candidate for the PV of the node one level above. And since the same problem applies to all lower levels as well, so the problem is recursive, you need to keep one PV per ply.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Collecting Principal variation

Post by Sven »

Sven Schüle wrote:Either the whole local PV starting with 1...d5 will become new root PV, concatenated with 1...d5 as starting move
Edit now impossible but the second occurrence of "1...d5" in the phrase above should have been "1.d4", of course.
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: Collecting Principal variation

Post by jwes »

Sven Schüle wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.
You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
I don't follow. You have searched the subtree of root move e4, so your root PV starts with e4. Now you are searching root move d4 with a zero window. Opponent has, say, two replies: d5 and Nf6. Both fail low for black, therefore d4 fails high for white. You re-search d4 with an open window. No change to the PV so far. Now d5 delivers a score between alpha and beta. So the one-dimensional PV is now changed, "expecting" that the current root move d4 followed by d5 would become the new PV. Now PV[0]=e4, PV[1]=d5 (already inconsistent). But then the second black reply, Nf6, fails high and refutes d4. Who would recreate the previous (and still desired) PV belonging to 1.e4? Nobody, the root PV is now broken.

The point is that 1...d5 is a local PV of the node after 1.d4. It is not known whether it will become part of the new root PV until the whole subtree of 1.d4 is completed and its score is backed up to the root to compare it to the current score of the root PV (1.e4). Either the whole local PV starting with 1...d5 will become new root PV, concatenated with 1...d5 as starting move, or it will be skipped as a whole. But it can't be copied over the corresponding part of the root PV until you know it is correct to do so, i.e. only the root node may modify the root PV, so you need to remember this local PV as a candidate for the PV of the node one level above. And since the same problem applies to all lower levels as well, so the problem is recursive, you need to keep one PV per ply.
The problem at the root is why you need two arrays, one for the current PV and one for a possible new PV. Below the root, when you have a fail-high followed by a fail-low, you re-search all the moves which will regenerate the PV.
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: Collecting Principal variation

Post by syzygy »

jwes wrote:You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
Why would you re-search the PV move that you have already completely searched before? Seems a bit expensive just for reconstructing the PV that got clobbered only because you don't use a triangular PV.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Collecting Principal variation

Post by bob »

syzygy wrote:
jwes wrote:You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
Why would you re-search the PV move that you have already completely searched before? Seems a bit expensive just for reconstructing the PV that got clobbered only because you don't use a triangular PV.
probably the best idea would be to try to do this for the search value without having a new local variable for each ply of search...

Then the "why" stands out clearly.
User avatar
hgm
Posts: 27869
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Collecting Principal variation

Post by hgm »

syzygy wrote:Why would you re-search the PV move that you have already completely searched before? Seems a bit expensive just for reconstructing the PV that got clobbered only because you don't use a triangular PV.
I think the original drive for investigating this method was not just for keeping the PV moves, but for the entire state of the PV nodes. That would include the sorted move lists, possibly sub-tree sizes, etc., basically everything you would also keep track of in the root node. That can easily amount to 1KB per entry, and would make the cost of copying from candidate PV to actual PV at each level slightly more than trivial.

But, as remarked, the method is not robust against search instability.

Besides, there is an alternative solution that is robust. Which is storing the PV-node infos as a linked list, rather than a triangular array. Each level would only have to keep track of the first and the last cell of its candidate PV, and backs up this info to its parent in case it returns with a PV score. The parent then frees his previous own candidate PV by inserting it in front of the free list, allocates one new cell for the current move, and appends the PV it received from its daughter to that, to create its new PV candidate. When an open-window node fails high, its candiate PV goes back to the free list.

No need to ever copy any of the actual info; just pass the two pointers/cell numbers around. This would even be faser if the info was just the moves. (But would require double the storage in that case, for the cell links. And even when copying all the moves, the required time would still be immesurably small, because it happens so infrequently.))