Collecting Principal variation

bob · Post by **bob** » Wed Apr 01, 2015 4:38 am

Daniel Shawul wrote:
Sven Schüle wrote:
Daniel Shawul wrote:
Sven Schüle wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Yes. Furthermore, a PV is also not just one move but a sequence of moves. This already implies that two dimensions are required, regardless which search algorithm is used.
I thought you guys understood that negascout requires just pv[MAX_PLY] as the OP stated, when we had this discussion some years ago http://talkchess.com/forum/viewtopic.ph ... triangular. The only thing problematic were search instabilties as HG pointed out. Try the tscp-negascout program i provided with both the pv[MAX_PLY] and pv[MAX_PLY][MAX_PLY] approach, and you will see both gives the same result.
That discussion you link to did not have the result you make us believe today. It had the result that in a program as simple as TSCP (no TT, no search instability) it worked for you to replace the triangular PV approach by a linear approach, shown by you by playing few games. But in that same thread you also wrote:
Daniel Shawul wrote:Howver scorpio does get different pvs so there must be exceptions.
There was no agreement to your statement that a linear approach should be sufficient with negascout apart from agreeing that it seems to work in a modified TSCP. Also from a theoretical viewpoint Bob and myself disagreed to you, same as today. Therefore I see no point in asking why "we guys" did not understand that negascout would require a linear PV implementation only. There was no proof, nothing that we agreed upon.

It is obvious, and has also been explained at least by Bob back then, that a re-search of a move after failing high in a zero window can lead to a new *local* PV of the current node that has to replace the previous *local* PV. In this case you can definitely overwrite information that should not be overwritten, or get inconsistent PV results, if you do not keep one PV per ply.
It is 2015 now Sven and still counting ... Break the tscp-negascout code if you can so that pv[N] and pv[N][N] give different results.
Well I just felt compelled to reply when I see a newbie being fed wrong information...

It IS 2015. And nobody is using a simple no hash, no reduction, no pruning program today. If they start on the wrong path, they will have to backtrack. May as well do it the way that works for any approach, not just works for one simplistic approach.

jwes · Post by **jwes** » Thu Apr 02, 2015 2:46 am

bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.

Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?

bob · Post by **bob** » Thu Apr 02, 2015 4:51 am

jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?

That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.

jwes · Post by **jwes** » Thu Apr 02, 2015 9:10 pm

bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.

You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.

Sven · Post by **Sven** » Fri Apr 03, 2015 12:08 am

jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.
You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.

I don't follow. You have searched the subtree of root move e4, so your root PV starts with e4. Now you are searching root move d4 with a zero window. Opponent has, say, two replies: d5 and Nf6. Both fail low for black, therefore d4 fails high for white. You re-search d4 with an open window. No change to the PV so far. Now d5 delivers a score between alpha and beta. So the one-dimensional PV is now changed, "expecting" that the current root move d4 followed by d5 would become the new PV. Now PV[0]=e4, PV[1]=d5 (already inconsistent). But then the second black reply, Nf6, fails high and refutes d4. Who would recreate the previous (and still desired) PV belonging to 1.e4? Nobody, the root PV is now broken.

The point is that 1...d5 is a local PV of the node after 1.d4. It is not known whether it will become part of the new root PV until the whole subtree of 1.d4 is completed and its score is backed up to the root to compare it to the current score of the root PV (1.e4). Either the whole local PV starting with 1...d5 will become new root PV, concatenated with 1...d5 as starting move, or it will be skipped as a whole. But it can't be copied over the corresponding part of the root PV until you know it is correct to do so, i.e. only the root node may modify the root PV, so you need to remember this local PV as a candidate for the PV of the node one level above. And since the same problem applies to all lower levels as well, so the problem is recursive, you need to keep one PV per ply.

Sven · Post by **Sven** » Fri Apr 03, 2015 12:27 am

Sven Schüle wrote:Either the whole local PV starting with 1...d5 will become new root PV, concatenated with 1...d5 as starting move

Edit now impossible but the second occurrence of "1...d5" in the phrase above should have been "1.d4", of course.

jwes · Post by **jwes** » Fri Apr 03, 2015 2:32 am

Sven Schüle wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:
bob wrote:
jwes wrote:Using PVS, once a value between alpha and beta is returned the rest of the tree is searched with a zero window unless and until it is proved there is a better score. I believe this means the PV will always be the last move at each ply with a score between alpha and beta.
Problem is there are LOTS of times where something trips the PVS window and requires a re-search. But that entire path never becomes a new PV. There really isn't a short cut to having a PV per ply.
Doesn't this mean that we haven't found the PV yet? This path may not become the PV, but the previous path is also not the PV.
You don't know, however. That is the problem. The only way you really know you have a new pv is if it is backed up to the root along with the score it produced.
Under what circumstances can you have an evaluation return a value between alpha and beta and have some line that was searched before become the PV without re-searching it?
That's the problem. You can get fail highs BELOW the root position, which will change the PV array but as you back up you discover there is another way to refute that branch. But you have already clobbered the old PV. This is a product of the frequent search instabilities everyone sees with LMR/LMP/null-move search.
You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
I don't follow. You have searched the subtree of root move e4, so your root PV starts with e4. Now you are searching root move d4 with a zero window. Opponent has, say, two replies: d5 and Nf6. Both fail low for black, therefore d4 fails high for white. You re-search d4 with an open window. No change to the PV so far. Now d5 delivers a score between alpha and beta. So the one-dimensional PV is now changed, "expecting" that the current root move d4 followed by d5 would become the new PV. Now PV[0]=e4, PV[1]=d5 (already inconsistent). But then the second black reply, Nf6, fails high and refutes d4. Who would recreate the previous (and still desired) PV belonging to 1.e4? Nobody, the root PV is now broken.

The point is that 1...d5 is a local PV of the node after 1.d4. It is not known whether it will become part of the new root PV until the whole subtree of 1.d4 is completed and its score is backed up to the root to compare it to the current score of the root PV (1.e4). Either the whole local PV starting with 1...d5 will become new root PV, concatenated with 1...d5 as starting move, or it will be skipped as a whole. But it can't be copied over the corresponding part of the root PV until you know it is correct to do so, i.e. only the root node may modify the root PV, so you need to remember this local PV as a candidate for the PV of the node one level above. And since the same problem applies to all lower levels as well, so the problem is recursive, you need to keep one PV per ply.

The problem at the root is why you need two arrays, one for the current PV and one for a possible new PV. Below the root, when you have a fail-high followed by a fail-low, you re-search all the moves which will regenerate the PV.

syzygy · Post by **syzygy** » Fri Apr 03, 2015 3:11 am

jwes wrote:You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.

Why would you re-search the PV move that you have already completely searched before? Seems a bit expensive just for reconstructing the PV that got clobbered only because you don't use a triangular PV.

bob · Post by **bob** » Fri Apr 03, 2015 4:01 am

syzygy wrote:
jwes wrote:You get a fail high below the root, so you open the window and re-search the move, and it fails low while clobbering the PV. But then you re-search the PV move which recreates the PV. You might need to do something to keep a hash hit from truncating the PV.
Why would you re-search the PV move that you have already completely searched before? Seems a bit expensive just for reconstructing the PV that got clobbered only because you don't use a triangular PV.

probably the best idea would be to try to do this for the search value without having a new local variable for each ply of search...

Then the "why" stands out clearly.

hgm · Post by **hgm** » Fri Apr 03, 2015 9:30 am

syzygy wrote:Why would you re-search the PV move that you have already completely searched before? Seems a bit expensive just for reconstructing the PV that got clobbered only because you don't use a triangular PV.

I think the original drive for investigating this method was not just for keeping the PV moves, but for the entire state of the PV nodes. That would include the sorted move lists, possibly sub-tree sizes, etc., basically everything you would also keep track of in the root node. That can easily amount to 1KB per entry, and would make the cost of copying from candidate PV to actual PV at each level slightly more than trivial.

But, as remarked, the method is not robust against search instability.

Besides, there is an alternative solution that is robust. Which is storing the PV-node infos as a linked list, rather than a triangular array. Each level would only have to keep track of the first and the last cell of its candidate PV, and backs up this info to its parent in case it returns with a PV score. The parent then frees his previous own candidate PV by inserting it in front of the free list, allocates one new cell for the current move, and appends the PV it received from its daughter to that, to create its new PV candidate. When an open-window node fails high, its candiate PV goes back to the free list.

No need to ever copy any of the actual info; just pass the two pointers/cell numbers around. This would even be faser if the info was just the moves. (But would require double the storage in that case, for the cell links. And even when copying all the moves, the required time would still be immesurably small, because it happens so infrequently.))

Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation

Re: Collecting Principal variation