Can principal variation search be worth no Elo?

algerbrex · Post by **algerbrex** » Fri Sep 24, 2021 9:15 pm

emadsen wrote: ↑Fri Sep 24, 2021 7:56 pm
algerbrex wrote: ↑Fri Sep 24, 2021 7:45 pm Right, that's what I've heard, but shouldn't PVS be a decent Elo gain even without LMR?
Perhaps some, but I wouldn't expect much.

In my opinion, PVS doesn't make any sense without LMR. The objective is to quickly prove no other move beats the best-known move, either from TT.bestMove or from move ordering. We do that through the combination of searching 1) a zero-width window at 2) reduced depth. You can experiment how much to reduce based on quiet move number. Perhaps reduce 1 at quiet move 3, reduce 2 at quiet move 7, reduce 3 at quiet move 13? Experiment with it. Of course, quiet moves must be ordered according to the history heuristic.
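
As a rough sketch of the reduction schedule suggested in the quote above: the function name `lmr_reduction` is made up for this example, and the thresholds (3, 7, 13) are only the suggested starting points, not tuned values.

```c
/* Hypothetical helper: how many plies to reduce a quiet move by, given its
 * 1-based position among the (history-ordered) quiet moves at this node.
 * The cut-offs are the untested starting points from the post above. */
int lmr_reduction(int quiet_move_number)
{
    if (quiet_move_number >= 13)
        return 3;
    if (quiet_move_number >= 7)
        return 2;
    if (quiet_move_number >= 3)
        return 1;
    return 0; /* the first two quiet moves are searched at full depth */
}
```

The search would then call something like `depth - 1 - lmr_reduction(n)` for the n-th quiet move; whether these numbers gain anything has to be measured per engine.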

Hmm, that's interesting. Thanks. My understanding has always been that PVS and LMR, while they complemented each other nicely, didn't have to be implemented together to get a noticeable gain. But if they're as intertwined as you imply, then that might help to explain why it isn't showing up as a gain for me.

I'll keep experimenting with what I have right now, but my plan going forward will be to get the dev branch up to speed with Blunder 6.0.0, minus the pruning and bugs, and then see if I can get a noticeable gain from PVS + LMR.

Also to address your post about doPVS, my understanding was that we should wait for the first PV Move, not just the first move before we start doing PVS. I've seen some people do it the way you suggest, and I can understand how that would work since hopefully your move ordering is so good that the first 2-3 are the best move, but that's just been my understanding.

mvanthoor · Post by **mvanthoor** » Fri Sep 24, 2021 10:21 pm

In the meantime, I also read the same search code. While I don't want to try and discredit the things you say, I do have some opinions to offer.

1. I've often seen this implementation of alpha/beta in pseudo-code (here for example) where the loop is broken on a beta-cutoff. In these implementations, often "best_score" and "best_move" are set, the alpha/beta sections set the flags, and right after the break, the transposition table is written. Very often, these are fail-soft alpha-beta methods.

Obviously, your option of writing the TT immediately and returning beta immediately also works, assuming at the end of alpha-beta, alpha is returned. That would be a fail-hard alpha-beta method. I converted my own alpha/beta method to fail-soft, in the hopes that it will work better with aspiration windows... AW's didn't work with the fail-hard method. (This was exactly as you describe it: write the TT immediately on a beta-cutoff and return beta, return alpha at the end of the function. I now implement the fail-soft method as shown in the link above. No difference in playing strength or result, but _maybe_ it'll work with aspiration windows...)

I agree, however, that it is confusing to set alpha = score and then return alpha after the break. It's clearer to EITHER follow your suggestion and write the TT and return beta immediately, OR do what I'm doing, which is setting "best_score = score" and "best_move = move" before doing the comparison to alpha and beta, and returning "best_score" at the end of the function, creating a fail-soft method.
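
To make the difference concrete, here is a minimal sketch of both variants on a toy tree. The tree (3 moves per node, 2 plies deep, hard-coded leaf scores) and the function names are assumptions for illustration only; TT writes are left out and only marked with a comment.

```c
#define BRANCH 3
#define DEPTH  2
#define INF    1000000

/* Toy game tree: complete, BRANCH moves per node, DEPTH plies deep.
 * Leaves are scored from the point of view of the side to move at the leaf. */
static const int leaves[9] = { 3, 5, 4,   -2, 8, 9,   6, 1, 7 };

/* Fail-hard: return beta on a cutoff and alpha at the end, so the result
 * never leaves the [alpha, beta] window. */
int fail_hard(int node, int depth, int alpha, int beta)
{
    if (depth == 0)
        return leaves[node];
    for (int i = 0; i < BRANCH; i++) {
        int score = -fail_hard(node * BRANCH + i, depth - 1, -beta, -alpha);
        if (score >= beta)
            return beta;   /* a real engine would store a lower bound in the TT here */
        if (score > alpha)
            alpha = score;
    }
    return alpha;
}

/* Fail-soft: track best_score separately and return it, so the result may
 * lie outside the window and gives a tighter bound. */
int fail_soft(int node, int depth, int alpha, int beta)
{
    if (depth == 0)
        return leaves[node];
    int best_score = -INF;
    for (int i = 0; i < BRANCH; i++) {
        int score = -fail_soft(node * BRANCH + i, depth - 1, -beta, -alpha);
        if (score > best_score) {
            best_score = score;
            if (score > alpha) {
                alpha = score;
                if (alpha >= beta)
                    break; /* beta cutoff; the TT write would follow the loop */
            }
        }
    }
    return best_score;
}
```

With a full window both return 3 for this tree. With the window (4, 6) the fail-hard version returns 4 (just "at most alpha"), while the fail-soft version returns 3, which is the extra information an aspiration window can use when deciding how far to widen.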

2. Didn't he make sure of this within the TT? There's a comparison to the set bounds in the TT-code.

3. DoPVS seems to be correct. Same version as was on the Bruce Moreland site:

Code: Select all

int AlphaBeta(int depth, int alpha, int beta)
{
    BOOL fFoundPv = FALSE;

    if (depth == 0)
        return Evaluate();
    GenerateLegalMoves();
    while (MovesLeft()) {
        MakeNextMove();
        if (fFoundPv) {
            val = -AlphaBeta(depth - 1, -alpha - 1, -alpha);
            if ((val > alpha) && (val < beta)) // Check for failure.
                val = -AlphaBeta(depth - 1, -beta, -alpha);
        } else
            val = -AlphaBeta(depth - 1, -beta, -alpha);
        UnmakeMove();
        if (val >= beta)
            return beta;
        if (val > alpha) {
            alpha = val;
            fFoundPv = TRUE;
        }
    }
    return alpha;
}

I use this same method. In my engine, it yields 54 Elo without LMR. It finds the first move that is a PV move, and then it switches to PVS trying to prove that all other moves are not PV-moves. If a PV move is found, it is re-searched full width, and if it's then still a PV move, the method switches back to PVS again, because alpha will (again) be raised during the re-search. This should work; it does so in my engine.
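
For anyone who wants to try the snippet above outside an engine, here is a rough self-contained version on a toy tree. Only the search logic follows the snippet; the `BRANCH` constant, the hash-based `leaf_score`, the node numbering and the `nodes_visited` counter are all made up for illustration.

```c
#define BRANCH 4
#define INF    1000000

static long nodes_visited; /* reset before a search to compare node counts */

/* Deterministic pseudo-random score in [-100, 100]; a toy stand-in for Evaluate(). */
static int leaf_score(unsigned node)
{
    unsigned h = node * 2654435761u;
    h ^= h >> 15;
    return (int)(h % 201u) - 100;
}

/* Plain fail-hard alpha-beta, used as the reference. */
int alpha_beta(unsigned node, int depth, int alpha, int beta)
{
    nodes_visited++;
    if (depth == 0)
        return leaf_score(node);
    for (unsigned i = 0; i < BRANCH; i++) {
        int val = -alpha_beta(node * BRANCH + i + 1, depth - 1, -beta, -alpha);
        if (val >= beta)
            return beta;
        if (val > alpha)
            alpha = val;
    }
    return alpha;
}

/* PVS: once a move has raised alpha, try the rest with a zero-width window. */
int pvs(unsigned node, int depth, int alpha, int beta)
{
    int found_pv = 0;
    nodes_visited++;
    if (depth == 0)
        return leaf_score(node);
    for (unsigned i = 0; i < BRANCH; i++) {
        unsigned child = node * BRANCH + i + 1;
        int val;
        if (found_pv) {
            val = -pvs(child, depth - 1, -alpha - 1, -alpha);
            if (val > alpha && val < beta)        /* unexpected PV move */
                val = -pvs(child, depth - 1, -beta, -alpha);
        } else {
            val = -pvs(child, depth - 1, -beta, -alpha);
        }
        if (val >= beta)
            return beta;
        if (val > alpha) {
            alpha = val;
            found_pv = 1;
        }
    }
    return alpha;
}
```

Both functions should return the same score for the same window; only the number of visited nodes differs. On a toy tree with random "move ordering" like this one, PVS is not guaranteed to visit fewer nodes, since the saving depends on the first move usually being the best one.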

4. - 5. I cannot comment on this. I didn't implement LMR yet.

6. According to the snippet above from the Bruce Moreland site, this is correct; a move that scores above alpha and below beta is a PV move, and as this is not expected (or wanted) during a PVS search, it must be re-searched. If a move beats beta, it is a beta cutoff; not a PV move, so it shouldn't be re-searched. If you do, I assume you're wasting time on an unnecessary re-search. It _could_ be that this has to be changed to make LMR work. I don't know yet, because I haven't implemented LMR yet.

===

In short, as far as I can see, the alpha/beta method has no mistakes, but it is less clear than it could have been.

mvanthoor · Post by **mvanthoor** » Fri Sep 24, 2021 10:27 pm

emadsen wrote: ↑Fri Sep 24, 2021 7:56 pm Perhaps some, but I wouldn't expect much.

I actually tested it, and the result is +54 Elo for a version with PVS (with both having a TT).

In my opinion, PVS doesn't make any sense without LMR. The objective is to quickly prove no other move beats the best known move, either from TT.bestMove or from move ordering. We do that through the combination of searching 1) a zero-width window at 2) reduced depth. You can experiment how much to reduce based on quiet move number. Perhaps reduce 1 at quiet move 3, reduce 2 at quiet move 7, reduce 3 at quiet move 13? Experiment with it. Of course, quiet moves must be ordered according to the history heuristic.

To me, PVS makes sense even without LMR. You find your PV-move, and then very quickly try to prove that no other move is a better PV-move. LMR can improve on that, because when re-searching with a full window, you do the search over again; where LMR will thus waste less time.

algerbrex wrote: ↑Fri Sep 24, 2021 9:15 pm Also to address your post about doPVS, my understanding was that we should wait for the first PV Move, not just the first move before we start doing PVS. I've seen some people do it the way you suggest, and I can understand how that would work since hopefully your move ordering is so good that the first 2-3 are the best move, but that's just been my understanding.

Most of the time, the move from the TT is the PV-move and thus the best move, so if you order that first, the PV will be found on the very first move and you can then prove the entire move list to be not-PV... except of course, if it so happens that there is a move that is an even BETTER PV-move (a move that raises alpha even more, but does not equal or beat beta).

emadsen · Post by **emadsen** » Fri Sep 24, 2021 10:40 pm

algerbrex wrote: ↑Fri Sep 24, 2021 9:15 pm Also to address your post about doPVS, my understanding was that we should wait for the first PV Move, not just the first move before we start doing PVS. I've seen some people do it the way you suggest, and I can understand how that would work since hopefully your move ordering is so good that the first 2-3 are the best move, but that's just been my understanding.

If by "we should wait for the first PV Move" you mean wait for score > alpha, that happens rarely. In a PV node (where beta - alpha > 1), the first move is the PV move. There's nothing to wait for. Often an engine will find the same PV line searched at ply x is the PV line at ply x + 1. Or perhaps, when searching to ply 14, the first 10 moves are the same but a small improvement is found in the last 4 moves resulting in a slightly higher score of +10 centi-pawns over ply 13.

On the contrary, finding a score > alpha indicates your PV is wrong. So, in the above example when searching to ply 14, in the PV, a score > alpha was found at ply 11. However, you don't want to wait until ply 11 before doing PVS. You want to do PVS immediately at ply 1, 2, 3, everywhere. The PV at ply 1, 2, 3, ... 14 is likely identical to what was found the previous iteration when searching to ply 13. It's just that you tack one more move on at the end.

Occasionally, a tactic is found when searching to ply x + 1 that was not visible when searching to ply x (due to the horizon effect and LMR) that refutes the entire PV. In this case, the opponent's tactical move causes a beta cutoff deep in the tree (the tactic is many moves out from the root, close to the horizon). This causes your move to fail low at the previous ply, which causes the opponent's move to fail high at ply x - 2, etc. This cascades all the way back to the root. What happens at the root depends whether you've implemented aspiration windows or not. With aspiration windows, your best move from the previous iteration may fail low and the aspiration window must be widened. With a -inf to +inf alpha beta root window, your best move from the previous iteration will be assigned a lower score than expected. Hopefully you find a better root move. If not, it may be too late and your engine now has to ward off your opponent's attack.
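
The widening step for the aspiration-window case could look roughly like this. It is only a sketch: `root_search` is a stand-in that pretends the true score of the new iteration is already known (`true_score`), and the doubling of `delta` is one common choice, not something taken from any engine discussed here.

```c
#define INF 1000000

static int true_score = 37; /* pretend outcome of the new iteration */
static int searches;        /* how many root searches were needed */

/* Stand-in for a fail-hard root search: clamps the true score to the window. */
static int root_search(int alpha, int beta)
{
    searches++;
    if (true_score <= alpha)
        return alpha;
    if (true_score >= beta)
        return beta;
    return true_score;
}

/* Search around the previous iteration's score and widen the failing side
 * until the result falls strictly inside the window. */
int aspiration_search(int prev_score, int delta)
{
    int alpha = prev_score - delta;
    int beta = prev_score + delta;
    for (;;) {
        int score = root_search(alpha, beta);
        if (score > alpha && score < beta)
            return score;
        delta *= 2;
        if (score <= alpha)                       /* fail low: lower alpha */
            alpha = (alpha - delta < -INF) ? -INF : alpha - delta;
        else                                      /* fail high: raise beta */
            beta = (beta + delta > INF) ? INF : beta + delta;
    }
}
```

With a previous score of 0, a starting `delta` of 25 and a true score of 37, the first search fails high and the second one, with beta raised to 75, resolves the score.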

To be confident the opponent's tactic actually works, in PVS, you want to re-search zero-window, reduced late moves that score > alpha. You want to search to full depth to ensure you don't actually have a defense and the opponent's move really is a threat. If you blindly trust the reduced-depth beta cutoff you could wreck your PV (it's costly to re-search many nodes in the tree that now are refuted) for what is really a phantom threat.
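
A minimal sketch of that re-search logic on a toy tree, under several assumptions: the tree, the hash-based `evaluate`, the `use_lmr` toggle and the one-ply `reduction` rule are invented for the example, and "late move" simply means move index 2 or higher rather than a quiet move ordered by history.

```c
#define BRANCH 4
#define INF    1000000

/* Toggle for experiments: 0 = plain PVS, 1 = PVS with late move reductions. */
static int use_lmr = 1;

/* Toy static evaluation in [-100, 100], derived from the node number. */
static int evaluate(unsigned node)
{
    unsigned h = node * 2654435761u;
    h ^= h >> 15;
    return (int)(h % 201u) - 100;
}

/* Hypothetical rule: reduce by one ply from the third move on, if depth allows. */
static int reduction(int move_index, int depth)
{
    return (use_lmr && move_index >= 2 && depth >= 3) ? 1 : 0;
}

/* Reference: plain fail-hard alpha-beta without reductions. */
int plain_alpha_beta(unsigned node, int depth, int alpha, int beta)
{
    if (depth <= 0)
        return evaluate(node);
    for (unsigned i = 0; i < BRANCH; i++) {
        int val = -plain_alpha_beta(node * BRANCH + i + 1, depth - 1, -beta, -alpha);
        if (val >= beta)
            return beta;
        if (val > alpha)
            alpha = val;
    }
    return alpha;
}

int pvs_lmr(unsigned node, int depth, int alpha, int beta)
{
    if (depth <= 0)
        return evaluate(node);
    for (unsigned i = 0; i < BRANCH; i++) {
        unsigned child = node * BRANCH + i + 1;
        int val;
        if (i == 0) {
            /* First move: full window, full depth. */
            val = -pvs_lmr(child, depth - 1, -beta, -alpha);
        } else {
            int r = reduction((int)i, depth);
            /* Late move: zero-width window, possibly at reduced depth. */
            val = -pvs_lmr(child, depth - 1 - r, -alpha - 1, -alpha);
            /* Beat alpha at reduced depth: verify at full depth first. */
            if (r > 0 && val > alpha)
                val = -pvs_lmr(child, depth - 1, -alpha - 1, -alpha);
            /* Still inside the window: re-search with the full window. */
            if (val > alpha && val < beta)
                val = -pvs_lmr(child, depth - 1, -beta, -alpha);
        }
        if (val >= beta)
            return beta;
        if (val > alpha)
            alpha = val;
    }
    return alpha;
}
```

With `use_lmr = 0` this should return exactly what `plain_alpha_beta` returns. With `use_lmr = 1` the score can differ, because a reduced search that fails low is trusted without verification; that is the "blindness" being traded for speed.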

emadsen · Post by **emadsen** » Fri Sep 24, 2021 11:10 pm

mvanthoor wrote: ↑Fri Sep 24, 2021 10:27 pm To me, PVS makes sense even without LMR. You find your PV-move, and then very quickly try to prove that no other move is a better PV-move.

In my experience, reducing search depth is a much greater contributor to playing strength than reducing the alpha / beta window. In an all-node, we expect legalMove 2, 3, 4, etc all to fail low. So who cares if beta was alpha + 100 and now is alpha + 1?

Code: Select all

Ply 1: expect fail low,  search  100 to  200, actual score 100
Ply 2: expect fail high, search -200 to -100, actual score -100
Ply 3: expect fail low,  search  100 to  200, actual score 100
etc

In the expected case, it doesn't matter (we never consult the beta = 200 value). In the unexpected case (the PV is wrong), then yes, the width of the window matters. The search will attempt to resolve an exact score within a wide window (perhaps it finds a score of 125). This takes more time than simply allowing zero-window beta cutoffs to cascade back up to the weak fail-low move.

So a zero-window helps. But LMR helps more. The numerous fail-low nodes in the tree fail low quicker.

mvanthoor wrote: ↑Fri Sep 24, 2021 10:27 pm LMR can improve on that, because when re-searching with a full window, you do the search over again; where LMR will thus waste less time.

No, that's not it. The value proposition of PVS + LMR is that the benefit of more quickly proving late moves fail low (due to shallower searches) outweighs the cost of 1) re-searches with a full window to full depth and 2) blindness introduced by the shallower searches.

algerbrex · Post by **algerbrex** » Sat Sep 25, 2021 4:27 pm

mvanthoor wrote: ↑Fri Sep 24, 2021 10:21 pm ...

As a side note, although I'm still playing around with PVS, the bug fixes I've made to the TT and in some other places, thanks to you and Mergi have given Blunder quite a boost in playing strength.

The current dev version of Blunder, with the same level of features as Blunder 5.0.0, plus the bug fixes and no pruning, seems to be only slightly worse, even though Blunder 5.0.0 includes null-move pruning. The results so far from a 5000 game test I'm running look promising.

Code: Select all

Score of Blunder-dev vs Blunder 5.0.0: 1127 - 1162 - 561  [0.494] 2850
...      Blunder-dev playing White: 582 - 564 - 279  [0.506] 1425
...      Blunder-dev playing Black: 545 - 598 - 282  [0.481] 1425
...      White vs Black: 1180 - 1109 - 561  [0.512] 2850
Elo difference: -4.3 +/- 11.4, LOS: 23.2 %, DrawRatio: 19.7 %

Of course, it isn't done yet, but it seems to be a decent indicator that Blunder was missing out on a good deal of strength due to bugs.

I'm excited to see how Blunder will perform now as I slowly add back in some of the other features. And I'm especially interested to see how it'll perform with CCRL testing, since in all the time it has been tested it never had the opportunity to use an actual 256 MB hash table due to the UCI bug I mentioned earlier. So it's really been playing at a disadvantage, particularly at longer time controls.

mvanthoor · Post by **mvanthoor** » Sat Sep 25, 2021 5:02 pm

algerbrex wrote: ↑Sat Sep 25, 2021 4:27 pm As a side note, although I'm still playing around with PVS, the bug fixes I've made to the TT and in some other places, thanks to you and Mergi have given Blunder quite a boost in playing strength.

Good to hear.

edit: I misread that at first. I thought "dev version worse than 5.0.0, and that's good?" Then I saw that your dev version doesn't include null move pruning, and Blunder 5.0.0 does. So you're doing the bug-fixing on a stripped Blunder 5, reaching Blunder 5's strength (+/- 11 Elo), without some of the features Blunder 5.0.0 already has?

That's great. If you then put back null move pruning, you could easily add another 100 Elo, which would put you on par with Blunder 6.0.0, _without_ having some of Blunder 6's features.

Looking through your code uncovered the missing minus sign in my own code. That fix alone added +20 Elo in self-play, but against other engines, it added up to +27 Elo. That's a nice boost for just adding one character.

Some time ago Rustic participated in the tournament ZaTour, run by the creator of Zahak. That was Rustic Alpha 3.1.112, which was one of the first versions that had a tapered evaluation. (Alpha 3.0.0 doesn't; 3.1.100 is the dev version, and it had 12 commits.) I estimate the current Rustic-dev, version 3.15.100, to be about 70-100 Elo stronger than the ZaTour version, without adding a single feature. It's all code cleanup, refactoring, a tweak in the TT, a tweak in the time management, and the missing minus sign.

This weekend I'll run a match between Rustic 3.1.112 and the dev version to see what the difference is. Rustic-dev is now performing at 2150 - 2210 Elo, depending on the engine it's playing against. If Rustic 4 could hit 2165 on CCRL Blitz, it would improve by 300 points, just by adding a tapered evaluation (~200-250 points) and code cleanup / refactoring / fixing (50-70 points). If paired against the "right" engines (i.e., not paired against engines that specifically exploit gaps in Rustic's current PST-only evaluation), it could even hit 2200 on CCRL Blitz.

Rustic 4 is not yet going to have any prunings, LMR, SEE, or whatever. It's still just the basics + tapered PST's. Originally, I had hoped to hit 2000 Elo with that, but if all goes well, I'm already far beyond that.

Currently I'm testing the last "fix" I wrote in June. (Doesn't seem to make a difference.) If that's done I'll retest Mergi's earlier mate adjustments using some positions, and a 2000 game match, and if it gives lower node counts and at least loses no Elo, I'll merge that. Next step will be to write a tuner to create my own PST's (again) based on a bigger dataset, and see if I can squeeze a few more Elo out of the dev-version before releasing Rustic 4.

===

PS: In some places, Blunder looks remarkably like Rustic. I doubted for some time between Go and Rust, and chose the latter after I found out Go used a garbage collector and had no generics, which I wanted to use for the TT (so I could use it for search, perft, and pawn-hash with different data structures).

I don't mind people using code from Rustic, but maybe I should add a clause to the license: "Don't port the engine to Go."

In some spots you already seem halfway there

Not that I care, btw, or I wouldn't have a website explaining how to do it... (which, granted, does need to have a massive update...)

mvanthoor · Post by **mvanthoor** » Sat Sep 25, 2021 5:34 pm

algerbrex wrote: ↑Sat Sep 25, 2021 4:27 pm Of course, it isn't done yet, but it seems to be a decent indicator that Blunder was missing out on a good deal of strength due to bugs.

Personally I don't mind the following things for engine improvement:

- Code refactoring (faster pick_move() function in the dev-version)
- Cleanup (remove redundant code)
- Tweaks (in my case, time management, TT from 4 buckets to 3)
- Resolving inaccuracies (occasionally replacing the wrong entry in the TT)

I hate the fact that Rustic Alpha 2 and 3 missed out on 20-25 Elo because of a missing minus sign though. That isn't even a bug; more like a typo. Here I thought the TT was perfect... but it wasn't. One inaccuracy, and one bug / mistake. Sometimes I hate writing software. It's too error-prone to ever trust yourself

algerbrex · Post by **algerbrex** » Sat Sep 25, 2021 6:00 pm

mvanthoor wrote: ↑Sat Sep 25, 2021 5:02 pm
algerbrex wrote: ↑Sat Sep 25, 2021 4:27 pm As a side note, although I'm still playing around with PVS, the bug fixes I've made to the TT and in some other places, thanks to you and Mergi have given Blunder quite a boost in playing strength.
Good to hear.

edit: I misread that at first. I thought "dev version worse than 5.0.0, and that's good?" Then I saw that your dev version doesn't include null move pruning, and Blunder 5.0.0 does. So you're doing the bug-fixing on a stripped Blunder 5, reaching Blunder 5's strength (+/- 11 Elo), without some of the features Blunder 5.0.0 already has?

That's great. If you then put back null move pruning, you could easily add another 100 Elo, which would put you on par with Blunder 6.0.0, _without_ having some of Blunder 6's features.

Exactly my thoughts. I was initially disappointed with how Blunder 5.0.0 scored, as I was aiming for at least 2100 Elo, especially with the NMP. So I'm excited to see how things turn out.

mvanthoor wrote: ↑Sat Sep 25, 2021 5:02 pm Looking through your code uncovered the missing minus sign in my own code. That fix alone added +20 Elo in self-play, but against other engines, it added up to +27 Elo. That's a nice boost for just adding one character.

That's awesome to hear

Some time ago Rustic participated in the tournament ZaTour, run by the creator of Zahak. That was Rustic Alpha 3.1.112, which was one of the first versions that had a tapered evaluation. (Alpha 3.0.0 doesn't; 3.1.100 is the dev version, and it had 12 commits.) I estimate the current Rustic-dev, version 3.15.100, to be about 70-100 Elo stronger than the ZaTour version, without adding a single feature. It's all code cleanup, refactoring, a tweak in the TT, a tweak in the time management, and the missing minus sign.

This weekend I'll run a match between Rustic 3.1.112 and the dev version to see what the difference is. Rustic-dev is now performing at 2150 - 2210 Elo, depending on the engine it's playing against. If Rustic 4 could hit 2165 on CCRL Blitz, it would improve by 300 points, just by adding a tapered evaluation (~200-250 points) and code cleanup / refactoring / fixing (50-70 points). If paired against the "right" engines (i.e., not paired against engines that specifically exploit gaps in Rustic's current PST-only evaluation), it could even hit 2200 on CCRL Blitz.

Rustic 4 is not yet going to have any prunings, LMR, SEE, or whatever. It's still just the basics + tapered PST's. Originally, I had hoped to hit 2000 Elo with that, but if all goes well, I'm already far beyond that.

That's awesome to hear, I'm excited to test Rustic 4 personally.

Although I don't mind pruning, in the same vein as you, I'd like to maximize the Elo I can get before trying any significant sort of pruning. My original goal was to hit 2100 Elo without any sort of pruning, and go from there, which might be doable now.

mvanthoor wrote: ↑Sat Sep 25, 2021 5:02 pm Currently I'm testing the last "fix" I wrote in June. (Doesn't seem to make a difference.) If that's done I'll retest Mergi's earlier mate adjustments using some positions, and a 2000 game match, and if it gives lower node counts and at least loses no Elo, I'll merge that. Next step will be to write a tuner to create my own PST's (again) based on a bigger dataset, and see if I can squeeze a few more Elo out of the dev-version before releasing Rustic 4.

Indeed, I need to finish writing a tuner myself. I'd like to tune my own PSQT's before the next major release of Blunder. So once I finish cleaning up this dev version, I'll release 6.1.0 and then later down the line release 7.0.0.

mvanthoor wrote: ↑Sat Sep 25, 2021 5:02 pm PS: In some places, Blunder looks remarkably like Rustic. I doubted for some time between Go and Rust, and chose the latter after I found out Go used a garbage collector and had no generics, which I wanted to use for the TT (so I could use it for search, perft, and pawn-hash with different data structures).

I don't mind people using code from Rustic, but maybe I should add a clause to the license: "Don't port the engine to Go."

In some spots you already seem halfway there

Not that I care, btw, or I wouldn't have a website explaining how to do it... (which, granted, does need to have a massive update...)

Well, they say imitation is the greatest form of flattery

In all seriousness, I've tried to be conscientious about making sure my code was by and large original, especially when completely rewriting the engine between versions 4 and 5, and looking back over the code I'd like to think I've mostly done that. Of course, as I've mentioned before I've been inspired by a lot of different engine authors, so I know that it's inevitable that some places still closely resemble the structure of code found in other engines.

Looking back over Blunder's code now, I realize that I saw how you implemented MVV LVA a couple of months ago, and nicked the idea for Blunder, so I'll make sure to mention that in the next release.

If there's anywhere else where you feel I've copied Rustic too closely I'd be happy to talk about them. They might not bother you, but I think they may bother me

mvanthoor · Post by **mvanthoor** » Sat Sep 25, 2021 8:25 pm

algerbrex wrote: ↑Sat Sep 25, 2021 6:00 pm That's awesome to hear, I'm excited to test Rustic 4 personally.

Thanks

I have three goals for Rustic:

1. Be as fast as possible
2. Be as strong as possible for every feature it has (= squeeze every feature for as much Elo as possible)
3. Document how Rustic is built and write a chess programming book / website.

The reason for 1 is quite clear; faster = stronger for a chess engine. The reason for 2 is that I have encountered _waaay_ too many engines that tout a huge feature list, while still being under 2500 Elo. I've been hanging around (chess) computers long enough to know what should be possible, so I set myself the goal of obtaining as much Elo per feature as possible. Goal 3 is because I think too much information is too sketchy; many sources build on other sources, often without rigorous testing, and they often copy incomplete or incorrect information.

Points 2 and 3 cost a lot of time. Testing, testing and more testing; often even implementing one thing twice or thrice, seeing which option is the fastest and thus the strongest, or testing two or more implementations of a feature to see which one of my sources is actually correct (or, at least, gives the most playing strength while still making sense).

Rustic has been in development for two years (with some pauses in between). I can understand people beside me losing interest in the engine, but I can at least say that it _is_ very strong for the featureset it has, and the beginnings of the website have been made. I created this as a _very_ long term hobby project. (At some point there will probably be something like Picochess, and/or a GUI, using parts of Rustic as a backend, but that is quite far into the future.)

Indeed, I need to finish writing a tuner myself. I'd like to tune my own PSQT's before the next major release of Blunder. So once I finish cleaning up this dev version, I'll release 6.1.0 and then later down the line release 7.0.0.

Good. I intend to do the first tuning on Zurichess' quit.epd (large) version, and see if this improves or at least equals the current playing strength. (It should; I'm now using MinimalChess' tables, which were tuned on the small dataset provided by Zurichess.) After that, I think I'll try and generate my own data (self-play mode or something, or having Rustic dump quiet positions during play) to tune on an even bigger data set provided by Rustic itself.

Well, they say imitation is the greatest form of flattery

In all seriousness, I've tried to be conscientious about making sure my code was by and large original, especially when completely rewriting the engine between versions 4 and 5, and looking back over the code I'd like to think I've mostly done that. Of course, as I've mentioned before I've been inspired by a lot of different engine authors, so I know that it's inevitable that some places still closely resemble the structure of code found in other engines.

Looking back over Blunder's code now, I realize that I saw how you implemented MVV LVA a couple of months ago, and nicked the idea for Blunder, so I'll make sure to mention that in the next release.

Nah. Don't bother. It's a very common way to implement MVV-LVA. I've read tutorials that do it in a similar way, MaksimKorzh mentioned it in his videos, BlueFever uses a similar version in VICE; I just adapted it to make the MVV-LVA values smaller, and took care that capturing the king or landing on an empty square would be 0 adjustment; that way, you don't have to check for the king or the empty square, which makes the MVV-LVA routine a tiny bit faster.
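
A table of that kind could look roughly like the sketch below. The piece ordering, the names and the concrete values are illustrative assumptions, not a copy of any engine's table; the point is only that the king row and the "no victim" row are all zeros, so the lookup needs no special cases.

```c
/* Piece indices (hypothetical ordering for this sketch). */
enum { KING, QUEEN, ROOK, BISHOP, KNIGHT, PAWN, NONE, PIECE_KINDS };

/* MVV_LVA[victim][attacker]: a higher value means "try this capture earlier".
 * Most valuable victim first, then least valuable attacker. */
static const int MVV_LVA[PIECE_KINDS][PIECE_KINDS] = {
    /* attacker:      K   Q   R   B   N   P  none */
    /* victim K  */ {  0,  0,  0,  0,  0,  0,  0 },
    /* victim Q  */ { 50, 51, 52, 53, 54, 55,  0 },
    /* victim R  */ { 40, 41, 42, 43, 44, 45,  0 },
    /* victim B  */ { 30, 31, 32, 33, 34, 35,  0 },
    /* victim N  */ { 20, 21, 22, 23, 24, 25,  0 },
    /* victim P  */ { 10, 11, 12, 13, 14, 15,  0 },
    /* no victim */ {  0,  0,  0,  0,  0,  0,  0 },
};

/* Move-ordering bonus for a move; quiet moves (victim == NONE) get 0
 * straight from the table, without an extra branch. */
int mvv_lva_score(int victim, int attacker)
{
    return MVV_LVA[victim][attacker];
}
```

For example, `mvv_lva_score(QUEEN, PAWN)` is the highest entry, and `mvv_lva_score(NONE, QUEEN)` is 0 for a quiet queen move.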

algerbrex wrote: ↑Sat Sep 25, 2021 6:00 pm If there's anywhere else where you feel I've copied Rustic too closely I'd be happy to talk about them. They might not bother you, but I think they may bother me

Your transposition table looks a lot like mine, minus the generics (so I can use it for perft, and pawn hash data as well). It's inevitable that some code looks like existing code, especially if authors communicate with one another; and there are only so many ways to implement something in a correct and readable way.

"Don't port Rustic to Go" was more of a tongue-in-cheek comment than anything else. I wouldn't mind if someone actually did that.

PS: I actually installed Go for the first time in my life. Maybe it'll come in handy somewhere in the future. I actually like Blunder and MinimalChess, and some of the other newer engines as well, in which I had a bit of influence in the development such as Loki and Zahak. The newer engines are often easier to compile and use than older engines of the same rating; many of the weaker engines in the CCRL list are old and no longer in development.
