Can principal variation search be worth no Elo?

mvanthoor · Post by **mvanthoor** » Fri Oct 15, 2021 1:15 am

algerbrex wrote: ↑Fri Oct 15, 2021 1:02 am I agree

And yep that’s right. I basically copied your code, and since it had no pruning I removed that too…which may have contributed to the problem, time will tell after I start the debugging. I’m mostly just glad I have something I can work with now

So you're indeed going to transform the code back to what it was before, step by step to see where the mistake was, just as you said? Cool... most people wouldn't bother. Glad I could help (indirectly). I'm interested to know where the bug was, because I've read your code several times, and didn't see any strange things. (Just some implementation differences.)

By the way...

I used Blunder's code to fix a bug / typo (the missing minus sign for CHECKMATE), which gained ~20 Elo.
You used Rustic's code to get PVS working correctly, which gained you ~55 Elo.

You still owe me 35 Elo. Somehow.

(Sorry. Couldn't resist

)

mvanthoor · Post by **mvanthoor** » Fri Oct 15, 2021 1:28 am

algerbrex wrote: ↑Fri Oct 15, 2021 1:00 am On the plus side, I know I’m not crazy and PVS works and gives around 50-60 Elo, so at this point all I (theoretically) have to do is slowly change your code back to mine and see at what point I introduce a bug. I’ll probably start tonight and finish sometime this weekend since I’ll be testing every change.

Note that my alpha-beta is an (extended) version of the fail-soft version here:

https://www.ics.uci.edu/~eppstein/180a/990202b.html

The difference between fail-hard and fail-soft is this:

- with fail-hard, you return the actual alpha and beta scores.
- with fail-soft, you return the best_score. That is the reason why I don't do "return beta" in the bèta cutoff, but do a "break" to jump directly after the move loop. There, I don't do "return alpha", but just "return best_score".

If used on its very own (without PVS or pruning), fail-soft will get you more precise alpha/beta values in the TT, which make a (very small) difference in playing strength because you have a few more cuts. When you use PVS, that difference all but becomes unmeasurable.

The reason why I switched to this implementation is because it's more clear to me, because:

1. you don't need to return two different values; you just return "best_score" at the end of the function.
2. you don't need to have a hash-write just before "return beta"; you just set the flag to beta, and use the hash write directly after the move loop.
3. because of 2, you can just set the hash flag to "alpha" when starting out, and set it to "beta" in the beta-cutoff, or to "exact" when you find a PV move.
4. you can save the "best_score", and the current move that goes with it, is "best_move" at that point, which is logical. So you can just write "best_score" and "best_move" as found in the move-loop into the TT; and as said, there is just one TT write point.

On that page, the author shows fail-hard, fail-soft, and PVS as different versions, but you can use PVS with either fail-hard or fail-soft. I saw no difference in playing strength between fail-hard with PVS and fail-soft with PVS, as expected. The difference is the concept and how the code looks. As said, I find it clearer than the "normal" way. (Some people also nest the "if score > alpha" and "if score >= beta" if-statements, but I like the separate way as I did in Rustic better.)

Using fail-hard or fail-soft is more a matter of taste than anything else. (But that page I linked above is the clearest explanation I could find about the difference.)

Darn. I could have written this in a neater way and put it into the website / book already.

PS: Their "current" variable is the same as my "best_score".

algerbrex · Post by **algerbrex** » Fri Oct 15, 2021 1:52 am

mvanthoor wrote: ↑Fri Oct 15, 2021 1:15 am So you're indeed going to transform the code back to what it was before, step by step to see where the mistake was, just as you said? Cool... most people wouldn't bother. Glad I could help (indirectly). I'm interested to know where the bug was, because I've read your code several times, and didn't see any strange things. (Just some implementation differences.)

Yup. Rustic has very well-written, clean code, which is why I copied it. But (1) it would bother me not knowing exactly what caused this bug, especially since it's been something I've been pulling my hair out over for the past 2-ish months now. And (2), I'd like for the most part the code in Rustic to be original. Obviously I don't have a problem copying a snippet here or there, but I essentially rewrote your search routine, but in Go, which...is a little bit too much copying for me

But since you obviously did things correctly in your code and I did not, there's likely a part of Rustic's code that I will keep since it fixed this bug.

mvanthoor wrote: ↑Fri Oct 15, 2021 1:15 am By the way...

I used Blunder's code to fix a bug / typo (the missing minus sign for CHECKMATE), which gained ~20 Elo.
You used Rustic's code to get PVS working correctly, which gained you ~55 Elo.

You still owe me 35 Elo. Somehow.

(Sorry. Couldn't resist )

Well, hold on. I thought you mentioned you might nick my implementation of history heuristics? HH gave me about ~40-ish Elo, so that should more than cover my debits

algerbrex · Post by **algerbrex** » Fri Oct 15, 2021 2:01 am

mvanthoor wrote: ↑Fri Oct 15, 2021 1:28 am
algerbrex wrote: ↑Fri Oct 15, 2021 1:00 am On the plus side, I know I’m not crazy and PVS works and gives around 50-60 Elo, so at this point all I (theoretically) have to do is slowly change your code back to mine and see at what point I introduce a bug. I’ll probably start tonight and finish sometime this weekend since I’ll be testing every change.
Note that my alpha-beta is an (extended) version of the fail-soft version here:

https://www.ics.uci.edu/~eppstein/180a/990202b.html

The difference between fail-hard and fail-soft is this:

- with fail-hard, you return the actual alpha and beta scores.
- with fail-soft, you return the best_score. That is the reason why I don't do "return beta" in the bèta cutoff, but do a "break" to jump directly after the move loop. There, I don't do "return alpha", but just "return best_score".

If used on its very own (without PVS or pruning), fail-soft will get you more precise alpha/beta values in the TT, which make a (very small) difference in playing strength because you have a few more cuts. When you use PVS, that difference all but becomes unmeasurable.

The reason why I switched to this implementation is because it's more clear to me, because:

1. you don't need to return two different values; you just return "best_score" at the end of the function.
2. you don't need to have a hash-write just before "return beta"; you just set the flag to beta, and use the hash write directly after the move loop.
3. because of 2, you can just set the hash flag to "alpha" when starting out, and set it to "beta" in the beta-cutoff, or to "exact" when you find a PV move.
4. you can save the "best_score", and the current move that goes with it, is "best_move" at that point, which is logical. So you can just write "best_score" and "best_move" as found in the move-loop into the TT; and as said, there is just one TT write point.

On that page, the author shows fail-hard, fail-soft, and PVS as different versions, but you can use PVS with either fail-hard or fail-soft. I saw no difference in playing strength between fail-hard with PVS and fail-soft with PVS, as expected. The difference is the concept and how the code looks. As said, I find it clearer than the "normal" way. (Some people also nest the "if score > alpha" and "if score >= beta" if-statements, but I like the separate way as I did in Rustic better.)

Using fail-hard or fail-soft is more a matter of taste than anything else. (But that page I linked above is the clearest explanation I could find about the difference.)

Darn. I could have written this in a neater way and put it into the website / book already.

PS: Their "current" variable is the same as my "best_score".

Hmm, thanks. Fail-soft is something I've been considering switching Blunder over to, and the link you gave explains it pretty well. I'll consider that more after I figure out why PVS wasn't working because I lied slightly, and kept my alpha-beta implementation fail-hard. So that's one thing I don't have to switch back.

mvanthoor · Post by **mvanthoor** » Fri Oct 15, 2021 2:03 am

algerbrex wrote: ↑Fri Oct 15, 2021 1:52 am Yup. Rustic has very well-written, clean code, which is why I copied it.

Thanks. After I write the tuner and release version 4, I'm going to put some time in updating that website. If Rustic's code and documentation can help future engine writers (in whatever language), then one of the major goals of writing this engine has been achieved.

The other two goals are:
- having my own chess engine to use for playing and analysis, eventually even in my own GUI, someday
- having a well-designed, well-documented engine (as per my definitions; opinions may differ here) with clean code that I can just port to another language knowing it will work, if I ever feel the need to write another engine. (Except if I decide to try and use something like Haskell or Lisp; then it probably won't work, but I'd probably rather jump off a bridge first before I use such slow languages to write a chess engine.)

But (1) it would bother me not knowing exactly what caused this bug, especially since it's been something I've been pulling my hair out over for the past 2-ish months now. And (2), I'd like for the most part the code in Rustic to be original. Obviously I don't have a problem copying a snippet here or there, but I essentially rewrote your search routine, but in Go, which...is a little bit too much copying for me But since you obviously did things correctly in your code and I did not, there's likely a part of Rustic's code that I will keep since it fixed this bug.

I also hate the "darn, it now works, but I don't know why" moments. And obviously, after you fix the bug, you are either NOT doing something you're doing now, or the other way around... which will make your code more similar to mine. (But Rustic's alpha/beta function function isn't something special either, obviously. This algorithm has been known for something like 60 years or so.)

Well, hold on. I thought you mentioned you might nick my implementation of history heuristics? HH gave me about ~40-ish Elo, so that should more than cover my debits

Oh, yeah. Right. I'll look into that for version 6. (Where 5 will probably be the null-move version.)

algerbrex · Post by **algerbrex** » Fri Oct 15, 2021 2:48 am

mvanthoor wrote: ↑Thu Oct 14, 2021 10:50 pm So no such thing. It doesn't matter if you skip the search in the move loop as I do, or exit the function in the beginning as you do.

Well, don't speak too soon

I decide to change the way I detected draws by moving the code back to the top of my negamax function, instead of in the move loop like you do it. And I won't speak too soon, but that might've just been the issue...

Edit: Eh, nevermind.

algerbrex · Post by **algerbrex** » Fri Oct 15, 2021 4:31 pm

Update.

The results are a tad preliminary right now, but it looks like the bug was somewhere in the code where I scored my moves. When I copy and paste in my code for scoring moves, PVS becomes a loss. When I replace my code with the code I copied from Rustic, PVS becomes a win again.

From a quick test of thousand with my move scoring code (tc=8+0.08s):

Code: Select all

Score of Blunder_pvs vs Blunder_no_pvs: 359 - 448 - 193  [0.456] 1000
...      Blunder_pvs playing White: 189 - 216 - 95  [0.473] 500
...      Blunder_pvs playing Black: 170 - 232 - 98  [0.438] 500
...      White vs Black: 421 - 386 - 193  [0.517] 1000
Elo difference: -31.0 +/- 19.4, LOS: 0.1 %, DrawRatio: 19.3 %

From another test still ongoing, same time control and exact same code, but using Rustic's move scoring algorithm:

Code: Select all

Score of Blunder_pvs vs Blunder_no_pvs: 268 - 142 - 90  [0.626] 500
...      Blunder_pvs playing White: 129 - 76 - 45  [0.606] 250
...      Blunder_pvs playing Black: 139 - 66 - 45  [0.646] 250
...      White vs Black: 195 - 215 - 90  [0.480] 500
Elo difference: 89.5 +/- 28.3, LOS: 100.0 %, DrawRatio: 18.0 %

Although the second test hasn't finished, I think it's fair to say that it's clear the version with Rustic's move scoring code and PVS is a win. So now the challenge will be to figure out where I went wrong scoring moves in my original code, even though I'm sure I spent a good bit of time scanning over my scoring code weeks ago

At least the location of the bug makes sense though. Since PVS is highly dependent on good move ordering, buggy move ordering of course would mean PVS would waste time doing expensive researches and overall be poorer than just plain alpha-beta.

I'll be running at least two more 2000 game tests for each version to be positive move scoring was the issue.

emadsen · Post by **emadsen** » Fri Oct 15, 2021 5:54 pm

algerbrex wrote: ↑Fri Oct 15, 2021 4:31 pm The results are a tad preliminary right now, but it looks like the bug was somewhere in the code where I scored my moves. When I copy and paste in my code for scoring moves, PVS becomes a loss. When I replace my code with the code I copied from Rustic, PVS becomes a win again.

Yeah, I thought your move scoring technique was suspect.

emadsen wrote: ↑Thu Sep 30, 2021 2:09 am
algerbrex wrote: ↑Thu Sep 30, 2021 1:33 am The latter method since it's more efficient. And currently, I score my moves by making sure the TT move always has the highest score, then the MVV-LVA values, and then finally the killer scores are always below the MVV-LVA values. I think the exact values are something like 60, 56-11, and 10.
Is 60 enough range to ensure proper sort order of moves? Is there a reason you're calculating an integer value for move priority versus using bit-shifting? Bit-shifting seems less prone to move priority bugs.

Hopefully, you've found the bug.

I suggest you write a ListMoves method that displays legal moves, along with their relevant properties, in the order in which they'll be searched. Will help you when you implement LMR and history heuristics. Gives you a quick sanity check to verify your engine is searching quality moves early.

mvanthoor · Post by **mvanthoor** » Fri Oct 15, 2021 6:41 pm

algerbrex wrote: ↑Fri Oct 15, 2021 4:31 pm Update.

...

Although the second test hasn't finished, I think it's fair to say that it's clear the version with Rustic's move scoring code and PVS is a win. So now the challenge will be to figure out where I went wrong scoring moves in my original code, even though I'm sure I spent a good bit of time scanning over my scoring code weeks ago

Good to know you've narrowed down the bug to a single function

Rustic scores it's moves over (almost) the full range of an unsigned 16 bit int, so there should be enough room for heuristics to score below the killers. (That is what the MVV_LVA_OFFSET does: it shifts all the MVV_LVA values upwards, scores the TT-move right above that, and the killers right below.)

algerbrex · Post by **algerbrex** » Fri Oct 15, 2021 8:13 pm

emadsen wrote: ↑Fri Oct 15, 2021 5:54 pm
algerbrex wrote: ↑Fri Oct 15, 2021 4:31 pm The results are a tad preliminary right now, but it looks like the bug was somewhere in the code where I scored my moves. When I copy and paste in my code for scoring moves, PVS becomes a loss. When I replace my code with the code I copied from Rustic, PVS becomes a win again.
Yeah, I thought your move scoring technique was suspect.

emadsen wrote: ↑Thu Sep 30, 2021 2:09 am
algerbrex wrote: ↑Thu Sep 30, 2021 1:33 am The latter method since it's more efficient. And currently, I score my moves by making sure the TT move always has the highest score, then the MVV-LVA values, and then finally the killer scores are always below the MVV-LVA values. I think the exact values are something like 60, 56-11, and 10.
Is 60 enough range to ensure proper sort order of moves? Is there a reason you're calculating an integer value for move priority versus using bit-shifting? Bit-shifting seems less prone to move priority bugs.
Hopefully, you've found the bug.

I suggest you write a ListMoves method that displays legal moves, along with their relevant properties, in the order in which they'll be searched. Will help you when you implement LMR and history heuristics. Gives you a quick sanity check to verify your engine is searching quality moves early.

Thanks, Erik, apparently you were right. I'll look into implementing some sort of method to make sure my move scoring is working.

P.S: I'm still open to using bit-shifting if its more robust, but I'm not at all familiar with it, so I'd need to look into that.

Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?

Re: Can principal variation search be worth no Elo?