Progress on Blunder
Moderator: Ras
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Progress on Blunder
For the purposes of documentation, release announcements, and progress reports, I've started a thread here for Blunder. I'll also use this to post general updates and ideas I'm currently experimenting with. There may be a tad bit of rambling here or there if I get particularly focused on a certain idea, but I promise I'll keep that to a minimum.
-
mvanthoor
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: Progress on Blunder
Blunder is already over 2400. You're done already.
(Now I'll have to find out what features Blunder has and try to reach 2400 with one fewer. Could you do me a favor and implement some useless stuff?) WTH did you implement between versions 6.1.0 and 7.1.0 for a 275-point Elo increase?
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Progress on Blunder
Not quite. I at least have to hit 2600 first.
Well, if you look through my recent commits on Blunder's development branch, you'll see that, after further testing, I'm already starting to adopt the strategy of adding in some pretty useless features.
For example, I recently thought I'd gotten a nice increase over version 7.2.0 by adding king-safety and pawn-structure terms and retuning all of Blunder's evaluation parameters, but regression testing shows it's maybe ~15 Elo stronger. I'm keeping it in for now, though, since it helps make the play a little more aesthetically pleasing.
Hmm, if I remember correctly:
- History Heuristics
- Principal Variation Search
- Tuned mobility
- Late-move reductions
- SEE pruning in search
- Futility pruning
Mobility was interesting. The approach I've seen most people take is to count the number of "safe" squares a piece can move to, where "safe" means not attacked by enemy pawns. But I took a simpler approach and just counted the number of legal moves a piece had, and subtracted the average number of moves that piece usually has. For example:
Code:
usBB := pos.SideBB[color]
allBB := pos.SideBB[pos.SideToMove] | pos.SideBB[pos.SideToMove^1]
moves := genRookMoves(sq, allBB) & ^usBB
mobility := int16(moves.CountBits())
eval.MGScores[color] += (mobility - 7) * PieceMobilityMG[Rook-1]
eval.EGScores[color] += (mobility - 7) * PieceMobilityEG[Rook-1]

The first time I tested this idea, it was a 30-40 Elo loss, until I ran it through the tuner, which flipped it around to a ~30 Elo gain. That was very nice, but it also made me realize how bad I'm getting at manually tuning eval parameters.
The next idea I'll try at some point is to only count safe mobility and retune Blunder's eval parameters to see if that's a win.
Everything else was pretty straightforward to implement and got pretty nice gains, except for late-move reductions. I eventually got ~62 Elo from LMR, but I probably tested at least 6 different configurations before getting there. I eventually took Jost's advice: instead of just sticking to the standard way of combining LMR and PVS, I experimented with my own approach to see what worked.
So all those features should definitely be doable for Rustic in the next couple of months. Right now I've been experimenting with a couple of new features, like TT buckets (4 buckets), SEE move ordering to replace MVV-LVA, and then maybe some other form of pruning in Blunder's search.
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Progress on Blunder
An update.
The past couple of weeks I've tried a couple of different ideas to improve Blunder, mostly related to evaluation. Although I'm still not happy with my king-safety approach, it seemed to be a positive gain overall, so I decided to keep it in and continue improving it through tuning and tinkering, per some advice I remember Amanj giving me. Other than that, Blunder now has a basic understanding of pawn structure (doubled and isolated pawns), so the play looks a bit better, but it was only worth a couple of Elo in self-play, so probably not too much in gauntlet play either.
Having worked on the evaluation, I decided to shift things over to improving the search, since I'm quite positive my current pruning and reduction constants could be improved. I'm basically going through each pruning technique and tinkering with it until I get a positive Elo gain. So far after doing some research I was able to improve the formula for calculating null-move reductions, which resulted in 23 Elo in self-play and seemed to give pretty positive results in gauntlet testing, combined with my other aforementioned changes.
Instead of using the approach of R = 2 for depth <= 5 and R = 3 above that, I switched to the formula R = 3 + depth / 6, which I came across another user mentioning on here. I opted for 6 as the divisor since 4 seemed a bit too greedy and 8 too conservative.
I'm now looking at getting more aggressive with my futility pruning and using a more dynamic margin. After a little testing, the current formula I settled on is MARGIN = 110 * depth, which so far seems pretty positive in terms of self-play gain. I'll have to see how it holds up in gauntlet testing.
Next up on the list will be tuning late-move reductions, and trying static-exchange-evaluation in the main search. I also looked over the Stockfish source code to get ideas for new things to try in Blunder and came across their idea of the "improving" flag that's used in pruning. I believe I've encountered it before, and it seems like an interesting idea, so I'll put it on my mental list of features to play with in the future.
Overall I'm pretty happy with the direction development seems to be heading, and hopefully I'll be able to release Blunder 7.3.0 in the next couple of weeks with a pretty nice improvement over 7.2.0.
Something I do find a little interesting is that the more Blunder's pruning improves, the lower it scores on the WAC test suite: the current score for the dev version is 257, while for 7.0.0 it was 279. I'm not really worried about this, though, since making the pruning more aggressive of course increases the risk of missing certain ideas, but overall improves an engine's playing strength, which seems to be the case with Blunder.
Last edited by algerbrex on Sat Dec 04, 2021 6:11 pm, edited 1 time in total.
-
jdart
- Posts: 4413
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Progress on Blunder
WAC is very easy for modern engines. You should be getting over 290 correct on that at 10 seconds/move.
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Progress on Blunder
Well, I am only using 3 seconds per move, so that may be why the score is so low. I'll try re-running the test again with 10 seconds per move and see how Blunder fares.
-
jdart
- Posts: 4413
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Progress on Blunder
Be sure also to use the version with known corrections applied, available here (among other places): https://arasanchess.org/tests.zip.
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Progress on Blunder
jdart wrote: ↑Sat Dec 04, 2021 6:30 pm
Be sure also to use the version with known corrections applied, available here (among other places): https://arasanchess.org/tests.zip.

Thanks, I had no clue there were corrections. I'll replace my current WAC suite with the one you provided and try re-running the test at 10 seconds per move.
-
algerbrex
- Posts: 608
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Progress on Blunder
Blunder 7.3.0 has been published. Details of the release can be seen here: https://github.com/algerbrex/blunder/re ... tag/v7.3.0
For Blunder 7.4.0, I'll be experimenting with non-linear time management, more eval terms, and a better method for calculating late-move reductions. I'll also probably look into improving the search via extensions of some kind or more pruning techniques. I remember I tried implementing multi-cut for Blunder 7.1.0 but never got it working to the level I wanted. I might revisit it.
I'll also begin researching neural networks. I'm not near the point by any means where I'd like to stick a NN inside of Blunder, but I figure that while I'm still working on the HCE, I can improve my understanding of neural networks.
-
mvanthoor
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: Progress on Blunder
algerbrex wrote: ↑Mon Dec 06, 2021 2:29 am
Blunder 7.3.0 has been published. Details of the release can be seen here: https://github.com/algerbrex/blunder/re ... tag/v7.3.0

Congrats!
algerbrex wrote:
I'll also begin researching neural networks. I'm not near the point by any means where I'd like to stick a NN inside of Blunder, but I figure while I'm waiting on working on the HCE, I can improve my understanding of neural networks.

Understanding neural networks, at least the "normal" variety, isn't that difficult... the first thing (for me) would be to find out what the difference is between NNUE and a "normal" (matrix-calculation-based) NN.