Progress on Blunder

Discussion of anything and everything relating to chess playing software and machines.

Moderator: Ras

User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Progress on Blunder

Post by algerbrex »

mvanthoor wrote: Fri Dec 17, 2021 12:49 pm But I digress... this is Blunder's progress thread.
No worries. I was enjoying the discussion. This is something I hadn't really even considered.
User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Progress on Blunder

Post by algerbrex »

Since the release of Blunder 7.4.0, I've been making slow but decent progress towards 2600.

Most of the changes so far have come from tweaking and refining features already in Blunder. The current development version should be around 30 Elo stronger than 7.4.0, which my own gauntlet testing at bullet time controls puts at about 2570, and which hopefully lands it close to 2600 on the CCRL. So I'll just wait and see where it falls whenever testers get an opportunity to slip it in.

One of the biggest changes so far has been re-working the criteria Blunder uses to pick its contempt factor in the middlegame versus the endgame. In the early days of Blunder, I remember that simply returning a score of zero when the engine detected a repetition wasn't enough to keep it from drifting into draws. So in the middlegame I returned a -25 cp penalty for drawing, and only returned a true draw score of 0 in the endgame.

The gain here came from better tuning the point at which Blunder decides it has crossed from the middlegame into the endgame. I remember watching several of Blunder's games that reached drawn endgames, where I would've been perfectly happy to see Blunder hold the draw, but the engine was still penalizing itself for drawing. Seeing this, I suspected that with my previous configuration I was punishing Blunder too harshly, causing it to strive too hard for a win in drawn endgames and consequently lose. I seem to have been correct: a small tweak to this code gained Blunder a respectable 10 Elo.
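The phase-dependent draw score described above can be sketched like this. This is a minimal Go sketch of the idea, with illustrative names and thresholds, not Blunder's actual code:

```go
package main

import "fmt"

// Phase-dependent contempt: treat a draw as slightly bad while there is
// still material to play with, but as a true draw once the endgame is
// reached. Values and the phase cutoff are illustrative, not tuned.
const (
	MiddlegameContempt = -25 // centipawn penalty for drawing in the middlegame
	EndgameContempt    = 0   // in the endgame, a draw is just a draw
)

// drawScore is what the search would return on a detected repetition,
// given a 0..256 game-phase estimate (256 = all pieces on the board,
// 0 = bare kings).
func drawScore(phase int) int {
	if phase > 64 { // still enough material to play for a win
		return MiddlegameContempt
	}
	return EndgameContempt
}

func main() {
	fmt.Println(drawScore(200)) // middlegame position
	fmt.Println(drawScore(32))  // endgame position
}
```

Getting the phase cutoff right is exactly the tweak described above: set it too low and the engine keeps penalizing itself for holding dead-drawn endgames.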

As for which ideas I'll try next, I'm honestly not too sure, and I'm really starting to miss the days when Elo gains came in 50, 100, or even 200 point chunks :lol: But on a positive note, I'm taking this as a sign that Blunder is moving into a more mature stage of development. I'll probably add a pawn hash at some point, if not now then in the next version, and I'll try some small tweaks in the search here and there to see if any of them are worth 10-15 Elo. I also plan on experimenting with a more aggressive LMR formula soon, and I'll probably re-tune the evaluation with one of Ethereal's datasets after adding some more positional knowledge, like rooks on semi-open files and the bishop pair.
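For context on the LMR experiments, the usual starting point for an "aggressive" formula is a logarithmic reduction that grows with both remaining depth and the move's index in the ordered move list. Here is a Go sketch of that common shape; the constants are illustrative and would need tuning by self-play, and this is not Blunder's actual formula:

```go
package main

import (
	"fmt"
	"math"
)

// lmrReduction returns how many plies to reduce a late move by.
// The log(depth)*log(moveNumber) shape is the one most engines use;
// the 0.75 and 2.25 constants here are placeholders, not tuned values.
func lmrReduction(depth, moveNumber int) int {
	if depth < 3 || moveNumber < 4 {
		return 0 // don't reduce shallow nodes or the first few moves
	}
	r := 0.75 + math.Log(float64(depth))*math.Log(float64(moveNumber))/2.25
	return int(r)
}

func main() {
	fmt.Println(lmrReduction(12, 20)) // deep node, late move: large reduction
	fmt.Println(lmrReduction(4, 5))   // shallow node, early-ish move: small reduction
}
```

Making the formula "more aggressive" usually means lowering the divisor or the depth/move-count guards, then letting SPRT testing decide whether the extra pruning pays for the occasional missed tactic.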

When looking at long-term goals, I've already come a lot farther than I ever thought I could when I started 6-7 months ago. Honestly speaking, while I'm sure it's very fun to have a 3000+ Elo program, I'd be quite satisfied if I can get Blunder to break 2800, playing around the level of a super-GM. I certainly won't stop developing Blunder when it hits this milestone, but I'll probably also start focusing more on making Blunder more user-friendly. I'm not entirely sure why, as I doubt many players would care to use yet another chess engine when they have plenty of other good options available. But for the few who I hope will use Blunder to analyze their games or play against, I'd like it to be an enjoyable piece of software to work with.

I'm also not sure when I'll finally stop doing Blunder 7.x.x releases and make another major release. Ideally, that would be whenever I start touching neural network stuff. I likely won't start with NNUE, although I will be studying it; I'd like to try working with a simple neural network first, which is what I believe Zurichess started off using...or still uses? Correct me if I'm wrong here.

Anyhow, enough rambling. If I stick to my unofficial pattern from the last few minor releases, I'll release Blunder 7.5.0 after it reaches 50 Elo over 7.4.0, which, if the current pace of development keeps up, will probably be some time right before Christmas. At that point I'll be taking a break from development for a while, since I'll be spending time with family, friends, and my girlfriend, and getting ready for the second semester of my freshman year.
User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Blunder 7.5.0 Released

Post by algerbrex »

Merry Christmas and Happy Holidays everyone!

Blunder 7.5.0 has been released: https://github.com/algerbrex/blunder/re ... tag/v7.5.0

It'll probably be the last version release in a while since I won't be doing much work on it over the holidays, and I'll be quite busy starting school again in the spring.

Not sure who I'm kidding though. I'll probably find some time somehow to keep the addiction going :)
User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Blunder 7.6.0 Released

Post by algerbrex »

Hello everyone! Blunder 7.6.0 has been released: https://github.com/algerbrex/blunder/re ... tag/v7.6.0. Release notes can be found in the given link.

Self-play shows it's about 80 Elo stronger than 7.5.0, which means I should've reached a milestone and crossed the 2600 Elo mark! Hopefully by a pretty good margin.

On a side note, I'm realizing more and more that once your engine gets to around 2400-2500, improvements come mostly from tweaking and perfecting already existing features...which can be a little tedious, to say the least :lol: Where before it may have taken only 500-1500 games to verify that a change gains Elo (with SPRT testing), it's now starting to take 3000-8000 games.
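There's a simple statistical reason the game counts balloon like this: the standard error of a measured Elo difference shrinks only with the square root of the number of games, so resolving a gain half as large takes roughly four times the games. Here is a rough back-of-the-envelope Go sketch (ignoring draws, which only reduce variance, and using ~347/sqrt(n) as the one-sigma Elo error from the logistic model; this is a crude approximation, not the SPRT math itself):

```go
package main

import (
	"fmt"
	"math"
)

// gamesNeeded estimates how many games it takes to resolve an Elo gain
// at roughly two sigma. The 347/sqrt(n) standard-error figure comes from
// converting the binomial error on the score back to Elo near 50%.
func gamesNeeded(eloGain float64) int {
	return int(math.Ceil(math.Pow(2*347/eloGain, 2)))
}

func main() {
	fmt.Println(gamesNeeded(50)) // early, big patches: a few hundred games
	fmt.Println(gamesNeeded(10)) // mature-engine patches: thousands of games
}
```

The numbers this produces (a couple hundred games for a 50 Elo patch, several thousand for a 10 Elo patch) line up with the 500-1500 vs. 3000-8000 ranges above; an actual SPRT run stops adaptively, but scales the same way.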

At some point, I may just need to bite the bullet and build some sort of small testing cluster, since I soon anticipate I'll need to regularly be running a couple of thousand games to verify a search or evaluation patch.
User avatar
Graham Banks
Posts: 45122
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Blunder 7.6.0 Released

Post by Graham Banks »

algerbrex wrote: Thu Dec 30, 2021 8:38 am Hello everyone! Blunder 7.6.0 has been released: https://github.com/algerbrex/blunder/re ... tag/v7.6.0. Release notes can be found in the given link.

Self-play shows it's about 80 Elo stronger than 7.5.0, which means I should've reached a milestone and crossed the 2600 Elo mark! Hopefully by a pretty good margin.

On a side-note, I'm starting to realize more and more that once your engine gets to around 2400-2500, improvements come more from tweaking and perfecting already existing features...which can be a little tedious, to say the least :lol: since before it may have only taken 500-1500 games to verify a change is an Elo gain (with SPRT testing), and now it's starting to take 3000-8000 games.

At some point, I may just need to bite the bullet and build some sort of small testing cluster, since I soon anticipate I'll need to regularly be running a couple of thousand games to verify a search or evaluation patch.
Playing in Division 7 now. :P
gbanksnz at gmail.com
User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Blunder 7.6.0 Released

Post by algerbrex »

Graham Banks wrote: Thu Dec 30, 2021 9:01 am
algerbrex wrote: Thu Dec 30, 2021 8:38 am Hello everyone! Blunder 7.6.0 has been released: https://github.com/algerbrex/blunder/re ... tag/v7.6.0. Release notes can be found in the given link.

Self-play shows it's about 80 Elo stronger than 7.5.0, which means I should've reached a milestone and crossed the 2600 Elo mark! Hopefully by a pretty good margin.

On a side-note, I'm starting to realize more and more that once your engine gets to around 2400-2500, improvements come more from tweaking and perfecting already existing features...which can be a little tedious, to say the least :lol: since before it may have only taken 500-1500 games to verify a change is an Elo gain (with SPRT testing), and now it's starting to take 3000-8000 games.

At some point, I may just need to bite the bullet and build some sort of small testing cluster, since I soon anticipate I'll need to regularly be running a couple of thousand games to verify a search or evaluation patch.
Playing in Division 7 now. :P
Well, thanks. You didn't have to if it was too much trouble! I didn't expect you to go to the trouble of doing *another* version change :)
Last edited by algerbrex on Thu Dec 30, 2021 9:07 am, edited 1 time in total.
User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

A Beautiful Sacrifice

Post by algerbrex »

Since I estimate Blunder's Elo is around 2640, I decided to look for another engine on the CCRL for Blunder to spar against, and found Rodin, an engine I'd heard of before but had never installed.

In the game they just played, Rodin found a beautiful rook sacrifice after Blunder...er....blundered...to get a perpetual check in a lost position! The sacrifice occurs on move 53:

[pgn]
[Event "?"]
[Site "?"]
[Date "2021.12.30"]
[Round "?"]
[White "Rodin 8.00"]
[Black "Blunder dev"]
[Result "1/2-1/2"]
[ECO "C42"]
[GameDuration "00:07:24"]
[GameEndTime "2021-12-30T02:00:22.885 Central Standard Time"]
[GameStartTime "2021-12-30T01:52:58.093 Central Standard Time"]
[Opening "Petrov"]
[PlyCount "173"]
[TimeControl "40/120"]
[Variation "Damiano Variation"]

1. e4 {+0.20/13 3.0s} e5 {-0.23/16 3.9s} 2. Nf3 {+0.20/13 1.7s}
Nf6 {-0.23/15 3.0s} 3. Nxe5 {+0.24/13 2.8s} Nxe4 {-0.18/14 3.0s}
4. Qe2 {+0.56/14 6.3s} Qe7 {-0.73/18 6.5s} 5. Qxe4 {+0.52/13 0.96s}
d6 {-0.57/17 2.9s} 6. d4 {+0.52/13 1.5s} dxe5 {-0.72/16 2.9s}
7. dxe5 {+0.52/13 1.6s} Nd7 {-0.58/15 2.9s} 8. Bf4 {+0.72/13 3.0s}
g5 {-0.59/16 2.9s} 9. Be3 {+0.44/12 1.9s} Nxe5 {-0.33/15 6.3s}
10. Be2 {+0.24/14 8.9s} c6 {-0.14/16 4.7s} 11. h4 {+0.32/13 3.1s}
g4 {-0.21/17 7.7s} 12. Nc3 {+0.32/14 3.8s} Bg7 {-0.21/15 3.3s}
13. O-O-O {+0.32/13 1.4s} O-O {-0.06/15 2.5s} 14. Bg5 {+0.20/13 3.4s}
Qe6 {-0.07/14 3.3s} 15. Kb1 {+0.28/14 14s} f5 {+0.03/16 2.5s}
16. Qb4 {+0.36/13 1.1s} a5 {+0.17/16 2.5s} 17. Qc5 {+0.36/13 0.86s}
f4 {+0.11/15 2.5s} 18. Rhe1 {+0.76/11 0.85s} b6 {+0.10/14 2.5s}
19. Qa3 {+0.64/12 2.9s} f3 {+0.39/13 2.5s} 20. gxf3 {+0.84/12 0.95s}
gxf3 {+0.13/15 2.5s} 21. Bf1 {+1.00/12 1.3s} Qf7 {+0.03/14 3.2s}
22. Ne4 {+1.12/13 2.0s} Be6 {-0.08/12 2.4s} 23. Nd6 {+1.20/13 1.1s}
Qg6 {-0.43/15 4.1s} 24. Qe3 {+1.16/14 4.3s} Bd5 {-0.25/17 3.9s}
25. Be7 {+1.00/13 1.2s} Ng4 {-0.25/17 2.9s} 26. Qa3 {+0.56/14 3.0s}
Nxf2 {+1.20/16 2.2s} 27. Rd2 {+0.64/14 1.9s} Ng4 {+1.25/15 3.7s}
28. Bd3 {+0.72/13 2.6s} Qh5 {+1.64/14 3.4s} 29. Rf1 {+0.01/13 7.1s}
f2 {+2.13/14 2.0s} 30. Be2 {+0.60/13 3.4s} Qg6 {+0.65/15 2.7s}
31. h5 {+0.56/12 1.9s} Qe6 {+1.51/14 2.4s} 32. Bxf8 {-0.20/12 1.8s}
Rxf8 {+0.89/16 2.1s} 33. Bxg4 {-1.04/13 2.1s} Qxg4 {+1.11/17 1.8s}
34. Qe3 {-1.24/13 1.7s} Qb4 {+1.05/16 1.6s} 35. c3 {-1.44/16 6.6s}
Qxd6 {+1.89/13 1.4s} 36. Rdxf2 {-1.44/15 0.25s} Bf7 {+1.88/15 1.9s}
37. Qxb6 {-1.04/11 0.73s} Qd5 {+2.46/12 0.96s} 38. a3 {-2.20/10 0.79s}
a4 {+2.51/14 0.85s} 39. Rxf7 {-2.12/10 0.34s} Qe4+ {+2.43/16 0.74s}
40. Ka1 {-2.08/11 0.19s} Rxf7 {+2.45/16 0.65s} 41. Rg1 {-2.16/14 1.9s}
Qc4 {+2.48/16 4.0s} 42. Qd8+ {-2.20/14 2.0s} Rf8 {+2.17/19 3.1s}
43. Qd1 {-2.20/15 1.9s} Kh8 {+2.35/19 5.2s} 44. Qd6 {-2.12/16 6.2s}
Rf5 {+2.28/18 3.0s} 45. Qd7 {-2.04/15 1.2s} Rf7 {+2.14/19 3.0s}
46. Qd6 {-2.24/16 5.6s} h6 {+2.43/16 3.0s} 47. Qg6 {-2.08/15 1.1s}
c5 {+2.22/19 5.1s} 48. Qg2 {-2.20/15 1.4s} Qb5 {+2.32/17 3.9s}
49. Ka2 {-2.24/15 2.4s} Rb7 {+2.61/19 3.8s} 50. Ka1 {-2.20/16 2.1s}
Qb3 {+2.61/19 3.8s} 51. Qe2 {-2.20/16 1.8s} Rb8 {+2.53/18 3.8s}
52. Qd2 {-2.20/15 3.1s} c4 {+2.57/19 3.7s} 53. Rxg7 {+0.01/20 4.7s}
Kxg7 {+4.00/16 3.7s} 54. Qd4+ {+0.01/20 0.58s} Kf7 {+4.43/16 2.8s}
55. Qd7+ {+0.01/20 1.2s} Kf6 {+4.07/15 2.8s} 56. Qd6+ {+0.01/20 1.4s}
Kf5 {+3.58/16 3.6s} 57. Qd5+ {+0.01/20 1.1s} Kf4 {-0.25/32 4.7s}
58. Qd6+ {+0.01/21 1.3s} Kg4 {-0.25/33 3.5s} 59. Qe6+ {+0.01/21 1.4s}
Kxh5 {-0.25/32 3.4s} 60. Qf5+ {+0.01/20 0.96s} Kh4 {-0.25/36 2.6s}
61. Qf6+ {+0.01/21 0.93s} Kg4 {-0.25/34 4.4s} 62. Qg7+ {+0.01/21 1.5s}
Kf5 {-0.25/29 3.3s} 63. Qf7+ {+0.01/21 1.3s} Ke4 {-0.25/29 3.2s}
64. Qe6+ {+0.01/22 1.3s} Kf3 {-0.25/33 2.4s} 65. Qf5+ {+0.01/22 0.91s}
Kg3 {-0.25/32 2.4s} 66. Qe5+ {+0.01/23 1.2s} Kg2 {-0.25/31 2.4s}
67. Qe4+ {+0.01/21 1.5s} Kh3 {-0.25/35 2.4s} 68. Qe6+ {+0.01/22 0.82s}
Kh2 {-0.25/34 3.1s} 69. Qe2+ {+0.01/22 0.95s} Kh1 {-0.25/36 3.1s}
70. Qf1+ {+0.01/23 0.81s} Kh2 {-0.25/42 2.3s} 71. Qe2+ {+0.01/25 0.90s}
Kg1 {-0.25/40 2.9s} 72. Qg4+ {+0.01/24 1.2s} Kf2 {-0.25/34 2.5s}
73. Qf5+ {+0.01/23 1.4s} Ke3 {-0.25/34 2.2s} 74. Qe6+ {+0.01/23 1.4s}
Kf4 {-0.25/34 1.9s} 75. Qf6+ {+0.01/22 1.2s} Kg3 {-0.25/35 2.2s}
76. Qe5+ {+0.01/25 1.00s} Kf3 {-0.25/35 2.2s} 77. Qf5+ {+0.01/25 0.97s}
Ke2 {-0.25/36 1.1s} 78. Qg4+ {+0.01/22 1.1s} Kd2 {-0.25/35 0.99s}
79. Qd4+ {+0.01/24 1.3s} Kc2 {-0.25/37 0.86s} 80. Qe4+ {+0.01/23 1.1s}
Kd2 {-0.25/38 5.5s} 81. Qd4+ {+0.01/24 1.1s} Kc1 {-0.25/34 3.9s}
82. Qf4+ {+0.01/24 0.92s} Kd1 {-0.25/35 3.0s} 83. Qd4+ {+0.01/24 1.3s}
Ke1 {-0.25/37 3.9s} 84. Qe3+ {+0.01/24 1.3s} Kf1 {-0.25/37 3.9s}
85. Qf3+ {+0.01/24 1.0s} Kg1 {-0.25/39 2.9s} 86. Qg4+ {+0.01/26 1.4s}
Kh2 {-0.25/44 3.8s} 87. Qe2+ {+0.01/26 1.4s, Draw by 3-fold repetition} 1/2-1/2
[/pgn]
User avatar
algerbrex
Posts: 608
Joined: Sun May 30, 2021 5:03 am
Location: United States
Full name: Christian Dean

Re: Progress on Blunder

Post by algerbrex »

A quick update.

I've been very busy with school and life this past month, and have found basically no time to work on Blunder. The few strength-gaining patches I've actually kept amount to ~10-15 Elo, with the biggest change to the codebase actually being the implementation of a feature that breaks down the evaluation into specific categories.

Right now, I'm not entirely sure what the next steps with Blunder will be. I have a list of features to try, which will hopefully gain another 50-60 Elo for another version release. Or I'll re-write/refactor Blunder from the ground up, and see what I can come up with. Right now the latter sounds more challenging and more fun for me, but re-development would be quite slow and tedious. Still, I'm trying to write a strong engine, regardless of the time frame required, so this may be the option I end up pursuing.

On top of this, I'm also going to be doing a series of Texel tuning experiments over the next several months, which will be posted here: forum3/viewtopic.php?f=7&t=78536. I'm hoping I'll find something interesting, or at the very least, get a good idea of what never to do. This would also align quite nicely with re-building Blunder from the ground up, as it would be very cool in my mind to start from scratch, using my own set of poor yet entirely hand-crafted PSTs, and reach Blunder 7.6.0's strength level by tuning the engine over and over on its own games.
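For readers unfamiliar with Texel tuning, its core is a simple error function: map each position's static evaluation to an expected game score with a sigmoid, then minimize the mean squared error against the actual game results by nudging evaluation parameters. A Go sketch of just that error function (K is an illustrative scaling constant; in practice it's fitted once per dataset):

```go
package main

import (
	"fmt"
	"math"
)

const K = 1.13 // illustrative sigmoid scaling constant, fitted per dataset

// sigmoid converts a centipawn eval into an expected score in [0, 1].
func sigmoid(evalCp float64) float64 {
	return 1.0 / (1.0 + math.Pow(10, -K*evalCp/400))
}

// meanSquaredError compares predicted scores against game results
// (1.0 = win, 0.5 = draw, 0.0 = loss, from the side to move's view).
// A tuner adjusts eval parameters to push this number down.
func meanSquaredError(evals, results []float64) float64 {
	var sum float64
	for i, e := range evals {
		d := results[i] - sigmoid(e)
		sum += d * d
	}
	return sum / float64(len(evals))
}

func main() {
	evals := []float64{120, -80, 10}    // static evals of three sample positions
	results := []float64{1.0, 0.0, 0.5} // outcomes of the games they came from
	fmt.Printf("%.4f\n", meanSquaredError(evals, results))
}
```

Tuning on the engine's own games, as described above, just means regenerating the (eval, result) dataset from self-play after each tuning pass.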

What I know I'll be putting off for a bit longer is neural networks, although I understand Blunder is at a level where the feature would be perfectly reasonable to implement. I'd still rather squeeze what I can out of a custom HCE, and then see what Elo boost a network can give once that's finally exhausted.

But I think a neural network of some sort will be coming soon, within the year. With Blunder, I've always tried to follow the policy that I only add in a feature if I actually understand what it's doing and how it affects the search tree/evaluation, and NNs are no exception. With that said, I'm currently reading through several good resources to understand NNs in-depth, as well as working my way through linear algebra this semester, both of which I hope will give me a solid foundation, mathematical and theoretical, for understanding how NNs work.
User avatar
lithander
Posts: 921
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Progress on Blunder

Post by lithander »

I also think an engine that learns to play chess is more interesting (technically) than one that has a lot of human chess knowledge written into an HCE. And NNUE has proven this approach to be valid. But also just tuning PSTs from a set of annotated games is a form of machine learning. And when I trained my tables I started with just the basic material values that every human child learns together with the rules. The nuances, how to play an opening and win an endgame, all that was something the engine learned "itself" and not from me.

And I think there's a huge blank space between simple PSTs and NNUE that I would be curious to explore. Actually, in MinimalChess I dabbled with it a little: I added a 13th table that encodes knowledge about mobility, pawn structure, and attacking and threatening pieces completely automatically via the tuner. I just supplied the table and how to use it, but initialized it with zeros, in which case it does not affect the evaluation at all. But after a short tuning pass it was worth a good bit of Elo.
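The zero-initialized extra table can be sketched like this (illustrative Go with made-up feature names, not MinimalChess's actual C# code): the feature term is wired into the evaluation but contributes nothing until a tuner fills in the weights.

```go
package main

import "fmt"

// evaluate adds a learned feature term on top of the usual material/PST
// score. With all-zero weights the term is a no-op, so the tuner starts
// from "no knowledge" and discovers the values from game data.
// The three features here (mobility, passed pawns, threats) are
// hypothetical examples.
func evaluate(pstScore, mobility, passedPawns, threats int, w [3]int) int {
	return pstScore +
		w[0]*mobility +
		w[1]*passedPawns +
		w[2]*threats
}

func main() {
	zero := [3]int{}          // before tuning: term contributes nothing
	tuned := [3]int{2, 15, 8} // hypothetical weights after a tuning pass
	fmt.Println(evaluate(35, 12, 1, 2, zero))  // just the PST score
	fmt.Println(evaluate(35, 12, 1, 2, tuned)) // PST score plus learned bonuses
}
```

The appeal of the scheme is exactly what the post describes: the developer supplies only the structure (which features exist and how they're counted), while the weights come from the data.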

I have also written some handcrafted evaluation terms but never committed them to the master branch, even though they added strength. So with Leorik I plan to climb the Elo ladder with machine learning techniques, but step by step. I imagine exploring the whole space between PSTs and NNUE.

And maybe I'll try some fun distractions too. For example imagine a simulated evolution where you mutate the PSTs and let the engines that win procreate! ;)
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: Blunder 7.6.0 Released

Post by mvanthoor »

algerbrex wrote: Thu Dec 30, 2021 8:38 am On a side-note, I'm starting to realize more and more that once your engine gets to around 2400-2500, improvements come more from tweaking and perfecting already existing features...which can be a little tedious, to say the least :lol: since before it may have only taken 500-1500 games to verify a change is an Elo gain (with SPRT testing), and now it's starting to take 3000-8000 games.
That's a lot of games... is my math really off when I count the things I still need to implement?

- Rustic 4 (Tapered Evaluation): 2160
- Rustic 5 (Null move pruning + static null move pruning): + 140 => 2300
- Rustic 6 (Three-stage move generation) +40 => 2340
- Rustic 7 (Aspiration Window + History) +40 (? going by other engines) => 2380
- Rustic 8 (LMR) +100 (? going by other engines) => 2480

Then there are some things such as futility pruning and SEE that I haven't even thought about. There's no pawn hash, pawn structure evaluation, king safety, or anything else... no evaluation yet on top of the PSTs. Still, I expect Rustic to be able to hit 2400-2450 at least, without even touching the evaluation.

But maybe my math is completely off, because most other engines develop the search and the evaluation in tandem, so the Elo gains mentioned for those engines may not map cleanly onto mine. We'll see.
At some point, I may just need to bite the bullet and build some sort of small testing cluster, since I soon anticipate I'll need to regularly be running a couple of thousand games to verify a search or evaluation patch.
What do you have in mind for this testing cluster? I'm looking for something as cheap as possible, with as many cores as possible. Because I've moved my DGT board to a NUC, I have two RPi 4s lying around. Maybe I could somehow build an 8x RPi 4 testing cluster to test with 32 cores, but the Pi 4 gets really hot. If not, I'll build a test computer around an AMD Ryzen 5700G (8 cores) in a mini-ITX case as a dedicated testing rig. (I'd not recommend an 8-core ASUS or Gigabyte NUC-like computer for this, because those systems have fans that scream your head off when the CPU gets hot. Better to build a somewhat bigger system with the biggest cooler and fan you can fit into a case as small as possible.)
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL