Congratulations on the new 3.0 release
Devlog of Leorik
-
- Posts: 881
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Devlog of Leorik
Thanks, Marcel! Briefly it felt like with 3.0 I was achieving a big milestone but the sense of contentment didn't last. Now there are soooo many things, again, that I want to try.
There's always going to be "just one more version" I fear!
-
- Posts: 881
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Devlog of Leorik
Before the release of version 3.0 I posted:
"The reason I couldn't find a better net was that in the search there was one eval-unit denominated margin (like you'd use them for futility pruning) and it needed re-tuning. The new value was almost twice as high as the old one! And suddenly the new networks performed as expected..."
If you watch Leorik 3.0 play you might notice that it's a bit of a drama queen. The scores it gives are always a little too high! You could halve them and then the two engines usually agree more. This doesn't seem to affect play and I could probably just find a mapping function using a tool like https://github.com/official-stockfish/WDL_model and send more intuitive scores to the GUI.
...but if every new network comes out with a different scale that would mean each network requires me to re-tune margins. Or I just get rid of all margins like that. I think ideally search and eval are completely decoupled. Alpha/Beta works equally well on small scales as on large scales. I thought it'd be worth a shot and after a bunch of failed attempts last night I found a solution.
Until yesterday I did something like this:
Code: Select all
//if remainingDepth is [1..5] a nullmove reduction of 4 will mean it goes directly into Qsearch.
//Skip the effort for obvious situations...
if (remainingDepth < 6 && eval > beta + _options.NullMoveCutoff)
    return beta;
...which makes null-move pruning cheaper by avoiding the QSearch evaluation of a position that has a static eval way above beta. How much above was governed by a constant.
Now I do this:
Code: Select all
if (remaining < 6 && _history.IsExpectedFailHigh(eval, beta))
    return beta;
How would the History know when to expect a fail high, though? Well, it's a new type of history I just added for that purpose. Whenever I properly evaluate after a null move and the return value is smaller than beta while the static eval of the original position was above beta I take note. What exactly was that margin? How much above beta?
This allows me to compute a mean margin, and a position is expected to fail high when its static eval is above beta by more than this margin. Sure, you'll have some mispredictions: On the 300 WAC positions searched to depth 15 I missed 2.1M potential skips (played them just to find out that they indeed failed high) and I skipped 2.7M positions that wouldn't have failed high. But the vast majority of 22M positions were guessed correctly. These statistics are even a little better than what I used to get with a freshly tuned but fixed margin. And I noticed that the margin based on live statistics is actually fluctuating wildly based on the position, indicating that there isn't one perfect margin equally fit for all positions anyway.
Luckily it seems like what showed favorable stats is also effective in the wild:
Code: Select all
Score of Leorik-3.0.5 vs Leorik-3.0.4: 1619 - 1501 - 5581 [0.507] 8701
Elo difference: 4.7 +/- 4.4, LOS: 98.3 %, DrawRatio: 64.1 %
...I stopped the test a little early because I was sure that this was too simple for the final version. Then I started to track two different history margins based on the side to move. I also tracked the standard deviation to move the threshold a bit towards safety (e.g. by one standard deviation) and tried all kinds of heuristics in that regard, but nothing was clearly better than what I just explained. When in doubt I like simple, and this refactoring wasn't meant to yield a lot of Elo; it was meant to kill ugly magic numbers and liberate me from the need to retune them.
Anyway, it's one of the rare situations when a novel(?) idea actually worked out so I thought I should share.
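For readers who want to try the idea, here is a minimal sketch of how such a history could look. It only follows the description in the post: record how far the static eval was above beta whenever a real null-move search came back below beta, keep the mean of those margins, and expect a fail high when the eval exceeds beta by more than that mean. The class name `NullMoveHistory`, the `Update` method and the `MeanMargin` property are assumptions made for illustration; only `IsExpectedFailHigh(eval, beta)` appears in the post, and Leorik's actual implementation may differ.

```csharp
using System;

// Hypothetical stand-in for the history described in the post.
// Leorik's real class is not shown there; this is an illustration only.
public sealed class NullMoveHistory
{
    private long _marginSum; // sum of (eval - beta) over all recorded misses
    private long _samples;   // number of recorded misses

    // Call after a real null-move search. A sample is recorded only when the
    // static eval was above beta but the search still returned less than beta.
    public void Update(int eval, int beta, int nullMoveScore)
    {
        if (eval > beta && nullMoveScore < beta)
        {
            _marginSum += eval - beta;
            _samples++;
        }
    }

    // Mean of the recorded margins, or 0 while no sample exists yet.
    public double MeanMargin => _samples > 0 ? (double)_marginSum / _samples : 0.0;

    // Expect a fail high when the eval exceeds beta by more than the mean margin.
    // Without any data this returns false, so the real search is never skipped blindly.
    public bool IsExpectedFailHigh(int eval, int beta)
    {
        return _samples > 0 && eval - beta > MeanMargin;
    }
}
```

In a search you would call `Update` after every null-move search that was actually played and ask `IsExpectedFailHigh` before deciding to skip one. A plain running mean over the whole search is the simplest choice here; an assumption-free reading of the post does not say whether older samples are aged out.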
-
- Posts: 596
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Devlog of Leorik - New version 3.0
Congratulations Thomas! It's been a while since I've thought about chess programming, or touched Blunder. But I'm glad you've kept up with Leorik and implementing NNUE. The progress you've made from the earliest days of MinimalChess to now is very impressive. It's making me want to get back into chess programming. I've missed doing it. And Blunder could definitely use some re-factoring since I'm still not getting the most bang for my buck in terms of the features I have, making the whole thing feel bloated. And there were some nagging bugs that seem to just be baked into Blunder's DNA. Or maybe I'll leave Blunder alone and start up a new engine with the intention of reaching the NNUE stage.
Anyway, I look forward to playing around with Leorik 3.0 and using it as a sparring partner for my (new?) engine...eventually
-
- Posts: 881
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: Devlog of Leorik - New version 3.0
algerbrex wrote: ↑Sat Feb 17, 2024 11:02 pm Congratulations Thomas! It's been a while since I've thought about chess programming, or touched Blunder. But I'm glad you've kept up with Leorik and implementing NNUE. The progress you've made from the earliest days of MinimalChess to now is very impressive.
Thanks! In the past I had the problem of jumping too quickly between projects to leave anything of value. Chess programming was personally important because it allowed me to break that habit. I'll remember MMC & Leorik with contentment and pride even if I eventually move on!
algerbrex wrote: ↑Sat Feb 17, 2024 11:02 pm It's making me want to get back into chess programming. I've missed doing it. And Blunder could definitely use some re-factoring since I'm still not getting the most bang for my buck in terms of the features I have, making the whole thing feel bloated. And there were some nagging bugs that seem to just be baked into Blunder's DNA. Or maybe I'll leave Blunder alone and start up a new engine with the intention of reaching the NNUE stage.
Yes come back! I missed you, too! For a time catching up with a new version of Blunder was a big motivator in my own progress!
I've got the hunch that you'll love NNUE's because it's really the next level of what you already did with gradient descent and yet surprisingly approachable. Opens up new interesting questions too like thinking about network architectures, data generation and filtering...
Leorik 2.X should have you covered with versions each about 100 Elo apart until 2900 Elo
-
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: Devlog of Leorik - New version 3.0
algerbrex wrote: ↑Sat Feb 17, 2024 11:02 pm It's making me want to get back into chess programming. I've missed doing it. And Blunder could definitely use some re-factoring since I'm still not getting the most bang for my buck in terms of the features I have, making the whole thing feel bloated. And there were some nagging bugs that seem to just be baked into Blunder's DNA. Or maybe I'll leave Blunder alone and start up a new engine with the intention of reaching the NNUE stage.
Welcome back. If you think refactoring Blunder would cause you to basically rewrite the engine, you should probably Go and leave it to Rust and C if you can Swift-ly bring Clojure to a new project.
Dang. That was nerdy.
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
Re: Devlog of Leorik - New version 3.0
Bro that was the smoothest string of puns I've ever seen
Creator of Maxwell
-
- Posts: 596
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Devlog of Leorik - New version 3.0
mvanthoor wrote: ↑Tue Feb 20, 2024 11:54 pm
algerbrex wrote: ↑Sat Feb 17, 2024 11:02 pm It's making me want to get back into chess programming. I've missed doing it. And Blunder could definitely use some re-factoring since I'm still not getting the most bang for my buck in terms of the features I have, making the whole thing feel bloated. And there were some nagging bugs that seem to just be baked into Blunder's DNA. Or maybe I'll leave Blunder alone and start up a new engine with the intention of reaching the NNUE stage.
Welcome back. If you think refactoring Blunder would cause you to basically rewrite the engine, you should probably Go and leave it to Rust and C if you can Swift-ly bring Clojure to a new project. Dang. That was nerdy.
Luckily we're all nerds here, solid puns
-
- Posts: 596
- Joined: Sun May 30, 2021 5:03 am
- Location: United States
- Full name: Christian Dean
Re: Devlog of Leorik - New version 3.0
lithander wrote: ↑Sun Feb 18, 2024 2:02 pm Thanks! In the past I had the problem of jumping too quickly between projects to leave anything of value. Chess programming was personally important because it allowed me to break that habit. I'll remember MMC & Leorik with contentment and pride even if I eventually move on!
I'm the same way, and Blunder was the first project I consistently worked on for as long as I did, over a year and a half.
lithander wrote: ↑Sun Feb 18, 2024 2:02 pm Yes come back! I missed you, too! For a time catching up with a new version of Blunder was a big motivator in my own progress!
Ah, now I have to come back
lithander wrote: ↑Sun Feb 18, 2024 2:02 pm I've got the hunch that you'll love NNUE's because it's really the next level of what you already did with gradient descent and yet surprisingly approachable. Opens up new interesting questions too like thinking about network architectures, data generation and filtering...
I think I'll like NNUE's a lot too, especially since I hopefully have enough mathematical maturity under my belt now to understand them deeply, or at least well enough to implement them from scratch. And of course NNUE's utilize a lot of math like linear algebra and calculus, which I like. I think I'm going to spend the rest of this week deciding how I want to get back into the engine programming world.
I'll probably end up starting from scratch since, while I love Blunder, I feel like its capabilities were a little limited by the design decisions I chose, and my not-so-great understanding of some of the search algorithms I plugged into the engine at the time.
I'll just need to decide if I want to stick with Go or use this as an excuse to learn another language...
Sounds like a plan
-
- Posts: 1784
- Joined: Wed Jul 03, 2019 4:42 pm
- Location: Netherlands
- Full name: Marcel Vanthoor
Re: Devlog of Leorik - New version 3.0
eboatwright wrote: ↑Wed Feb 21, 2024 12:39 am Bro that was the smoothest string of puns I've ever seen
That wasn't my intention... as all programmers know, leaving a String of puns to Float around in your codebase will create Real problems in a Class of their own. You need to Double down on those and address them in Short order, or they'll come back to Byte you in the ass.
Ok. I'll let myself out now. But you asked for it