Devlog of Leorik

lithander · Post by **lithander** » Thu Sep 28, 2023 1:15 am

mathmoi wrote: ↑Wed Sep 27, 2023 7:21 pm Did you by any chance forget to reverse the returned value of your qsearch (assuming negamax) before ordering? That would explain why the reverse order performed better.

That was my first suspicion as well. But I stepped through the code pretty carefully. The version that won the 10k games match was in the "wrong" order but self-correcting in the first iteration, because almost all moves were beating the alpha set by the previous move. So after the depth 1 search concludes they are in almost the same order as if you'd ordered them best to worst. My best working theory is that the few moves that had the same QSearch value as their predecessor, and thus ended up at the far end of the root move list, are for some reason really better searched late. Maybe a good strategy is to put average moves last but good and bad moves to the front: good moves because they could actually replace the current PV move, and you want that to happen as early as possible because once you raise alpha, refuting the remaining moves becomes cheaper; bad moves because they are cheap to refute in any case.
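To make the "good and bad first, average last" idea concrete, here is a minimal sketch. It only illustrates the hypothesis and is not Leorik's actual root ordering: the helper name `order_root_moves` and the choice of the median as the definition of "average" are assumptions.

```python
from statistics import median

def order_root_moves(scored_moves):
    """Order (move, score) pairs: best move first, then the remaining moves
    by how far their score is from the median, so 'average' moves come last.

    Hypothetical helper; scores are from the side to move's point of view.
    """
    if not scored_moves:
        return []
    best = max(scored_moves, key=lambda ms: ms[1])
    rest = [ms for ms in scored_moves if ms is not best]
    mid = median(score for _, score in scored_moves)
    # The sort is stable, so moves with equal distance keep their original order.
    rest.sort(key=lambda ms: abs(ms[1] - mid), reverse=True)
    return [best] + rest
```

Whether such an ordering actually helps would still have to be shown in a match; the sketch only pins down what is meant.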

lithander · Post by **lithander** » Thu Sep 28, 2023 1:29 am

I just released Version 2.5

Code:

Score of Leorik-2.5 vs Leorik-2.4: 1048 - 319 - 633  [0.682] 2000
...      Leorik-2.5 playing White: 544 - 135 - 321  [0.705] 1000
...      Leorik-2.5 playing Black: 504 - 184 - 312  [0.660] 1000
...      White vs Black: 728 - 639 - 633  [0.522] 2000
Elo difference: 132.7 +/- 13.0, LOS: 100.0 %, DrawRatio: 31.6 %
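For readers wondering how the last line follows from the win/loss/draw counts: the sketch below uses the standard logistic Elo formula and a normal approximation for the error margin. That the match tool computes its margin exactly this way is an assumption; the function names are made up for this example.

```python
import math
from statistics import NormalDist

def elo_from_score(score):
    """Elo difference implied by an expected score in (0, 1), logistic model."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_summary(wins, losses, draws, confidence=0.95):
    n = wins + losses + draws
    w, l, d = wins / n, losses / n, draws / n
    mu = w + d / 2.0
    # Per-game variance of the result (1, 0.5 or 0), then the standard error
    # of the mean score over n games.
    var = w * (1.0 - mu) ** 2 + d * (0.5 - mu) ** 2 + l * (0.0 - mu) ** 2
    stderr = math.sqrt(var / n)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    lo, hi = mu - z * stderr, mu + z * stderr
    return {
        "score": mu,
        "elo": elo_from_score(mu),
        "margin": (elo_from_score(hi) - elo_from_score(lo)) / 2.0,
        "draw_ratio": d,
    }
```

Calling `match_summary(1048, 319, 633)` gives a score of 0.682 and comes out at about 132.7 Elo with a margin of about 13, in line with the output above.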

I have written about pretty much all new features already somewhere in this devlog but to recap the most important ones:

I have migrated from .NET 6 to .NET 8 for better performance and compatibility with the latest .NET features.
Leorik now employs a faster move generator based on PEXT by default but also includes the previous and other sliding-piece move generators, which are selectable via a compiler define.
Advanced Piece-Square Values: Leorik 2.5 replaces the standard PSQTs with linear functions that calculate the piece-square values using 18 parameters reflecting the positions of both kings and the game phase. Computing these formulas on the fly at acceptable speed requires AVX2 hardware support, which enables the engine to execute multiply-and-add operations on 8 values simultaneously in a single instruction.
Parallel Search: Leorik now supports the 'Threads' option in the Universal Chess Interface (UCI). The engine will launch a separate search instance on each thread. Search instances can share information with each other through a lockless transposition table.
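The post doesn't say how the transposition table is made lockless. One well-known scheme is to store the key XOR-ed with the data, so that a half-written entry fails validation on the next probe. The toy below shows only that check; it is an assumption that Leorik does something similar, and Python is used here purely for illustration (the class and field names are invented).

```python
MASK64 = (1 << 64) - 1

class XorTable:
    """Toy transposition table using the XOR-validation trick.

    Each slot holds two 64-bit words: (key ^ data) and data. A reader
    recomputes the key from both words; if another thread overwrote only one
    of them, the check fails and the entry is treated as a miss.
    """

    def __init__(self, size):
        self.size = size
        self.check = [0] * size   # key ^ data
        self.data = [0] * size    # packed score, depth, move, ...

    def store(self, key, data):
        i = key % self.size
        self.check[i] = (key ^ data) & MASK64
        self.data[i] = data & MASK64

    def probe(self, key):
        i = key % self.size
        data = self.data[i]
        if self.check[i] ^ data == key & MASK64:
            return data
        return None
```

The appeal of this design is that no thread ever waits: a torn write costs one lost entry instead of a lock on every probe.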

Last but not least, Leorik is now published under the permissive MIT License. Feel free to fork and play around with the source. 90% of the ideas I tried lately failed or only yielded negligible improvements, so if someone with great ideas but no chess engine of their own wants to impact the future development, now you can do that without any worry of trespassing. (*cough* Daniel *cough* ;)

The new version uses all kinds of exotic intrinsics like PEXT and AVX2. I made RasPi, Linux and Mac compiles but couldn't really test them. So let me know if it still works on your computer and maybe share the 'nps' reported by v2.5 vs v2.4. For reference and using 'go depth 20' as a benchmark I get 5.0M nps with version 2.5 (PEXT) and 6.9M nps with version 2.4 on my AMD 5900X.

Guenther · Post by **Guenther** » Thu Sep 28, 2023 11:52 am

lithander wrote: ↑Thu Sep 28, 2023 1:29 am I just released Version 2.5

Thanks for the new release Thomas!

I have some information which hopefully fits in here (otherwise you can mention it on your github).
A few days ago I tried to compile Leorik 2.5 on my old hardware (with Win7!) and succeeded by changing the target to .Net7 because
I was too lazy to update my .Net version again and I had the feeling you would soon release an official one anyway :)

The compilation went smoothly, but trying to run the binary immediately crashed my machine and I had to shut it down.
Well, I thought I'd wait for the new official release, and now I tried it today and it crashed my machine too ;)
All symptoms pointed to an attempt to allocate an enormous amount of memory, something like > 4GB. (ofc I couldn't check the task manager anymore)

Heck, I had already planned before to finally buy a new one in late fall/winter, but as usual I was too curious what might be the reason
and if there might be a fix/workaround.

So here it is (found by a quick google research):
https://github.com/dotnet/runtime/issues/79469
(crazy memory behaviour in Win7 since .Net7)

The exact solution is in this:
https://github.com/dotnet/runtime/issue ... 1371202114

The only way to make .NET 7 apps work on Win7 is to set these system environment variables:
set DOTNET_GCName=clrgc.dll
set DOTNET_EnableWriteXorExecute=0
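If changing system-wide settings is not desirable, the same two variables can probably be applied to the engine process only, since .NET reads its `DOTNET_*` settings from the process environment at startup. The sketch below does that from a launcher script; this per-process variant is untested on Win7, and the executable path passed to `launch_engine` is a placeholder.

```python
import os
import subprocess

# The two variables from the workaround above.
WIN7_WORKAROUND = {
    "DOTNET_GCName": "clrgc.dll",
    "DOTNET_EnableWriteXorExecute": "0",
}

def engine_env(base=None):
    """Return a copy of the environment with the two workaround variables set."""
    env = dict(os.environ if base is None else base)
    env.update(WIN7_WORKAROUND)
    return env

def launch_engine(path):
    """Start the engine with the workaround applied to this process only."""
    return subprocess.Popen(path, env=engine_env(),
                            stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                            text=True)
```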

Edit
another explanation:
https://github.com/dotnet/msbuild/issue ... 1666903883

After these changes the Leorik (win classic = for old hardware) 2.5 binary worked normally here.

Code:

Leorik 2.5 Classic
uci
id name Leorik 2.5
id author Thomas Jahn
option name Hash type spin default 50 min 1 max 2047
option name Threads type spin default 1 min 1 max 8
uciok
isready
readyok
position startpos
go infinte
info depth 1 score cp 34 nodes 21 nps 2100 time 10 pv e2e4
info depth 2 score cp 0 nodes 90 nps 2647 time 34 pv e2e4 g8f6
info depth 3 score cp 28 nodes 260 nps 7027 time 37 pv g1f3 g8f6 c2c4
info depth 4 score cp 0 nodes 612 nps 15692 time 39 pv g1f3 g8f6 c2c4 c7c5
info depth 5 score cp 26 nodes 940 nps 22926 time 41 pv g1f3 g8f6 d2d4 d7d5 b1d2
info depth 6 score cp 0 nodes 1767 nps 39266 time 45 pv g1f3 d7d5 d2d4 g8f6 b1d2 b8d7
info depth 7 score cp 34 nodes 5904 nps 85565 time 69 pv d2d4 d7d5 b1d2 g8f6 c2c4 e7e6 g1f3
info depth 8 score cp 32 nodes 28493 nps 158294 time 180 pv e2e4 e7e5 g1f3 b8c6 d2d4 e5d4 f3d4 d7d5
info depth 9 score cp 36 nodes 37494 nps 166640 time 225 pv e2e4 e7e5 g1f3 b8c6 d2d4 e5d4 f3d4 d7d5 b1c3
info depth 10 score cp 22 nodes 54963 nps 177873 time 309 pv e2e4 e7e5 g1f3 d7d5 f3e5 d5e4 d2d4 f7f5 f1b5 c7c6
info depth 11 score cp 35 nodes 99692 nps 201397 time 495 pv e2e4 e7e5 g1f3 g8f6 b1c3 b8c6 f1b5 c6d4 e1g1 f8d6 f3d4
info depth 12 score cp 31 nodes 157788 nps 254086 time 621 pv e2e4 e7e5 g1f3 g8f6 b1c3 b8c6 d2d4 e5d4 f3d4 c6d4 d1d4 f8d6
info depth 13 score cp 33 nodes 313350 nps 352078 time 890 pv e2e4 e7e5 g1f3 g8f6 f3e5 d7d6 e5f3 f6e4 d2d3 e4f6 d3d4 d6d5 f1d3
info depth 14 score cp 5 nodes 849543 nps 585891 time 1450 pv e2e4 e7e5 g1f3 g8f6 b1c3 f8b4 f3e5 d7d6 e5c4 b4c3 d2c3 f6e4 f1d3 d8h4
info depth 15 score cp 34 nodes 1460008 nps 716744 time 2037 pv e2e4 e7e5 g1f3 g8f6 b1c3 f8b4 f1c4 e8g8 e1g1 d7d6 f1e1 f8e8 d2d3 b8c6 h2h3
info depth 16 score cp 15 nodes 2763188 nps 838854 time 3294 pv e2e4 e7e5 g1f3 g8f6 b1c3 f8b4 f1c4 b4c3 d2c3 d7d6 d1e2 b8c6 e1g1 e8g8 c1g5 h7h6
info depth 17 score cp 33 nodes 5708716 nps 921057 time 6198 pv e2e4 e7e5 g1f3 b8c6 d2d4 e5d4 f3d4 g8f6 d4c6 b7c6 e4e5 d8e7 d1e2 f6d5 e2h5 e7b4 c2c3
info depth 18 score cp 19 nodes 10336446 nps 960279 time 10764 pv e2e4 e7e5 g1f3 b8c6 f1c4 g8f6 f3g5 d7d5 e4d5 c6a5 c4b5 c7c6 d5c6 b7c6 b5e2 f8e7 e1g1 e8g8
info depth 19 score cp 27 nodes 19468381 nps 985840 time 19748 pv e2e4 e7e5 g1f3 g8f6 f3e5 d7d6 e5f3 f6e4 d2d4 d6d5 f1d3 f8e7 b1c3 f7f5 e1g1 e8g8 c1f4 b8c6 h2h3
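For anyone who wants to pull the 'nps' figure out of such a log automatically, here is a small sketch of a parser for UCI `info` lines. It handles only the fields that appear in the output above; the function name is made up for this example.

```python
def parse_info(line):
    """Parse a UCI 'info' line into a dict; the pv becomes a list of moves."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        return None
    info = {}
    i = 1
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":
            # Everything after 'pv' is the principal variation.
            info["pv"] = tokens[i + 1:]
            break
        if key == "score" and i + 2 < len(tokens):
            info["score"] = (tokens[i + 1], int(tokens[i + 2]))  # e.g. ("cp", 27)
            i += 3
        elif key in ("depth", "nodes", "nps", "time") and i + 1 < len(tokens):
            info[key] = int(tokens[i + 1])
            i += 2
        else:
            i += 1  # skip fields this sketch doesn't handle
    return info
```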

So if you have old hardware with old Win7 (a likely combination) and Leorik immediately crashes your computer, don't resign,
but make the changes above.

Guenther

JoAnnP38 · Post by **JoAnnP38** » Fri Sep 29, 2023 1:33 am

lithander wrote: ↑Thu Sep 28, 2023 1:29 am I just released Version 2.5

Congratulations! I bet Graham and Lars are loving all the new engine versions that have been released recently. Releasing a Leorik with multi-threaded search will definitely motivate me to finally address this shortcoming in Pedantic. Currently, I am tearing my evaluation apart and rebuilding it because it just needed to be cleaned up. Hopefully, I can garner a point or two of Elo from the expected performance enhancement.

lithander · Post by **lithander** » Fri Sep 29, 2023 3:00 am

Guenther wrote: ↑Thu Sep 28, 2023 11:52 am So if you have an old hardware with old Win7 (a likely combination) and Leorik immediately crashes your comp don't resign,
but do the above changes.

I've never heard of that trick! It's great to know that Leorik is still usable on older systems and even Windows 7!

Guenther wrote: ↑Thu Sep 28, 2023 11:52 am After this changes the Leorik (win classic = for old hardware) 2.5 binary worked normally here.
Code:
info depth 19 score cp 27 nodes 19468381 nps 985840 time 19748 pv e2e4 e7e5 g1f3 g8f6 f3e5 d7d6 e5f3 f6e4 d2d4 d6d5 f1d3 f8e7 b1c3 f7f5 e1g1 e8g8 c1f4 b8c6 h2h3

Less than 1M nodes per second... I wonder if the AVX2-based evaluation is to blame here. Do you still have Leorik 2.4 on that computer? If yes, I'd be curious to see how much 'nps' that version achieves on the same position.

JoAnnP38 wrote: ↑Fri Sep 29, 2023 1:33 am Congratulations! I bet Graham and Lars are loving all the new engine versions that have been released recently.

Thanks! I hope they aren't getting overwhelmed, though. I really appreciate the rather long time controls they use because that's a big blind spot in my own testing. This time I'm really curious where Leorik 2.5 will settle.

JoAnnP38 wrote: ↑Fri Sep 29, 2023 1:33 am Releasing a Leorik with multi-threaded search will definitely motivate me to finally address this short-coming in Pedantic.

You said you already have a lockless TT, so you may find that the rest is easier than you thought. At least I was positively surprised: I already had a SearchInstance class that searches a position in a pretty self-contained way. Now I just spawn a bunch of them in parallel via Parallel.For and that's all. For me the main effort of that feature was to make the TT lockless.
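The structure described here (independent search instances whose only shared state is the transposition table) can be sketched as follows. This is a Python analogue with a thread pool standing in for Parallel.For, not Leorik's C# code; the search itself is a placeholder and the `SearchInstance` shown is a stand-in that merely borrows the name.

```python
from concurrent.futures import ThreadPoolExecutor

class SearchInstance:
    """Stand-in for a self-contained searcher; the real search is omitted."""

    def __init__(self, worker_id, shared_tt):
        self.worker_id = worker_id
        self.tt = shared_tt          # the only state shared between workers

    def search(self, position, depth):
        # Placeholder: a real instance would run iterative deepening here and
        # read/write the shared table as it goes.
        self.tt[(position, depth)] = self.worker_id
        return self.worker_id, depth

def parallel_search(position, depth, threads):
    shared_tt = {}
    workers = [SearchInstance(i, shared_tt) for i in range(threads)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        futures = [pool.submit(w.search, position, depth) for w in workers]
        return [f.result() for f in futures], shared_tt
```

Because the workers never talk to each other directly, adding threads does not change the search code at all; everything hinges on the shared table tolerating concurrent access.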

JoAnnP38 wrote: ↑Fri Sep 29, 2023 1:33 am Currently, I am tearing my evaluation apart and rebuilding because it just needed to be cleaned up. Hopefully, I can garner a point or two of Elo from the expected performance enhancement.

Straight up performance optimization is the nicest way to get Elo, imo. If your code produces the same result but faster you know you've succeeded. It's as simple as that.

I'm sure Leorik has room for improvement performance-wise as well. But currently I'm all out of ideas.

Guenther · Post by **Guenther** » Fri Sep 29, 2023 12:15 pm

lithander wrote: ↑Fri Sep 29, 2023 3:00 am
Guenther wrote: ↑Thu Sep 28, 2023 11:52 am So if you have an old hardware with old Win7 (a likely combination) and Leorik immediately crashes your comp don't resign,
but do the above changes.
I've never heard of that trick! It's great to know that Leorik is still usable on older systems and even Windows 7! :)
Guenther wrote: ↑Thu Sep 28, 2023 11:52 am After this changes the Leorik (win classic = for old hardware) 2.5 binary worked normally here.
Code:
info depth 19 score cp 27 nodes 19468381 nps 985840 time 19748 pv e2e4 e7e5 g1f3 g8f6 f3e5 d7d6 e5f3 f6e4 d2d4 d6d5 f1d3 f8e7 b1c3 f7f5 e1g1 e8g8 c1f4 b8c6 h2h3
Less than 1M nodes per second... I wonder if the the AVX2 based evaluation is to blame here. Do you still have Leorik 2.4 on that computer? If yes, I'd be curious to see how much 'nps' that version achieves on the same position.

Here it is!
...around 1.5Mn/s at depth 19, but time to depth is ofc better for the newer version.

Code:

Leorik 2.4
...
position startpos
go infinite
...
info depth 19 score cp 38 nodes 35914077 nps 1506020 time 23847 pv e2e4 e7e5 g1f3 g8f6 d2d4 f6e4 f1d3 d7d5 f3e5 f8d6 e1g1 e8g8 c2c4 f8e8 c4d5 d6e5 d4e5 e8e5 d3e4

JoAnnP38 · Post by **JoAnnP38** » Fri Sep 29, 2023 2:00 pm

lithander wrote: ↑Fri Sep 29, 2023 3:00 am
JoAnnP38 wrote: ↑Fri Sep 29, 2023 1:33 am Congratulations! I bet Graham and Lars are loving all the new engine versions that have been released recently.
Thanks! I hope they aren't getting overwhelmed, though. I really appreciate the rather long time controls they use because that's a big blind spot in my own testing. This time I'm really curious where Leorik 2.5 will settle.

I have been trying (unsuccessfully so far) to test Pedantic for both blitz (2+1) and 40/15 (12+8); however, I haven't managed to replicate their secret sauce yet. My own ratings are generally lower than the CCRL ratings. For instance, I rated version 0.4.1 at 2819 Elo whereas CCRL rates it at 2886. The difference may be related to their using the bayeselo utility to determine Elo whereas I'm relying on the output from cutechess. What is interesting is that CCRL appears to rerun bayeselo weekly, so ratings can change even when your engine doesn't play any new matches! I'm guessing that this is due to the Elo of engines it has played in the past changing. I may try to start using bayeselo or perhaps ordo for calculating my Elo for 0.5.0.

JoAnnP38 wrote: ↑Fri Sep 29, 2023 1:33 am Releasing a Leorik with multi-threaded search will definitely motivate me to finally address this short-coming in Pedantic.

lithander wrote: ↑Fri Sep 29, 2023 3:00 am You said you already have a lockless TT so you may find that the rest is easier than you thought. At least I was positively surprised: I already had a SearchInstance class that searches a position pretty self contained. Now I just spawn a bunch of them in parallel via Parallel.For and that's all. For me the main effort of that feature was to make the TT lockless.

I'm curious about how much overhead there is in spinning up a thread and whether or not using a thread pool that keeps all of the threads active would be significantly beneficial. Originally, I also tried to share the hash tables for both evaluation and pawn evaluations using the same technique as the main transposition table. However, on my first attempt to introduce MT several months ago, my search throughput decreased. I think it was because I was also trying to share my TimeClock object across all threads, which required some synchronization for some updates. At the time I had many more important problems to solve, so I reverted all of those changes.

lithander · Post by **lithander** » Fri Sep 29, 2023 2:43 pm

Guenther wrote: ↑Fri Sep 29, 2023 12:15 pm Here it is!
...around 1.5Mn/s at depth 19, but time to depth is ofc better for the newer version.

Thanks!

Okay so the old version was 50% faster. The ratio looks about right: it was 40% on my computer between the two versions!

But in absolute values my computer appears to be 5x faster than yours when running Leorik.
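The arithmetic behind these figures, using the numbers posted in this thread (the helper name is made up for this example, and the machine comparison mixes a depth 20 and a depth 19 run, so it is only a rough estimate):

```python
def speedup(old_nps, new_nps):
    """How much faster the old version is, as a fraction (0.5 means 50% faster)."""
    return old_nps / new_nps - 1.0

guenther = speedup(1_506_020, 985_840)   # depth-19 lines quoted in this thread
lithander = speedup(6.9e6, 5.0e6)        # 'go depth 20' figures from the release post
machine_ratio = 5.0e6 / 985_840          # v2.5 nps on both machines
```

This gives roughly 53% and 38% for the two machines, and a ratio of about 5.1 between them.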

Modern Times · Post by **Modern Times** » Sat Nov 04, 2023 9:19 am

JoAnnP38 wrote: ↑Fri Sep 29, 2023 2:00 pm I have been trying (unsuccessfully so far) to test Pedantic for both blitz (2+1) and 40/15 (12+8); however, I haven't managed to replicate their secret sauce yet. My own ratings are generally lower than the CCRL ratings. For instance, I rated version 0.4.1 at 2819 Elo whereas CCRL rates it at 2886. The difference may be related to their using the bayeselo utility to determine Elo whereas I'm relying on the output from cutechess.

I may be wrong but I think CuteChess uses Glicko-2. It is open source isn't it, so you should be able to check.

lithander · Post by **lithander** » Sat Dec 02, 2023 6:06 pm

Time flies by fast. It's already been 2 months since I published version 2.5, enough time to observe it play in the wild. Here's a quick summary:

Leorik has just won the 104th Amateur Series Division 7 with a comfortable lead!
On the CCRL 40/15 list Leorik is listed at 2900 Elo and on the Blitz list at 2940 Elo. That's +110 Elo and +80 Elo, respectively, above the previous version.
On the CEGT list Leorik 2.5 is listed at 2764 Elo, +140 over the previous version.

The main feature of 2.5 was my experiment to compute Piece-Square values with linear functions at runtime based on the positions of both kings and the game phase. Afaik it's a unique evaluation approach, and it's the first time I've utilized AVX2 hardware explicitly in any of my code. (The 256-bit wide registers allow the engine to compute 8 multiply-and-add operations simultaneously in a single instruction; without this the idea would have been too slow to be viable.) I'm pretty happy that this approach is working so well in practice!
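As a rough illustration of what "piece-square values as linear functions" means: each (piece, square) pair gets a weight vector, and its value is the dot product of those weights with a feature vector derived from the two king positions and the game phase. Only the count of 18 comes from the posts; the feature layout below is invented for the example and is not Leorik's actual parameterization.

```python
N_FEATURES = 18  # count taken from the post; what each feature means is assumed here

def king_phase_features(own_king_sq, opp_king_sq, phase):
    """Build an 18-entry feature vector (hypothetical layout).

    Squares are 0..63, phase runs from 0.0 (endgame) to 1.0 (midgame).
    """
    def file_rank(sq):
        return (sq % 8) / 7.0, (sq // 8) / 7.0

    of, orank = file_rank(own_king_sq)
    xf, xrank = file_rank(opp_king_sq)
    base = [1.0, of, orank, xf, xrank,
            of * of, orank * orank, xf * xf, xrank * xrank]
    # The same nine terms, once weighted by phase and once by (1 - phase).
    return [b * phase for b in base] + [b * (1.0 - phase) for b in base]

def piece_square_value(weights, features):
    """Linear function: dot product of per-(piece, square) weights and features."""
    assert len(weights) == len(features) == N_FEATURES
    return sum(w * f for w, f in zip(weights, features))
```

The features only change when a king moves or the phase changes, but the dot product has to be evaluated for every piece on the board, which is where processing 8 multiply-and-add operations per instruction pays off.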

And also the support for multi-threading didn't go unnoticed and seems to work as expected.

In the past weeks most of my development work has been spent in the https://github.com/lithander/Leorik/commits/bigdata branch. This is meant to prepare a future transition to an NNUE-based evaluation. Employing neural networks is the one big idea in contemporary computer chess that I'm still really curious about trying. Now you could argue that there are a lot of other established techniques left that Leorik doesn't have implemented. This is true, but some of them feel "oddly specific" to me, a lot of added complexity for a bit more Elo. And I have the feeling that I made a few decisions in the past that I would have to undo first to fully benefit from them: for example, Leorik's search is much simpler in terms of LMR, extensions and reductions, and so it seems to benefit less from spending effort on sorting the moves as well as possible.
But who knows, maybe some of the approaches I've taken are even worth preserving? (At least a few programmers have mentioned that they enjoyed reading the code.) So I think I'll stay on the trajectory I have taken.

I'm not going to obsess over reaching 3000 Elo with HCE eval even though that was my hope originally. There's going to be a final version 2.6 and it's going to be stronger than 2.5, but then the next big step forward in strength is going to be Leorik 3, featuring an NNUE eval trained exclusively on Leorik's selfplay games.
