Devlog of Leorik

eboatwright · Post by **eboatwright** » Wed Feb 21, 2024 5:09 pm

mvanthoor wrote: ↑Wed Feb 21, 2024 10:25 am
eboatwright wrote: ↑Wed Feb 21, 2024 12:39 am Bro that was the smoothest string of puns I've ever seen
That wasn't my intention... as all programmers know, leaving a String of puns to Float around in your codebase will create Real problems in a Class of their own. You need to Double down on those and address them in Short order, or they'll come back to Byte you in the ass.

Ok. I'll let myself out now. But you asked for it

I did ask for it with "String" hahahaha

lithander · Post by **lithander** » Mon Feb 26, 2024 5:14 pm

Leorik has now played a few long time control matches in Graham's Amateur Series and in a gauntlet and got listed on the CCRL 40/15 rating list with 3229 Elo.

I have downloaded the PGNs and evaluated them with the EAS Tool v 5.61:

                                 bad  avg.win 
Rank  EAS-Score  sacs   shorts  draws  moves  
----------------------------------------------
   1    141397  23.08%  25.38%  22.82%   54   
----------------------------------------------

Those seem to be pretty interesting stats! Short games and quite willing to sacrifice material. Here's an example where Leorik wins against a stronger engine despite being 6 pawns down in the end.

[pgn][Event "CCRL 40/15"] [Site "CCRL"] [Date "2024.02.20"] [Round "921.7.139"] [White "Leorik 3.0 64-bit"] [Black "Laser 1.7 64-bit"] [Result "1-0"] [WhiteElo "3229"] [BlackElo "3255"] [ECO "B12"] [Opening "Caro-Kann"] [Variation "advance variation"] [PlyCount "116"] 1. d4 c6 2. e4 d5 3. e5 Bf5 4. Nf3 e6 5. Be2 Nd7 6. O-O h6 7. Be3 Qb6 8. b3 Qa5 9. c4 Ne7 10. Bd2 Qd8 11. Nc3 g5 12. cxd5 cxd5 13. h3 Nc6 14. Be3 Ba3 15. Bd3 Bb2 16. Bxf5 Bxa1 17. Nb5 exf5 18. Nd6+ Kf8 19. Qxa1 f4 20. Bc1 Nb6 21. Ba3 Kg7 22. Qb1 Qd7 23. Nh2 Qe6 24. Nf5+ Kg8 25. Rc1 Rh7 26. Bd6 a5 27. Re1 Kh8 28. a3 Nd7 29. b4 axb4 30. axb4 Qg6 31. Ng4 h5 32. Nf6 Nxf6 33. exf6 f3 34. g3 Rd8 35. Bc7 Rc8 36. Be5 Ra8 37. Rc1 Rg8 38. b5 Na5 39. Rc7 b6 40. Kh2 Nc4 41. Qd3 Nb2 42. Qc2 Nc4 43. Rd7 Na3 44. Qd3 Nxb5 45. Rxd5 Nc3 46. Rd7 b5 47. d5 Ne2 48. Qe4 Nf4 49. Re7 Nxh3 50. Qxf3 Nf4 51. d6 Qxf5 52. d7 Qh3+ 53. Kg1 Ne6 54. Qc6 Nd8 55. Qc8 Qe6 56. Bc7 Qxf6 57. Re8 Qa1+ 58. Kg2 Rg7 1-0[/pgn]

Mike Sherwin · Post by **Mike Sherwin** » Wed Mar 27, 2024 8:17 pm

I was out of commission for a while because of an electrical storm last fall. When I got my older computer up and running I couldn't find my passwords. I tried to have my password sent to me. It claimed to have sent it to my email but it never arrived. So I gave up, but I am back thanks to HGM.

So first of all congratulations! And I won't be trying to beat this version, lol.

Also, I'm hoping I can convince you not to give up on your PSTBLs and move to NNUE quite yet. I'm hoping you can try real-time learning to modify your PSTBLs first. Just play some number of internal games and accumulate bonuses and penalties to modify your PSTBLs in real time, say using from half to two-thirds of the allotted time. Each piece on its square for the winning side gets a small bonus for every move, and each piece on its square for the losing side gets a small penalty. Draws produce a tiny penalty for both sides. Since each training game modifies the PSTBLs, each training game tends to vary at some point. Alternatively, the positions of the training games can be loaded into the position hash and the score modified for each position of each game. This is what I did with after-game learning in RomiChess. The subtree for the current position stored on the HD is loaded into the position hash before each search. One hundred stored games from any given position were enough to win against the strongest engines of the time. And the time limit for learning games mattered very little. So playing a hundred very fast games would bring very valuable information back from a hundred or more ply beyond the normal horizon of your search. I hope someone more capable than myself will give real-time learning a try.
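To make the bonus/penalty idea above concrete, here is a minimal sketch of the table update after one finished training game. It is only an illustration: the names `new_pst` and `update_pst`, the position encoding, and the bonus sizes are all assumptions, not code from RomiChess or Leorik.

```python
PIECES = "PNBRQK"

def new_pst():
    # One table per (color, piece) with 64 square entries, all starting at zero.
    return {(c, p): [0.0] * 64 for c in "wb" for p in PIECES}

def update_pst(pst, game, result, bonus=1.0, draw_penalty=0.1):
    """Adjust the tables from one finished training game.

    game   -- list of positions; each position is a list of (color, piece, square)
    result -- 'w' or 'b' for the winner, or 'd' for a draw
    The bonus and penalty sizes are placeholder values (assumption).
    """
    for position in game:
        for color, piece, square in position:
            if result == "d":
                delta = -draw_penalty   # tiny penalty for both sides
            elif color == result:
                delta = bonus           # small bonus for the winner's pieces
            else:
                delta = -bonus          # small penalty for the loser's pieces
            pst[(color, piece)][square] += delta
    return pst
```

Because the tables change after every game, the next internal game is searched with slightly different values, which is where the variation between training games would come from.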

kelseyde123 · Post by **kelseyde123** » Mon Jul 08, 2024 7:27 pm

lithander wrote: ↑Tue Feb 06, 2024 10:41 am The architecture is (768->256)x2->1 which means that the network uses 768 inputs (2 colors x 6 piece-types x 64 squares) and one hidden layer of 256 neurons. It maintains two separate accumulators (from black's perspective and from white's perspective) and has two sets of output weights to allow the network to learn tempo.
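As a rough illustration of the quoted (768->256)x2->1 layout, the forward pass could look like the sketch below. Several details are assumptions rather than Leorik's actual code: the feature indexing, sharing one set of hidden weights between both perspectives, the plain ReLU activation, and all names (`feature_index`, `accumulate`, `evaluate`, `random_net`).

```python
import random

INPUTS, HIDDEN = 768, 256  # 2 colors x 6 piece types x 64 squares -> 256 neurons

def feature_index(perspective, color, piece_type, square):
    """Map a piece to one of the 768 inputs as seen from `perspective`.

    perspective, color: 0 = white, 1 = black; piece_type: 0..5; square: 0..63 (a1 = 0).
    For black's perspective the colors are swapped and the board is mirrored
    vertically (square ^ 56), so "us" always occupies the same input block.
    """
    if perspective == 1:
        color, square = 1 - color, square ^ 56
    return color * 384 + piece_type * 64 + square

def accumulate(net, pieces, perspective):
    # Hidden-layer pre-activations: bias plus one weight row per piece on the board.
    acc = list(net["hidden_bias"])
    for color, piece_type, square in pieces:
        row = net["hidden_weights"][feature_index(perspective, color, piece_type, square)]
        for i in range(HIDDEN):
            acc[i] += row[i]
    return acc

def evaluate(net, pieces, side_to_move):
    # Two accumulators, ordered as (side to move, opponent).
    white = accumulate(net, pieces, 0)
    black = accumulate(net, pieces, 1)
    us, them = (white, black) if side_to_move == 0 else (black, white)
    # Two sets of output weights: one for "us", one for "them" (ReLU is assumed).
    out = net["out_bias"]
    out += sum(max(0.0, a) * w for a, w in zip(us, net["out_us"]))
    out += sum(max(0.0, a) * w for a, w in zip(them, net["out_them"]))
    return out

def random_net(seed=0):
    # Random weights, only so the sketch can be run end to end.
    rng = random.Random(seed)
    r = lambda: rng.uniform(-0.1, 0.1)
    return {
        "hidden_weights": [[r() for _ in range(HIDDEN)] for _ in range(INPUTS)],
        "hidden_bias": [r() for _ in range(HIDDEN)],
        "out_us": [r() for _ in range(HIDDEN)],
        "out_them": [r() for _ in range(HIDDEN)],
        "out_bias": r(),
    }
```

In this sketch the output layer always sees both accumulators, with the side to move first, so a position and its color-flipped mirror with the other side to move produce the same score.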

Hello Lithander, sorry to dig up a comment of yours from months ago, but I've a question about your chosen NNUE architecture for Leorik.

If I understand correctly, you maintain two sets of output weights for the two feature sets, and then during inference you pass in the 'us' features and the 'them' features together (white/black or black/white, depending on the side-to-move).

My question is, since the 'them' features are just the flipped equivalent of the 'us' features, what extra benefit do you get by using both during inference, rather than just the 'us' features? You mention that it helps the network to learn tempo - how does that work exactly?

I'm currently trying to write my own NNUE trainer, and so far I've been working with a 768->256->1 architecture, passing in only the 'us' features each time. But I've seen so much written about the value of having the 'perspective' of both sides that I'm starting to think that I'm missing something fundamental.

Any insight would be much appreciated, thanks!

lithander · Post by **lithander** » Tue Jan 14, 2025 2:30 pm

It’s been a while since my last update. And Dan & Mike, I'm really sorry I didn't reply to your comments!

Over the past year, I went through some big personal changes, including splitting with my wife and adjusting to having my sons half the time. I wasn’t able to concentrate or relax enough to do any hobby programming during that time, and to be honest, I also didn’t see much point in it.

Now that life has settled down a bit, I’ve started toying with my chess engine again.

I’m training a new neural network from scratch. I have more powerful hardware by now and a lot more patience. When I implemented NNUE for Leorik 3.0 I used the large repository of labeled positions I had already accumulated for tuning the HCE and so even the first networks I trained were quite strong. But something about the journey of starting with no prior knowledge other than the basic rules of chess and a decent tree search algorithm and improving through selfplay is really fascinating.

I’m currently training a new neural network from scratch, with more powerful hardware and a lot more patience. When I implemented NNUE for Leorik 3.0, I used my large repository of labeled positions, originally accumulated for tuning the HCE, so even the first networks I trained were already quite strong.

But there’s something really fascinating about starting with no means to evaluate a position, no prior knowledge other than the basic rules of chess and a decent tree search algorithm, and then watching the engine improve purely through self-play.

algerbrex · Post by **algerbrex** » Wed Jan 15, 2025 5:27 pm

lithander wrote: ↑Tue Jan 14, 2025 2:30 pm It’s been a while since my last update. And Dan & Mike, I'm really sorry I didn't reply to your comments!

Over the past year, I went through some big personal changes, including splitting with my wife and adjusting to having my sons half the time. I wasn’t able to concentrate or relax enough to do any hobby programming during that time, and to be honest, I also didn’t see much point in it.

Now that life has settled down a bit, I’ve started toying with my chess engine again.

Glad to see you back Thomas, and good to hear you're doing well now. I've enjoyed using Leorik. Taking a break is definitely needed sometimes. I've only recently gotten back into chess programming myself, after a two-year break, with the creation of my new engine Eques.

lithander wrote: ↑Tue Jan 14, 2025 2:30 pm I’m training a new neural network from scratch. I have more powerful hardware by now and a lot more patience. When I implemented NNUE for Leorik 3.0 I used the large repository of labeled positions I had already accumulated for tuning the HCE and so even the first networks I trained were quite strong. But something about the journey of starting with no prior knowledge other than the basic rules of chess and a decent tree search algorithm and improving through selfplay is really fascinating.

I’m currently training a new neural network from scratch, with more powerful hardware and a lot more patience. When I implemented NNUE for Leorik 3.0, I used my large repository of labeled positions, originally accumulated for tuning the HCE, so even the first networks I trained were already quite strong.

But there’s something really fascinating about starting with no means to evaluate a position, no prior knowledge other than the basic rules of chess and a decent tree search algorithm, and then watching the engine improve purely through self-play.

I have the exact same sentiment. I find it really cool to be able to start from a totally blank slate and let the engine "discover" chess knowledge for itself. At the moment I've stopped experimenting with neural networks, but I'm doing something similar for Eques's piece-square tables, where I started with random, but reasonable, values. Through hundreds of thousands of self-play games and gradient descent, I'm building a solid set of piece-square tables. What I found especially interesting was that even after just the first iteration of training on the random self-play games, the piece-square tables were already fairly decent.

I'll be interested in following along with this training process.

op12no2 · Post by **op12no2** » Thu Jan 16, 2025 2:32 pm

lithander wrote: ↑Tue Jan 14, 2025 2:30 pm But there’s something really fascinating about starting with no means to evaluate a position, no prior knowledge other than the basic rules of chess and a decent tree search algorithm, and then watching the engine improve purely through self-play.

Just out of interest, what method did you use to kickstart things? EGTB, random weight init etc...?

lithander · Post by **lithander** » Fri Jan 17, 2025 12:31 am

algerbrex wrote: ↑Wed Jan 15, 2025 5:27 pm Glad to see you back Thomas, and good to hear you're doing well now. I've enjoyed using Leorik. Taking a break is definitely needed sometimes. I've only recently gotten back into chess programming myself, after a two-year break, with the creation of my new engine Eques.

I'll be interested in following along with this training process.

Thanks Christian! Good to see you, too!

I will probably post a full report of the training process when I know it won't plateau below the net I trained for Leorik 3.0 and that might still take a while.

op12no2 wrote: ↑Thu Jan 16, 2025 2:32 pm Just out of interest, what method did you use to kickstart things? EGTB, random weight init etc...?

For my first batch of training data I had Leorik play random move sequences until a shallow search of 5000 nodes identifies a way to mate. Then all positions in that sequence get labeled WDL.
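The labeling step described above can be sketched in a few lines. This is a hypothetical helper, not Leorik's code: the name `label_sequence`, the use of FEN strings, and the choice to store the result from white's point of view are assumptions.

```python
def label_sequence(positions, outcome):
    """Attach one game outcome to every position of a finished sequence.

    positions -- FEN strings in the order they were played
    outcome   -- "white" or "black" (the side that delivers mate) or "draw"
    Returns (fen, wdl) pairs with wdl = 1.0 / 0.5 / 0.0 from white's point of view.
    """
    wdl = {"white": 1.0, "draw": 0.5, "black": 0.0}[outcome]
    return [(fen, wdl) for fen in positions]
```

Every position in the random sequence receives the same label, regardless of how early in the sequence it occurred.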

I didn't have to change any code to do this: Leorik had a UCI option "Temperature" to assign a random offset to all root moves that modifies the eval. With the temperature set to 100 and a small network loaded that is initialized with all zeros, all positions receive the same eval=0, so the chosen move is 100% based on that random bonus, except when the shallow search returns a stalemate or mate score.

Here's the idea of Temperature explained: https://www.chessprogramming.org/Ronald_de_Man
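A minimal sketch of how such a temperature could act on the root moves is shown below. It is not Leorik's implementation: the function name `pick_root_move`, the uniform noise distribution, and the `MATE_THRESHOLD` constant are assumptions, and only mate scores (not stalemates) are special-cased here.

```python
import random

MATE_THRESHOLD = 9000  # assumption: scores at or beyond this magnitude are mate scores

def pick_root_move(scored_moves, temperature, rng=random):
    """Pick a root move after adding a random offset to each score.

    scored_moves -- list of (move, score_cp) pairs from a shallow search
    temperature  -- maximum size of the random offset in centipawns
    """
    def noisy(item):
        move, score = item
        if abs(score) >= MATE_THRESHOLD:
            return score  # leave mate scores untouched so a found mate is always played
        return score + rng.uniform(-temperature, temperature)
    return max(scored_moves, key=noisy)[0]
```

With an all-zero network every non-mate score is 0, so the choice depends only on the random offset. With a temperature of 0 the function simply returns the best-scored move.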

op12no2 · Post by **op12no2** » Fri Jan 17, 2025 8:57 am

lithander wrote: ↑Fri Jan 17, 2025 12:31 am For my first batch of training data I had Leorik play random move sequences until a shallow search of 5000 nodes identifies a way to mate. Then all positions in that sequence get labeled WDL.

I didn't have to change any code to do this: Leorik had a UCI option "Temperature" to assign a random offset to all root moves that modifies the eval. With the temperature set to 100 and a small network loaded that is initialized with all zeros, all positions receive the same eval=0, so the chosen move is 100% based on that random bonus, except when the shallow search returns a stalemate or mate score.

Here's the idea of Temperature explained: https://www.chessprogramming.org/Ronald_de_Man

Ah, interesting, thank you. I'm part way through a random-init-weights process. My existing net was "laundered" via some popular .epd files and it's never really 'felt right'; hard to explain; so like you I am enjoying this process.

lithander · Post by **lithander** » Fri Jan 17, 2025 2:10 pm

It's definitely possible to get off the ground without "external" chess knowledge like EGTB or opening books, which is really cool.

It seems like it's not really important that the positions you train with are always correctly labeled and scored (I completely ignore the score). What is important is that the errors are evenly balanced. The first net trained on my first batch of training data showed really odd behaviour and was unfit to generate viable training data on its own. So I kept the temperature high at +/-100cp. Now it was basically a random mover with a tendency towards what the net says. And *that* did the trick.

So from now on the question was: When is the net good enough that I should set the temperature to zero? And I haven't yet found a clear answer. So far the randomness in the training data still seems to do more good than harm.
