Devlog of Leorik

zenpawn · Post by **zenpawn** » Thu Jun 30, 2022 12:57 pm

lithander wrote: ↑Tue Jun 28, 2022 12:35 am And of course I'd like to claim the title of Leorik being the world's strongest C# chess engine, which (to the best of my knowledge) is currently MadChess 3.0 but reading Erik's blog, I know that the real champion is MadChess 3.1 Beta!

RookieMonster is also written in C#, but I agree MadChess 3.1 will probably jump ahead again.

lithander · Post by **lithander** » Thu Jun 30, 2022 1:18 pm

zenpawn wrote: ↑Thu Jun 30, 2022 12:57 pm RookieMonster is also written in C#, but I agree MadChess 3.1 will probably jump ahead again.

Ah I didn't know that! Thanks for the info! Can I download your engine somewhere to include in my gauntlet?

emadsen · Post by **emadsen** » Fri Jul 01, 2022 1:50 am

lithander wrote: ↑Tue Jun 28, 2022 12:35 am And of course I'd like to claim the title of Leorik being the world's strongest C# chess engine, which (to the best of my knowledge) is currently MadChess 3.0 but reading Erik's blog, I know that the real champion is MadChess 3.1 Beta!

Leorik has been gaining strength quickly. Seems like it will pass MadChess soon. Especially because of my slow rate of progress and less frequent releases.

lithander wrote: ↑Tue Jun 28, 2022 12:35 am What about Knights and Pawns you may ask? Whether I included or skipped the knight features didn't change the MSE of the tuned results much. In other words: The mobility of knights doesn't matter. Or so the tuner told me... (it's my oracle^^)

My tuner found the same. MadChess awards no bonus for knight mobility in the middlegame and only -13 to +9 centipawns in the endgame. Tapered of course.
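
"Tapered" means the middlegame and endgame values are blended according to the game phase. A minimal sketch of such a blend, assuming a hypothetical phase scale from 0 (bare endgame) to 24 (full middlegame material); the names and the scale are illustrative and not taken from MadChess:

```csharp
using System;

public static class TaperedEval
{
    // Hypothetical phase scale: MaxPhase = full middlegame material, 0 = bare endgame.
    public const int MaxPhase = 24;

    // Linearly blends a middlegame and an endgame score by game phase.
    public static int Blend(int middlegame, int endgame, int phase)
    {
        phase = Math.Clamp(phase, 0, MaxPhase);
        return (middlegame * phase + endgame * (MaxPhase - phase)) / MaxPhase;
    }
}
```

With a middlegame value of 0, a knight mobility term like the one above only starts to contribute as material comes off the board.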

lithander wrote: ↑Tue Jun 28, 2022 12:35 am I hope to first complete my work on the eval (for now) by adding something about king-safety... So I will start again with coming up with a plethora of "features" for the tuner and then see which of them have the most significance in predicting the game outcome.

Good luck!

zenpawn · Post by **zenpawn** » Fri Jul 01, 2022 12:03 pm

emadsen wrote: ↑Fri Jul 01, 2022 1:50 am
lithander wrote: ↑Tue Jun 28, 2022 12:35 am And of course I'd like to claim the title of Leorik being the world's strongest C# chess engine, which (to the best of my knowledge) is currently MadChess 3.0 but reading Erik's blog, I know that the real champion is MadChess 3.1 Beta!
Leorik has been gaining strength quickly. Seems like it will pass MadChess soon. Especially because of my slow rate of progress and less frequent releases.

I agree with that too; he'll pass us both, and that'll be that.

zenpawn · Post by **zenpawn** » Sat Jul 02, 2022 1:25 am

lithander wrote: ↑Thu Jun 30, 2022 1:18 pm
zenpawn wrote: ↑Thu Jun 30, 2022 12:57 pm RookieMonster is also written in C#, but I agree MadChess 3.1 will probably jump ahead again.
Ah I didn't know that! Thanks for the info! Can I download your engine somewhere to include in my gauntlet?

Sorry, RM still plays too many exasperating blunders for me to release it. One time (when it was rated around "2500"), it managed to lose an opposite-colored bishops endgame that I was convinced I could have held. Sure enough, I played a blitz game against Stockfish from the start of that OCB position and easily got the draw.

lithander · Post by **lithander** » Tue Jul 05, 2022 12:23 am

lithander wrote: ↑Tue Jun 28, 2022 12:35 am So I will start again with coming up with a plethora of "features" for the tuner and then see which of them have the most significance in predicting the game outcome. Then I'll try to simplify again... it has worked for pawn structure and mobility and so I'm positive it will also work here.

And of course my optimism was misplaced.

I have tried many different little features (things that should offer protection, like pawns in front of the king or on surrounding squares, but also things that provide a threat, like pieces attacking squares near the king...) and some of them lower the MSE quite significantly. In the past a smaller MSE always correlated with a gain in playing strength. Not this time, though. Whenever I try to include something king-safety related in the engine, the new build loses in self-play against the previous version.
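
For context, the MSE here is the mean squared error between the game outcome and the outcome predicted from the evaluation. A minimal sketch, assuming the common Texel-style setup where a sigmoid maps centipawns to an expected score; the scaling constant and all names are assumptions, not Leorik's actual tuner code:

```csharp
using System;

public static class TunerError
{
    // Maps a centipawn evaluation (from White's view) to an expected score in [0, 1].
    // 'k' is a hypothetical scaling constant that would normally be fitted to the data.
    public static double Sigmoid(double evalCp, double k = 1.0)
        => 1.0 / (1.0 + Math.Pow(10, -k * evalCp / 400.0));

    // results[i] is the game outcome from White's view: 1 = win, 0.5 = draw, 0 = loss.
    public static double MeanSquaredError(double[] evalsCp, double[] results)
    {
        double sum = 0;
        for (int i = 0; i < evalsCp.Length; i++)
        {
            double error = results[i] - Sigmoid(evalsCp[i]);
            sum += error * error;
        }
        return sum / evalsCp.Length;
    }
}
```

A lower value only says the evaluation predicts the results of the training positions better; it says nothing about how the engine plays once it can steer toward positions that score well.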

It's almost like the features I come up with, which are indicative of a winning position (lower MSE), can also be created in a way that does not at all mean you're winning. The engine is just happy to create positions that look like they attack the king, but it's not really creating any executable plans.

algerbrex · Post by **algerbrex** » Tue Jul 05, 2022 1:02 am

lithander wrote: ↑Tue Jul 05, 2022 12:23 am
lithander wrote: ↑Tue Jun 28, 2022 12:35 am So I will start again with coming up with a plethora of "features" for the tuner and then see which of them have the most significance in predicting the game outcome. Then I'll try to simplify again... it has worked for pawn structure and mobility and so I'm positive it will also work here.
And of course my optimism was misplaced.

I have tried many different little features (things that should offer protection, like pawns in front of the king or on surrounding squares, but also things that provide a threat, like pieces attacking squares near the king...) and some of them lower the MSE quite significantly. In the past a smaller MSE always correlated with a gain in playing strength. Not this time, though. Whenever I try to include something king-safety related in the engine, the new build loses in self-play against the previous version.

It's almost like the features I come up with, which are indicative of a winning position (lower MSE), can also be created in a way that does not at all mean you're winning. The engine is just happy to create positions that look like they attack the king, but it's not really creating any executable plans.

Correct me if I'm wrong, but it sounds like you're trying to implement king safety in a similar manner to your other evaluation terms, by collecting certain features of the position, and multiplying these by the correct phase and certain weights? If so, I could see that as being an issue because I'm not sure how you could include any notion of non-linearity into the king safety score.

You might remember from our discussion a couple of weeks ago that what I had to do for Blunder was introduce a little extra math besides the dot product to allow the king safety to be non-linear. I used a scaled-down quadratic model and ran the "raw" king safety score through that to get a final king safety evaluation score to add to the normal evaluation (I need to get around to finding the paper too so I can upload it to the repo, thanks for the reminder!).

I'm not aware of any engines that have been able to use a linear king safety score, since a really important idea behind king safety is that building up an attack has to be something that's gradual. A lone queen hovering around a castled king isn't really a big deal, nor are a couple of majors or minors pointing at the king, nor are a couple of the castled-side pawns being pushed forward a huge deal. But combine all of that together and your king is in serious trouble! And from my estimation the best way to capture this idea of a gradual attack is through using a non-linear king safety score, however that might be accomplished.
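
To make that concrete, here is a minimal sketch of running a raw attack score through a scaled-down quadratic. The divisor and the cap are made-up illustrative values, and this is not Blunder's actual code:

```csharp
using System;

public static class KingSafetySketch
{
    // Hypothetical constants; a real engine would tune these.
    public const int Divisor = 4;
    public const int MaxRaw = 100;

    // 'rawAttackPoints' is the linear sum of weighted attack features
    // (for example attackers near the king, pushed shelter pawns).
    public static int Score(int rawAttackPoints)
    {
        int raw = Math.Clamp(rawAttackPoints, 0, MaxRaw);
        // Quadratic: doubling the raw score quadruples the final penalty.
        return raw * raw / Divisor;
    }
}
```

With these numbers a raw score of 4 yields 4 while a raw score of 8 yields 16, so a few isolated attackers barely register but a combined attack quickly dominates.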

So the issue you might be running into right now is that though the features you're adding are correlated with successful king attacks, Leorik doesn't understand how to get to that point because the king safety evaluation is not granular enough due to linearity.

lithander · Post by **lithander** » Tue Jul 05, 2022 12:49 pm

Here's the source code of one of my attempts to add a king-safety related term. Basically it counts how many squares around the king are threatened by enemy pieces. If the same square is attacked by multiple pieces, the threat counter increases each time, so the count can be bigger than the number of squares around the king. Then you modify the evaluation based on that counter, using it to look up in the two arrays how it should affect the evaluation. One adjusts the base score (midgame), the other governs the modification of that base score as the game transitions into endgame.


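        // Tuned values per threat count (index = number of threats on the enemy king).
        // Base is the midgame value; Endgame modifies it as the game transitions into the endgame.
        static short[] KingThreatsBase = new short[20] { -56, -50, -48, -51, -47, -36, -27, 1, 8, 37, 81, 42, 51, 91, 4, 0, 0, 0, 0, 0, };
        static short[] KingThreatsEndgame = new short[20] { 45, 39, 23, 37, 34, 15, 14, -23, -29, -66, -94, -34, 22, 15, 0, 0, 0, 0, 0, 0, };

        public static void Update(BoardState board, ref EvalTerm eval)
        {
            // Note: both lookups assume count < 20; a larger count would index out of range.
            // White: threats against the black king count in White's favor
            int count = Features.CountBlackKingThreats(board);
            eval.Base += KingThreatsBase[count];
            eval.Endgame += KingThreatsEndgame[count];
            // Black: threats against the white king count in Black's favor
            count = Features.CountWhiteKingThreats(board);
            eval.Base -= KingThreatsBase[count];
            eval.Endgame -= KingThreatsEndgame[count];
        }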
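
The counting functions themselves are not shown in this post. A rough sketch of how such a counter could work on bitboards, assuming the attack set of each enemy piece is already available as a ulong mask and that square 0 is a1 and square 63 is h8; KingZone and CountThreats are hypothetical names, not Leorik's actual code:

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class KingThreatSketch
{
    // Bitboard of the (up to 8) squares adjacent to the king.
    public static ulong KingZone(int kingSquare)
    {
        const ulong notFileA = 0xFEFEFEFEFEFEFEFEUL;
        const ulong notFileH = 0x7F7F7F7F7F7F7F7FUL;
        ulong king = 1UL << kingSquare;
        // King plus its left and right neighbours; the masks stop shifts from wrapping around the board edge.
        ulong row = king | ((king << 1) & notFileA) | ((king >> 1) & notFileH);
        // Extend that row one rank up and one rank down, then remove the king's own square.
        return (row | (row << 8) | (row >> 8)) & ~king;
    }

    // Each attacking piece adds one per zone square it attacks,
    // so a square attacked twice is counted twice.
    public static int CountThreats(int kingSquare, IEnumerable<ulong> enemyAttackSets)
    {
        ulong zone = KingZone(kingSquare);
        int count = 0;
        foreach (ulong attacks in enemyAttackSets)
            count += BitOperations.PopCount(attacks & zone);
        return count;
    }
}
```

Because every attacker contributes separately, the total can exceed the number of squares in the zone, which matches the behaviour described above.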

The values in these arrays are tuned by extending the feature vector with components that are each exclusively associated with a specific threat count on a specific king; a position sets the matching component to one. So each position will set two components: one for the black king and one for the white king.
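
A sketch of that encoding, assuming a flat float feature vector in which the 20 threat-count slots start at a hypothetical offset, and assuming both kings share the same slots with opposite signs (as the += and -= in Update suggest). All names are illustrative:

```csharp
using System;

public static class KingThreatFeatures
{
    public const int Slots = 20; // matches the length of the KingThreats arrays

    // Sets the king-threat components of 'features', starting at 'offset' (hypothetical layout).
    public static void Add(float[] features, int offset, int threatsOnBlackKing, int threatsOnWhiteKing)
    {
        // Clamp so unusually high counts land in the last slot instead of indexing out of range.
        int b = Math.Min(threatsOnBlackKing, Slots - 1);
        int w = Math.Min(threatsOnWhiteKing, Slots - 1);
        features[offset + b] += 1f; // threats on the black king count for White
        features[offset + w] -= 1f; // threats on the white king count for Black
    }

    // The tuned evaluation is the dot product of features and weights.
    public static float Dot(float[] features, float[] weights)
    {
        float sum = 0f;
        for (int i = 0; i < features.Length; i++)
            sum += features[i] * weights[i];
        return sum;
    }
}
```

Since each slot has its own weight, the tuner remains a linear model, yet the mapping from threat count to score can take any shape.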

In the middlegame it seems that a threat count of 7 or more on the opposing king starts to be a sign of a winning position, while a lower count contributes to a losing one. Each slot in the array represents a few percent of positions and can be tuned linearly. The result is a lookup table for a function mapping the threat count to a value, and as far as I can see this function is not constrained to be linear.

...but on the other hand it doesn't work. So I'm sure I'm missing something important. Looking forward to your paper!

Mike Sherwin · Post by **Mike Sherwin** » Tue Jul 05, 2022 4:30 pm

There is something very simple that can be tried. For each root move, count the checkmates for and against. Do some math on those numbers. Adjust that root move's score.

algerbrex · Post by **algerbrex** » Tue Jul 05, 2022 4:44 pm

Mike Sherwin wrote: ↑Tue Jul 05, 2022 4:30 pm There is something very simple that can be tried. For each root move, count the checkmates for and against. Do some math on those numbers. Adjust that root move's score.

I remember you mentioning this idea before, and it's something I meant to try in Blunder as well, so thanks for reminding me! I think I remember adding it in before and it seemed promising, but I never got the chance to run a full, rigorous test.
