The last update to Leorik's devlog was almost a month ago, when I announced the release of version 2.1. Since then it has been tested by the CCRL and is listed at 2583 Elo (+27) in the Blitz and 2606 Elo (+63) in the 40/15 rating list.
Leorik 2.1 has also played in Division 7 of Graham's 94th Amateur Series and finished in 4th place, half a point behind Blunder 7.6.
Before the next tournament starts I hope to have a version ready that can compete for the top spot. But I also know that the competition doesn't sleep. I'm especially scared of facing Odonata 0.6, and version 8.0 of Blunder will probably be ready by then as well. And of course I'd like to claim the title of the world's strongest C# chess engine for Leorik. To the best of my knowledge that title is currently held by MadChess 3.0, but from reading Erik's blog I know that the real champion is MadChess 3.1 Beta!
So, long story short: It's the friendly competition that's providing me motivation to improve Leorik further and if I were to release version 2.2 now it would contain three changes:
- I disabled null-move pruning when the side to move has only pawns and the king left on the board.
- I added a new mobility term to the evaluation (in addition to material and pawn structure).
- I improved the replacement scheme of my transposition table, which should help at longer time controls like those used in tournaments or analysis.
Because my changes to the TT are already documented in this thread, the only feature worth discussing in more detail is the mobility evaluation.
I started by adding a new type of feature for my tuner to consider alongside material: each specific piece/move-count combination gets its own dedicated feature in the vector. For example, a queen that can move to 23 squares is a different feature than a queen that can move to 24 squares.
Each piece type has between 0 and X legal moves. I looked into my set of FENs to count what X is for each piece type and learned that I need to add exactly 88 boolean features in total. Of course it's not really 'boolean', because you can have two pieces of the same type with the same move count on the board.
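A minimal sketch of how such a feature vector could be laid out. This is not Leorik's actual code: the class, the `MaxMoves` bounds and the +1/-1 sign convention are all assumptions made for illustration. The bounds used here are the theoretical maxima per piece type, which happen to add up to 88 slots; whether that is the same breakdown the FEN set produced is a guess.

```csharp
using System;

// Hypothetical layout: one feature slot per (piece type, move count) pair,
// packed into a single flat vector.
public static class MobilityFeatures
{
    public enum Piece { Pawn, Knight, Bishop, Rook, Queen, King }

    // Assumed upper bound on the move count per piece type (the "X" above).
    // A pawn can reach 12 because each promotion counts as four moves.
    static readonly int[] MaxMoves = { 12, 8, 13, 14, 27, 8 };

    // Offsets[p] is the index of the "0 moves" slot of piece type p.
    static readonly int[] Offsets = BuildOffsets();

    public static int FeatureCount => Offsets[MaxMoves.Length];

    static int[] BuildOffsets()
    {
        var offsets = new int[MaxMoves.Length + 1];
        for (int i = 0; i < MaxMoves.Length; i++)
            offsets[i + 1] = offsets[i] + MaxMoves[i] + 1; // slots 0..MaxMoves[i]
        return offsets;
    }

    public static int Index(Piece piece, int moves)
    {
        if (moves < 0 || moves > MaxMoves[(int)piece])
            throw new ArgumentOutOfRangeException(nameof(moves));
        return Offsets[(int)piece] + moves;
    }

    // Adds one piece to the feature vector. sign is assumed to be +1 for
    // White and -1 for Black. Two pieces with the same move count stack up,
    // which is why the feature is a count rather than a true boolean.
    public static void Add(float[] features, Piece piece, int moves, int sign)
    {
        features[Index(piece, moves)] += sign;
    }
}
```

With this layout a queen with 23 moves and a queen with 24 moves land in neighbouring slots, and two white bishops with 5 moves each produce a value of 2 in the same slot.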
Running that through the tuner I could look at the 88 newly tuned weights and see something like this:
Code: Select all
Moves      0    1    2    3    4    5    6    7    8    9   10   11   12   13
=============================================================================
Bishop = -27, -20, -12,  -8,  -2,   2,   5,   6,   8,   9,  12,   5,  12,  -8
What that means is that for the final evaluation each bishop's material value (which depends on square and phase) should be modified by a centipawn value from this table based on how mobile it is. If it can't move at all it loses 27 cp of its worth. With 5 or more moves it gains a small bonus. If I use these modifiers verbatim I gain some good Elo already.
In the next step I did some curve-fitting to get rid of the noise from my too small dataset. I tried to fit the table to a segment on an arc defined by this formula:
Code: Select all
int Arc(int x, int width, int height)
{
    // Looks like an inverted parabola with Arc(0) = 0, Arc(width) = 0 and Arc(width / 2) = height.
    // Integer arithmetic, so the result is truncated.
    return height * 4 * (width * x - x * x) / (width * width);
}
And while I really like this formula, in the end it was too complicated. I got pretty good results already by fitting a linear function to the tuned data, especially when I capped it for high move counts.
Code: Select all
static int BishopValue(int moves) => Min(7, moves) * 5;
static int RookValue(int moves) => Min(11, moves) * 5;
static int QueenValue(int moves) => Min(25, moves) * 3;
static int KingValue(int moves) => moves * -5;
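To see how the capped linear function relates to the tuned weights, here is a small sketch that compares the two for the bishop, using the table from earlier in the post. The class and the `Residual` helper are made up for this illustration. Note that the linear version starts at 0 instead of -27; my reading (an assumption, not something measured here) is that a constant offset like that simply gets absorbed by the material values on the next retune, so only the shape of the curve matters.

```csharp
using System;

public static class BishopFit
{
    // Tuned bishop weights for 0..13 moves, copied from the table above.
    public static readonly int[] Tuned = { -27, -20, -12, -8, -2, 2, 5, 6, 8, 9, 12, 5, 12, -8 };

    // The capped linear replacement from the post.
    public static int BishopValue(int moves) => Math.Min(7, moves) * 5;

    // Difference between the tuned weight and the linear fit for one move count.
    public static int Residual(int moves) => Tuned[moves] - BishopValue(moves);

    // Prints one line per move count so the differences can be eyeballed.
    public static void PrintReport()
    {
        for (int moves = 0; moves < Tuned.Length; moves++)
            Console.WriteLine($"{moves,2} moves: tuned {Tuned[moves],4}, linear {BishopValue(moves),3}, diff {Residual(moves),4}");
    }
}
```

If the differences printed by `PrintReport()` are close to a constant, the linear function captures the shape of the tuned data and the remaining wiggles are plausibly noise.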
What about knights and pawns, you may ask? Whether I included or skipped the knight features didn't change the MSE of the tuned results much. In other words: the mobility of knights doesn't matter. Or so the tuner told me... (it's my oracle^^)
And pawns get a special treatment because their move table looked like this:
Code: Select all
Pawn: -10, 0, -1, 0, 74, 0, 0, 0, 0, 0, 0, 0, 0,
So the number of moves a pawn has doesn't matter much, except that pawns with 0 moves are pretty bad (they are stuck) and pawns with 4 moves deserve a huge bonus. 4 moves? Well, that's a pawn on the 7th rank with an empty target square ahead of it: the push counts once per promotion piece. So no curve fitting required. I just subtract 10 from the eval for each stuck pawn and add 74 for each pawn ready to promote. (That's already a valuable pawn in the PSQTs, but the square in front of it being empty makes it extra valuable.)
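Expressed in the same style as the other piece functions, the pawn rule could be sketched like this. Keying it on the move count mirrors the tuner table; the engine itself may well detect stuck and about-to-promote pawns differently, so treat the function and its constant names as hypothetical.

```csharp
public static class PawnMobility
{
    const int StuckPenalty = -10;   // pawn with no legal move
    const int PromotionBonus = 74;  // pawn with exactly 4 moves (free push to the 8th rank)

    // All other move counts are treated as noise and score 0.
    public static int PawnValue(int moves) => moves switch
    {
        0 => StuckPenalty,
        4 => PromotionBonus,
        _ => 0
    };
}
```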
With a fast tuner you can just modify a value here and there, retune and see what the MSE does... It's like manual Texel tuning! Or gambling...? In any case: it's quite fun!

So really no sophisticated tech here. I just use the tuner to guide me to coefficients that minimize the MSE after tuning and this seems to correlate well with the actual playing strength.
MSE(material) = 0.2460
MSE(material + pawns) = 0.2417
MSE(material + pawns + mobility) = 0.2364
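For readers who haven't seen this kind of tuning before, here is a rough sketch of how an MSE like the ones above is commonly computed in Texel-style tuning: the static eval is squashed into an expected score and compared with the game result. The exact error definition and scaling constant used by Leorik's tuner aren't given in this post, so the `scale` parameter and the 1 / 0.5 / 0 result encoding are assumptions.

```csharp
using System;

public static class TunerError
{
    // Maps a centipawn score to an expected result in [0, 1].
    // The scaling constant is an assumption; tuners usually fit it to the data.
    public static double Sigmoid(double centipawns, double scale = 400.0)
        => 1.0 / (1.0 + Math.Pow(10.0, -centipawns / scale));

    // evals[i]:   static evaluation from White's point of view, in centipawns.
    // results[i]: game outcome for White: 1 = win, 0.5 = draw, 0 = loss (assumed encoding).
    public static double MeanSquareError(double[] evals, double[] results, double scale = 400.0)
    {
        if (evals.Length != results.Length || evals.Length == 0)
            throw new ArgumentException("Need two non-empty arrays of equal length.");

        double sum = 0;
        for (int i = 0; i < evals.Length; i++)
        {
            double error = results[i] - Sigmoid(evals[i], scale);
            sum += error * error;
        }
        return sum / evals.Length;
    }
}
```

Adding a feature group and retuning should push this number down if the new features carry real information about the game outcome.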
...I haven't really bothered yet with measuring how much better my dev version is compared to 2.1, because I hope to first complete my work on the eval (for now) by adding something about king safety. I don't know much more about it than that many engines have it in their eval. So I will start again by coming up with a plethora of "features" for the tuner and then see which of them are most significant in predicting the game outcome. Then I'll try to simplify again... it has worked for pawn structure and mobility, so I'm positive it will also work here.