I believe that at the time I first added them (which was also the first time I tried automatic parameter tuning) the gain was around 50 Elo, but I am not 100% sure about that.
In C-like syntax it could look like this (again, excluding phases): pst[2][64][7][64]
The [2] is not exactly the color of the king; rather, there is one sub-table for the enemy king and one for our own king. The idea behind this is that one table encodes how to position your pieces to attack the enemy king, and the other how to defend your own king. The first [64] is then the square of that king.
The [7] covers the piece types pawn, knight, bishop, rook, queen, king, and passed pawn, and the second [64] is the square of that piece.
algerbrex wrote: ↑Fri Jul 15, 2022 3:03 am
Out of curiosity, do you also Texel-tune king safety as well? What model did you end up using that was relatively easily derivable? I ended up settling on a quadratic model for a first attempt, since the derivative is simple of course, and it did surprisingly well. Next up is to find an exponential model to better capture the idea of king safety.

I have three eval features for king safety (besides the king-contextual PSTs):
- bonus for bishop, queen, rook checking the enemy king
- bonus for bishop, queen, rook attacking the immediate area around enemy king (3x3)
- penalty for each square from which a queen would check the king when only considering our own pawn structure
- number of enemy pieces near our king (5x5 area); this turned out to be a penalty for more pieces during the endgame but, unintuitively, a bonus during the opening
For the last two, I simply have a table indexed by the count: e.g. when there are 3 pieces near the king, I look up index 3 in that feature's table. This way I can also compute a simple linear gradient for the respective table entry.
I am actually not 100% sure whether the 2nd and the 4th points do anything good for my evaluation. I think the last time I tested them was together with a bunch of other new eval features, and the whole batch worked, so I just went with it.
I don't have any non-linear king safety features.
algerbrex wrote: ↑Fri Jul 15, 2022 3:03 am
Interestingly, I found with my gradient descent tuner I got the best results switching to AdaGrad and then running several thousand iterations. The current evaluation parameters in Blunder were tuned from scratch using 50K iterations and 1M positions from my extended dataset. That took about four hours, much better than the eleven it took to tune the original evaluation parameters in Blunder 7.6.0, on only 400K positions.

Do you calculate the gradient over all positions each iteration? I do that, and I only run 200-400 iterations (though it also takes 3-6 hours with (inefficiently used) 30 threads).