NNUE + Pawn-King Network

Discussion of chess software programming and technical issues.

Moderator: Ras

alvinypeng
Posts: 36
Joined: Thu Mar 03, 2022 7:29 am
Full name: Alvin Peng

NNUE + Pawn-King Network

Post by alvinypeng »

Most good evaluations take pawn-king structure into account. Since pawn-king structure isn't very dynamic (it only changes on pawn moves, king moves, and captures), some clever people figured out you could cache the pawn-king evaluation terms in a hash table, essentially allowing for a better evaluation at virtually no additional computational cost. With classical evaluation falling out of favor and NNUE taking over, pawn-king hash tables aren't used in the top engines anymore (with the exception of the Stockfish hybrid eval).
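For readers unfamiliar with the trick, here is a minimal sketch of such a pawn-king hash table. All names are hypothetical; the key is a Zobrist-style hash over pawn and king squares only, so it stays valid across most moves and the expensive term is computed at most once per structure.

```python
# Minimal sketch of a pawn-king hash table (all names hypothetical).
# The key depends only on pawn and king placement, so the cached
# term can be reused for free whenever the structure is unchanged.

import random

random.seed(42)
# One 64-bit Zobrist random per (piece code, square);
# piece codes: 0 = white pawn, 1 = black pawn, 2 = white king, 3 = black king.
ZOBRIST = [[random.getrandbits(64) for _ in range(64)] for _ in range(4)]

def pawn_king_key(pawns_w, pawns_b, king_w, king_b):
    """XOR together the Zobrist randoms of every pawn and king square."""
    key = 0
    for sq in pawns_w:
        key ^= ZOBRIST[0][sq]
    for sq in pawns_b:
        key ^= ZOBRIST[1][sq]
    key ^= ZOBRIST[2][king_w]
    key ^= ZOBRIST[3][king_b]
    return key

pk_table = {}  # key -> cached structural evaluation term

def pawn_king_eval(pawns_w, pawns_b, king_w, king_b, slow_eval):
    key = pawn_king_key(pawns_w, pawns_b, king_w, king_b)
    if key not in pk_table:                   # miss: pay the full cost once
        pk_table[key] = slow_eval(pawns_w, pawns_b, king_w, king_b)
    return pk_table[key]                      # hit: virtually free
```

A real engine would use a fixed-size array indexed by the low bits of the key rather than a dict, but the caching idea is the same.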

While a good NNUE network would most certainly "understand" pawn-king structure, I wonder if it would be possible to combine an NNUE with a pawn-king-specific net. To combine both networks into a single scalar evaluation, just concatenate the output of the pawn-king network with the output of one of the NNUE's hidden layers, then use the joint vector as the input to the next layer. A pawn-king table can be reintroduced to cache the output of the pawn-king network.

I made a diagram of how such a network could potentially be structured. The regular NNUE input features are on the left and are incrementally updated as normal. In the diagram, I have the features as HalfKA (11x64x64 = 45056), but it could really be any feature set. The pawn-king network is on the right. Its input is four 8x8 planes: the pawns and the kings of the side to move and of the side not to move. The diagram only shows one convolutional layer with 32 filters, but ideally there could be many more layers and filters. Because the output of the pawn-king net is cached, I would imagine it could afford to be somewhat expensive.
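To make the input layer concrete, here is a sketch of that first convolutional layer: four binary 8x8 planes (STM pawns, non-STM pawns, STM king, non-STM king) passed through a zero-padded 3x3 convolution with 32 filters. The weights here are random placeholders; a real net would of course be trained.

```python
# Sketch of the pawn-king subnet's first layer: 4 binary 8x8 planes
# through a zero-padded 3x3 convolution with 32 filters, giving a
# 32x8x8 output. Weights are random placeholders, purely illustrative.

import random
random.seed(0)

IN_PLANES, FILTERS, K = 4, 32, 3

# filters[f][c] is a 3x3 kernel applied to input plane c
filters = [[[[random.uniform(-0.1, 0.1) for _ in range(K)]
             for _ in range(K)] for _ in range(IN_PLANES)]
           for _ in range(FILTERS)]

def conv2d(planes, filters):
    """Zero-padded 3x3 convolution: 4x8x8 in -> 32x8x8 out."""
    out = []
    for f in filters:
        fmap = [[0.0] * 8 for _ in range(8)]
        for c, plane in enumerate(planes):
            for r in range(8):
                for col in range(8):
                    acc = 0.0
                    for dr in range(K):
                        for dc in range(K):
                            rr, cc = r + dr - 1, col + dc - 1
                            if 0 <= rr < 8 and 0 <= cc < 8:
                                acc += f[c][dr][dc] * plane[rr][cc]
                    fmap[r][col] += acc
        out.append(fmap)
    return out
```

Even naively, this is only 4x32x9 multiply-adds per output square, and since the result is hidden behind the pawn-king cache, the per-node cost in search is close to zero.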

Any thoughts on this concept? Does it hold any merit or is it completely impractical?

[Image: diagram of the proposed NNUE + pawn-king network structure]
Sopel
Posts: 391
Joined: Tue Oct 08, 2019 11:39 pm
Full name: Tomasz Sobczyk

Re: NNUE + Pawn-King Network

Post by Sopel »

For Stockfish, all typical implementations of additional pawn-king features have failed (like those done in Ethereal, and many variations). Your idea of using a shallow convnet is perhaps the only thing that might work. I was thinking about something similar: a 3-4 layer convolutional subnet attached to the feature transformer (FT) output. So something like this:

[Image: diagram of a convolutional subnet attached to the FT output]

It is on my radar to test.
dangi12012 wrote: No one wants to touch anything you have posted. That proves you now have negative reputations since everyone knows already you are a forum troll.

Maybe you copied your stockfish commits from someone else too?
I will look into that.
DrCliche
Posts: 65
Joined: Sun Aug 19, 2018 10:57 pm
Full name: Nickolas Reynolds

Re: NNUE + Pawn-King Network

Post by DrCliche »

Being able to cache part of the evaluation function obviously seems good in the abstract. In terms of general theory, you're simply imposing a form of inductive bias, which can be effective when a network is small enough to become saturated, or if the loss surface of your training procedure isn't terribly well-behaved. There's probably good reason to believe both are true in a typical chess NNUE.

On the other hand, whether any particular feature of a chess position is very good or very bad can change with the smallest possible perturbation of a single piece, or the loss or gain of a single tempo, so it's also quite plausible that networks evaluating subsets of pieces in isolation (e.g. a pawn-king network) miss the forest for the trees and aren't worth the trouble.