Questions on designing a neural net evaluation function

jmcd
Posts: 58
Joined: Wed Mar 18, 2020 10:00 pm
Full name: Jonathan McDermid

Questions on designing a neural net evaluation function

Post by jmcd »

I haven't done any work on my C++ chess engine in a couple of years, but I was recently compelled to return to it after working with neural nets. I thought using one to replace the rudimentary evaluation function I had previously written would probably be the easiest and most effective way to add a neural net to a chess engine. After doing some reading, I've learned that this is certainly not a unique idea and many people are doing it. I've got a lot of questions and I don't expect them all to be answered, but I'd just like to know whether I'm on the right track or whether some of what I'm planning is a fool's errand.

1. My initial idea was to use the important information from a FEN string as the inputs for the board: 64*12 inputs for each piece type and square, 4 inputs for castling rights, 1 input for the side to move, and 8 for the en passant file flags (a rough sketch of this encoding appears after the questions below). After doing some tests and training a net on these inputs, the engine still appears dumb as a rock, and I have a feeling it would learn better with more inputs. I thought that inputs for threatened squares might help, but from what I've seen that's not really a common solution. Is it a bad idea to use a small number of inputs like this, and should I instead go for a large number of inputs like the King x Piece x Square scheme that NNUE uses?

2. Regarding NNUE, I am a bit puzzled about how its inputs work. It uses 64 x (64 x 10 + 1) inputs per side, with the extra 1 being for "shogi piece drop" (an index sketch for this scheme also appears after the questions). What is a shogi piece drop? Is it integral to the evaluation function, or would it still be effective without it?

3. Regarding NNUE, how do the inputs change depending on whose turn it is to move? The chessprogramming page shows one set of inputs for black and one for white, and as I understand it the inputs are adjusted as moves are made and unmade, so that only a small number of neurons change each time a new position is reached. What I don't understand is how this is adjusted for the change of turn. If there are 82,048 inputs in total in an array, and the first half are for white and the second half are for black, how are those inputs adjusted for the side to move? My guess is an XOR with half the total number of inputs, but I'd like to have this verified.

4. Are there public databases of training data? I've already written my own programs to generate positions and evaluations and save them, but I'm assuming this is a problem that has already been solved and I could probably skip this step.

5. Is it reasonable to think that I could design a neural net evaluation function that is stronger than a standard one that uses bitboards, simple bonuses and static piece values?
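For illustration, here is a minimal C++ sketch of the small encoding described in question 1 (12 x 64 piece planes + 4 castling rights + 1 side-to-move flag + 8 en passant files = 781 inputs). The Board interface is hypothetical and would need to be adapted to the engine's own board representation.

[code]
// Sketch of the 781-input encoding from question 1.
// Board is a hypothetical interface; adapt it to your own representation.
#include <array>

constexpr int NUM_INPUTS = 12 * 64 + 4 + 1 + 8;         // = 781

struct Board {
    int  piece_on(int sq) const;      // 0..11 = piece type * 2 + colour, -1 if empty
    bool can_castle(int right) const; // right = 0..3 (WK, WQ, BK, BQ)
    bool white_to_move() const;
    int  ep_file() const;             // 0..7, or -1 if no en passant square
};

std::array<float, NUM_INPUTS> encode(const Board& b) {
    std::array<float, NUM_INPUTS> x{};                  // zero-initialised
    for (int sq = 0; sq < 64; ++sq)
        if (int pc = b.piece_on(sq); pc >= 0)
            x[pc * 64 + sq] = 1.0f;                     // one-hot piece planes
    for (int r = 0; r < 4; ++r)
        x[768 + r] = b.can_castle(r) ? 1.0f : 0.0f;     // castling rights
    x[772] = b.white_to_move() ? 1.0f : 0.0f;           // side to move
    if (int f = b.ep_file(); f >= 0)
        x[773 + f] = 1.0f;                              // en passant file
    return x;
}
[/code]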
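And a hedged sketch of how the 64 x (64 x 10 + 1) HalfKP feature index from question 2 can be computed. This is illustrative only, not Stockfish's actual code; with both perspectives it gives 2 x 41,024 = 82,048 inputs, the total referred to in question 3.

[code]
// Illustrative HalfKP index computation (not Stockfish's actual identifiers).
// Per perspective: 64 * (64 * 10 + 1) = 41,024 features. For every possible
// own-king square there is one feature per (non-king piece, square) pair,
// plus one unused "piece drop" slot inherited from shogi.
constexpr int PIECE_SQUARES = 64 * 10 + 1;   // 641; the +1 is the drop slot

// king_sq  : 0..63, our king square from this perspective
// piece_idx: 0..9, the ten non-king piece/colour combinations
// piece_sq : 0..63, the square that piece stands on
int halfkp_index(int king_sq, int piece_idx, int piece_sq) {
    // the first slot of each 641-wide block is reserved for the drop feature
    return king_sq * PIECE_SQUARES + 1 + piece_idx * 64 + piece_sq;
}
[/code]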

Thanks for reading
Clovis GitHub
yeni_sekme
Posts: 40
Joined: Mon Mar 01, 2021 7:51 pm
Location: İstanbul, Turkey
Full name: Ömer Faruk Tutkun

Re: Questions on designing a neural net evaluation function

Post by yeni_sekme »

This is a great source for NNUE:
https://github.com/glinscott/nnue-pytor ... cumulators

1- EP and castling rights shouldn't matter much; they will mostly just slow down the calculation. Instead of giving the side to move directly as one input, always looking at the position from the side to move's perspective should be much better. This means you need to mirror the board and swap the colours when black is to move (see the sketch at the end of this post).
2- That input is a leftover from the shogi origins of NNUE ("piece drop" refers to the shogi rule that lets captured pieces be placed back on the board); it has been removed from current chess NNUEs, which work fine without it.
3-
4- https://drive.google.com/drive/u/0/fold ... QIrgKJsFpl

There is a convert command in Stockfish to convert this .binpack to .plain (a human-readable format).

5- I trained a network that is small compared to Stockfish's, using 500 million positions from lcfish data. Although my engine is badly written and lacks many features, it performs at ~2500 Elo thanks to NNUE. If your classical evaluation is that simple, expecting 400-500 Elo from NNUE is reasonable.
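A minimal sketch of the mirroring mentioned in answer 1, assuming a hypothetical Board interface with the piece colour stored in the low bit: squares are flipped vertically and colours swapped when black is to move, so the network always sees the position from the side to move's point of view.

[code]
#include <array>

struct Board {                                     // hypothetical interface
    int  piece_on(int sq) const;                   // 0..11 = type * 2 + colour, -1 if empty
    bool white_to_move() const;
};

inline int mirror_sq(int sq)   { return sq ^ 56; } // flip rank, keep file (a1 <-> a8)
inline int flip_colour(int pc) { return pc ^ 1; }  // swap colour stored in bit 0

// Encode the 12*64 piece planes from the side to move's point of view.
std::array<float, 12 * 64> encode_stm_relative(const Board& b) {
    std::array<float, 12 * 64> x{};
    const bool black_stm = !b.white_to_move();
    for (int sq = 0; sq < 64; ++sq) {
        int pc = b.piece_on(sq);
        if (pc < 0) continue;                      // empty square
        int s = black_stm ? mirror_sq(sq) : sq;    // mirror the board vertically
        int p = black_stm ? flip_colour(pc) : pc;  // swap piece colours
        x[p * 64 + s] = 1.0f;
    }
    return x;
}
[/code]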