How do NNUEs self train?


eboatwright
Posts: 41
Joined: Tue Jan 09, 2024 8:38 pm
Full name: E Boatwright

How do NNUEs self train?

Post by eboatwright »

Hello,

I've been trying to learn as much as I can about NNUE before starting to implement it in my engine.
I know that some people train the neural network to "replicate" evaluations from the HCE (or other NNUEs), but I don't understand how an engine can "self-train".

I wouldn't think you can have the randomly initialized network create its own pairs of (position, evaluation) training data, because it doesn't know anything yet, and just feeding the game's result into the loss function seems unlikely to work. So how does it work?

~ Thanks in advance
Creator of Maxwell
syzygy
Posts: 5569
Joined: Tue Feb 28, 2012 11:56 pm

Re: How do NNUEs self train?

Post by syzygy »

eboatwright wrote: Wed Feb 14, 2024 3:15 am I've been trying to learn as much as I can about NNUE before starting to implement it in my engine.
I know that some people train the neural network to "replicate" evaluations from the HCE (or other NNUEs), but I don't understand how an engine can "self-train".

I wouldn't think you can have the randomly initialized network create its own pairs of (position, evaluation) training data, because it doesn't know anything yet, and just feeding the game's result into the loss function seems unlikely to work. So how does it work?
It will learn from the outcome of the games. I guess it takes a while before a randomly initialized network starts to get a clue, though.
hgm
Posts: 27855
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: How do NNUEs self train?

Post by hgm »

Even random movers have a better chance of accidentally checkmating the opponent when they have more and stronger material. So if you reward wins by upping the evaluation of the winning side's positions in that game, those positions will on average contain a material advantage, and the network will learn to value it. This already makes it play better, so the material advantage changes hands less rapidly and starts to correlate even better with the game result.
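Concretely, that reward step can be as simple as labeling every position of a finished game with its result. A minimal sketch (the helper name and data layout here are hypothetical, just for illustration; 1.0 = white win, 0.5 = draw, 0.0 = black win):

Code: Select all

# Turn one finished self-play game into (position, target) training pairs.
# `positions` is a list of (features, side_to_move) the game passed through;
# `result` is 1.0 for a white win, 0.5 for a draw, 0.0 for a black win.
def label_game(positions, result):
    pairs = []
    for features, side_to_move in positions:
        # Store the target from the side to move's perspective, so the
        # network always predicts "how good is this for me".
        target = result if side_to_move == 'white' else 1.0 - result
        pairs.append((features, target))
    return pairs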
eboatwright
Posts: 41
Joined: Tue Jan 09, 2024 8:38 pm
Full name: E Boatwright

Re: How do NNUEs self train?

Post by eboatwright »

Ohh, so you do just apply the result of the game, that's interesting! Should any extra data be saved between games, or just the tuned weights?
Creator of Maxwell
lithander
Posts: 881
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: How do NNUEs self train?

Post by lithander »

Using a random network as your evaluation function means you basically have no idea what a good position is. But the search still knows the rules of chess, so only legal moves can be chosen. So after a sequence of basically random moves you stumble on positions that have no legal moves. They are won for White, won for Black, or drawn. That's all you need to bootstrap the process.

0.) Start with a random network as your evaluation function
1.) Play thousands of selfplay matches and record them
2.) Create a list of labeled positions; each move in a match creates a position and the outcome of the game is the label
3.) Use millions of (position, label) data pairs to train a network that predicts the outcome (label) based on a position
4.) The resulting network should now do a little better as your evaluation function
5.) Go back to Step 1
eboatwright wrote: Wed Feb 14, 2024 10:36 pm Ohh, so you do just apply the result of the game, that's interesting! Should any extra data be saved between games, or just the tuned weights?
What you save is a bunch of PGNs or whatever format you choose to record millions of matches. Then filter that data and create millions of labeled positions in the format that your trainer expects. It's all pretty compute-heavy, but nothing a modern PC with ~100 GB of free disk space can't handle.
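As a skeleton, the whole loop might look like this (every function here is a placeholder for whatever your engine, data format, and trainer actually provide, so this is a sketch of the structure, not runnable as-is):

Code: Select all

# Skeleton of the self-play bootstrap loop described above.
net = random_network()                         # step 0
for iteration in range(NUM_ITERATIONS):
    games = play_selfplay_games(net, n=10_000) # step 1: play and record
    data = []
    for game in games:                         # step 2: label positions
        for features in game.positions:
            data.append((features, game.result))
    net = train_network(data)                  # steps 3 and 4: fit a new net
    # step 5: loop around and generate data with the (hopefully) stronger net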
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
eboatwright
Posts: 41
Joined: Tue Jan 09, 2024 8:38 pm
Full name: E Boatwright

Re: How do NNUEs self train?

Post by eboatwright »

lithander wrote: Thu Feb 15, 2024 1:57 pm Using a random network as your evaluation function means you basically have no idea what a good position is. But the search still knows the rules of chess, so only legal moves can be chosen. So after a sequence of basically random moves you stumble on positions that have no legal moves. They are won for White, won for Black, or drawn. That's all you need to bootstrap the process.

0.) Start with a random network as your evaluation function
1.) Play thousands of selfplay matches and record them
2.) Create a list of labeled positions; each move in a match creates a position and the outcome of the game is the label
3.) Use millions of (position, label) data pairs to train a network that predicts the outcome (label) based on a position
4.) The resulting network should now do a little better as your evaluation function
5.) Go back to Step 1
eboatwright wrote: Wed Feb 14, 2024 10:36 pm Ohh, so you do just apply the result of the game, that's interesting! Should any extra data be saved between games, or just the tuned weights?
What you save is a bunch of PGNs or whatever format you choose to record millions of matches. Then filter that data and create millions of labeled positions in the format that your trainer expects. It's all pretty compute-heavy, but nothing a modern PC with ~100 GB of free disk space can't handle.
Thank you so much!! That's awesome. I've been working on an implementation for a few days now: I've got a decently fast network that takes 768 (piece, square) inputs -> 256 -> 1, and basic gradient descent implemented.

But with my naive implementation (recalculating the delta for every weight individually) I worked out it would take over a month just to train one epoch of 5 million positions! :? (I got ~19 million (FEN, eval) pairs from a Lichess database for starting out.)
So I'm currently stuck on trying to implement back-propagation.
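For what it's worth, the whole point of back-propagation is to avoid touching every weight individually: one forward pass saves the intermediate values, and one backward pass reuses them to get every gradient at once via the chain rule. A minimal sketch for a 768 -> 256 -> 1 net in plain numpy (ReLU hidden layer, sigmoid output, squared-error loss; this is an illustration of the technique, not Maxwell's actual code, and real trainers add batching, clipped ReLU, and an optimizer like Adam):

Code: Select all

import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.01, (256, 768))    # input -> hidden weights
b1 = np.zeros(256)
w2 = rng.normal(0, 0.01, 256)           # hidden -> output weights
b2 = 0.0

def train_step(x, target, lr=0.01):
    global W1, b1, w2, b2
    # Forward pass, keeping intermediates for the backward pass.
    z1 = W1 @ x + b1
    h = np.maximum(z1, 0.0)             # ReLU
    z2 = w2 @ h + b2
    out = 1.0 / (1.0 + np.exp(-z2))     # sigmoid, so out is in (0, 1)
    loss = (out - target) ** 2

    # Backward pass: chain rule, layer by layer. A couple of matrix
    # products instead of re-evaluating the net once per weight.
    dz2 = 2.0 * (out - target) * out * (1.0 - out)  # dL/dz2
    dw2 = dz2 * h                       # dL/dw2
    dh = dz2 * w2                       # dL/dh
    dz1 = dh * (z1 > 0.0)               # ReLU derivative is 0 or 1
    dW1 = np.outer(dz1, x)              # dL/dW1

    # Plain gradient-descent update.
    W1 -= lr * dW1
    b1 -= lr * dz1
    w2 -= lr * dw2
    b2 -= lr * dz2
    return loss

# Usage: x is the 768-dim 0/1 (piece, square) vector, target in [0, 1].
x = np.zeros(768); x[0] = 1.0
print(train_step(x, 0.5))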
Creator of Maxwell
jdart
Posts: 4368
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: How do NNUEs self train?

Post by jdart »

This whole process is very computation-intensive. I have something like 100 cores over several machines to do the training data generation.

For model training I am using a fork of the Stockfish Python trainer, which runs on the GPU. I have an RTX 3080 for that; it is reasonably fast.
eboatwright
Posts: 41
Joined: Tue Jan 09, 2024 8:38 pm
Full name: E Boatwright

Re: How do NNUEs self train?

Post by eboatwright »

jdart wrote: Thu Feb 15, 2024 4:57 pm This whole process is very computation-intensive. I have something like 100 cores over several machines to do the training data generation.

For model training I am using a fork of the Stockfish Python trainer, which runs on the GPU. I have an RTX 3080 for that; it is reasonably fast.
Yeah, my computational power is definitely lacking, but I'm doing this to learn, so I'd like to write all the training code myself.
My lack of back-propagation is definitely a huge problem. I'm still scratching my head over how to get it implemented, but we'll see hahaha
Creator of Maxwell
eboatwright
Posts: 41
Joined: Tue Jan 09, 2024 8:38 pm
Full name: E Boatwright

Re: How do NNUEs self train?

Post by eboatwright »

Alright, so I've pushed what I have so far onto my dev branch:
https://github.com/eboatwright/Maxwell/ ... rc/nnue.rs

I'm gonna start by training a network on this data from Lichess: https://database.lichess.org/#evals
Although eventually I do want to fully self-train the final network :D
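One wrinkle when training on engine evaluations instead of game results: centipawn scores are usually squashed through a sigmoid into [0, 1] so they live on the same scale as a win/draw/loss label. A small sketch (the scaling constant K is a tunable assumption; around 400 is just a common starting point, not a standard):

Code: Select all

import math

# Squash a centipawn evaluation into [0, 1] so it is comparable to a
# game-result label (1 = win, 0.5 = draw, 0 = loss). K is tunable.
K = 400.0

def cp_to_target(cp):
    return 1.0 / (1.0 + math.exp(-cp / K))

print(cp_to_target(0))      # 0.5: equal position
print(cp_to_target(400))    # ~0.73: clearly better for the side to move
print(cp_to_target(-1200))  # ~0.05: nearly lost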
Creator of Maxwell
eboatwright
Posts: 41
Joined: Tue Jan 09, 2024 8:38 pm
Full name: E Boatwright

Re: How do NNUEs self train?

Post by eboatwright »

Although after thinking it through some more, I think I might just end up learning PyTorch to train the network; all this derivative calculation and back-propagation is going waaaayyy over my head :mrgreen:
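For reference, the same 768 -> 256 -> 1 model is only a few lines in PyTorch, with autograd doing all the derivative bookkeeping. A hedged sketch (layer sizes match the net described above; everything else is a plain-vanilla default, not a recommendation):

Code: Select all

import torch
import torch.nn as nn

# 768 -> 256 -> 1 network; autograd handles the backward pass.
model = nn.Sequential(
    nn.Linear(768, 256),
    nn.ReLU(),
    nn.Linear(256, 1),
    nn.Sigmoid(),           # output in (0, 1), same scale as WDL labels
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

def train_batch(features, targets):
    # features: (batch, 768) float tensor; targets: (batch, 1) in [0, 1]
    optimizer.zero_grad()
    loss = loss_fn(model(features), targets)
    loss.backward()         # back-propagation via autograd
    optimizer.step()
    return loss.item()

# Dummy usage with random data:
print(train_batch(torch.rand(64, 768), torch.rand(64, 1)))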
Creator of Maxwell