Hello,
I've been trying to learn as much as I can about NNUE before starting to implement it in my engine.
I know that some people train the neural network to "replicate" evaluations from the HCE (or other NNUEs), but I don't understand how an engine can "self-train".
I wouldn't think you can have the randomly initialized network create its own pairs of (position, evaluation) training data, because it doesn't know anything yet, and just applying the game's result to the loss function seems unlikely to work, so how does it work?
~ Thanks in advance
How do NNUEs self train?
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
How do NNUEs self train?
Creator of Maxwell
-
- Posts: 5569
- Joined: Tue Feb 28, 2012 11:56 pm
Re: How do NNUEs self train?
eboatwright wrote: ↑Wed Feb 14, 2024 3:15 am
I've been trying to learn as much as I can about NNUE before starting to implement it into my engine
I know that some people train the neural network to "replicate" evaluations from the HCE (or other NNUEs), but I don't understand how an engine can "self-train"
I wouldn't think you can have the randomly initialized network create its own pairs of (position, evaluation) training data, because it doesn't know anything yet, and just applying the game's result to the loss function seems unlikely, so how does it work?

It will learn from the outcome of the games. I guess it takes a while before a randomly initialized network starts to get a clue, though.
-
- Posts: 27855
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: How do NNUEs self train?
Even random movers have a better chance of accidentally checkmating the opponent when they have more and stronger material. So if you reward their wins by raising the evaluation of the positions from that game for the side that won, on average these will contain a material advantage, and the network will learn to value that. This already makes it play better, so the material advantage changes less rapidly and starts to correlate even better with the game result.
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
Re: How do NNUEs self train?
Ohh, so you do just apply the result of the game, that's interesting! Should any extra data be saved between games, or just the tuned weights?
Creator of Maxwell
-
- Posts: 881
- Joined: Sun Dec 27, 2020 2:40 am
- Location: Bremen, Germany
- Full name: Thomas Jahn
Re: How do NNUEs self train?
Using a random network as the evaluation function means you basically have no idea what a good position is. But the search still knows the rules of chess, so only legal moves can be chosen. So after a sequence of basically random moves you stumble on positions that have no legal moves. They are won for white, won for black, or drawn. That's all you need to bootstrap the process.
0.) Start with a random network as your evaluation function
1.) Play thousands of selfplay matches and record them
2.) Create a list of labeled positions; each move in a match creates a position and the outcome of the game is the label
3.) Use millions of (position, label) data pairs to train a network that predicts the outcome (label) based on a position
4.) The resulting network should now do a little better as your evaluation function
5.) Go back to Step 1
eboatwright wrote: ↑Wed Feb 14, 2024 10:36 pm
Ohh so you do just apply the result of the game, that's interesting, should any extra data be saved between games, or just the tuned weights?

What you save is a bunch of PGNs, or whatever format you chose, to record millions of matches. Then filter that data and create millions of labeled positions in the format that your trainer expects. It's all pretty compute heavy, but nothing a modern PC with ~100 GB of free disk space can't handle.
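Steps 2 and 3 above can be sketched roughly like this. This is only a minimal illustration, not anyone's actual trainer: the `Game` record, storing labels from the side to move's perspective, squashing a centipawn evaluation with a sigmoid, and the scale of 400 are all assumptions made for the example.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Game:
    """Hypothetical record of one finished self-play game."""
    fens: List[str]  # every position reached, as FEN strings
    result: float    # from White's view: 1.0 win, 0.5 draw, 0.0 loss


def label_positions(games):
    """Step 2: every position in a game gets that game's outcome as its label."""
    samples = []
    for game in games:
        for fen in game.fens:
            side_to_move = fen.split()[1]  # second FEN field is "w" or "b"
            # Assumption: labels are stored from the side to move's perspective.
            label = game.result if side_to_move == "w" else 1.0 - game.result
            samples.append((fen, label))
    return samples


def predicted_score(eval_cp, scale=400.0):
    """Map an evaluation in centipawns to an expected score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eval_cp / scale))


def squared_error(eval_cp, label, scale=400.0):
    """Step 3 minimizes something like this, averaged over all samples."""
    return (predicted_score(eval_cp, scale) - label) ** 2
```

Any single label is noisy, because a lost game still contains good positions for the loser. It only works because the errors average out over millions of positions.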
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
Re: How do NNUEs self train?
lithander wrote: ↑Thu Feb 15, 2024 1:57 pm
Using a random network as evaluation function means you basically have no idea what a good position is. But the search still knows the rules of chess so only legal moves can be chosen. So after a sequence of basically random moves you stumble on positions that have no legal moves. They are won for white or won for black or drawn. That's all you need to bootstrap the process.
0.) Start with a random network as your evaluation function
1.) Play thousands of selfplay matches and record them
2.) Create a list of labeled positions; each move in a match creates a position and the outcome of the game is the label
3.) Use millions of (position, label) data pairs to train a network that predicts the outcome (label) based on a position
4.) The resulting network should now do a little better as your evaluation function
5.) Go back to Step 1
eboatwright wrote: ↑Wed Feb 14, 2024 10:36 pm
Ohh so you do just apply the result of the game, that's interesting, should any extra data be saved between games, or just the tuned weights?
What you save is a bunch of PGNs or whatever format you chose to record millions of matches. Then filter that data and create millions of labeled positions in the format that your trainer expects. It's all pretty compute heavy but nothing a modern PC with a ~100GB of free disk space can't handle.

Thank you so much!! That's awesome. I've been working on an implementation for a few days now: I've got a decently fast network that takes 768 (piece, square) inputs -> 256 -> 1, and basic gradient descent implemented.
But with my basic implementation (calculating the delta for every weight individually) I calculated it would take over a month just to train one epoch of 5 million positions! (I got ~19 million (FEN, eval) pairs from a Lichess database for starting out.)
So I'm currently stuck on trying to implement back-propagation.
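For reference, a 768 -> 256 -> 1 forward pass might look something like the sketch below. It is not the code from the engine: the feature layout (color * 384 + piece * 64 + square), the clipped ReLU, the weight ranges, and all function names are assumptions made for illustration. It also shows the incremental accumulator update that makes this kind of network cheap to evaluate during search.

```python
import random

INPUTS, HIDDEN = 768, 256  # 2 colors * 6 piece types * 64 squares -> 256 -> 1


def feature_index(color, piece, square):
    """Assumed layout: color 0/1, piece 0..5 (pawn..king), square 0..63."""
    return color * 384 + piece * 64 + square


def init_network(seed=0):
    """Random starting weights; the ranges are arbitrary for this sketch."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)] for _ in range(INPUTS)],
        "b1": [0.0] * HIDDEN,
        "w2": [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN)],
        "b2": 0.0,
    }


def accumulate(net, active):
    """Hidden-layer pre-activations for a list of active input indices."""
    acc = list(net["b1"])
    for i in active:  # at most 32 of the 768 inputs are 1, so skip the zeros
        for j, w in enumerate(net["w1"][i]):
            acc[j] += w
    return acc


def update(acc, net, removed, added):
    """Adjust the accumulator in place after a move instead of recomputing it."""
    for i in removed:
        for j, w in enumerate(net["w1"][i]):
            acc[j] -= w
    for i in added:
        for j, w in enumerate(net["w1"][i]):
            acc[j] += w


def output(net, acc):
    """Apply the activation and the 256 -> 1 output layer."""
    hidden = [min(max(a, 0.0), 1.0) for a in acc]  # clipped ReLU (assumption)
    return net["b2"] + sum(h * w for h, w in zip(hidden, net["w2"]))


def forward(net, active):
    """Full evaluation from scratch for one position."""
    return output(net, accumulate(net, active))
```

Because the inputs are 0 or 1 and only the pieces on the board are active, the first layer reduces to adding up a few rows of `w1`, and a quiet move only removes one row and adds another.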
Creator of Maxwell
-
- Posts: 4368
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: How do NNUEs self train?
This whole process is very computation-intensive. I have something like 100 cores over several machines to do the training data generation.
For model training I am using a fork of the Stockfish Python trainer, which runs on the GPU. I have an RTX 3080 for that, which is reasonably fast.
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
Re: How do NNUEs self train?
jdart wrote: ↑Thu Feb 15, 2024 4:57 pm
This whole process is very computation-intensive. I have something like 100 cores over several machines to do the training data generation.
For model training I am using a fork of the Stockfish python trainer - that runs on the GPU. I have a RTX 3080 for that. That is reasonably fast

Yeah my computational power is definitely lacking, but I'm doing this to learn, so I'd like to write all the training code myself.
My lack of back-propagation is definitely a huge problem, still scratching my head on how to get that implemented, but we'll see hahaha
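For what it's worth, here is a rough sketch of what back-propagation could look like for a one-hidden-layer network of this shape. It makes assumptions that may not match the real code: plain ReLU in the hidden layer, a sigmoid on the output, squared-error loss against a label in [0, 1], biases on both layers, and made-up function names. With biases, 768 -> 256 -> 1 has 768 * 256 + 256 + 256 + 1 = 197,121 parameters, so perturbing each weight separately costs about that many loss evaluations per position, while the chain rule gives every gradient from one forward and one backward pass.

```python
import math
import random


def init(n_in, n_hidden, seed=0):
    """Random starting weights; the ranges are arbitrary for this sketch."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)],
        "b1": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, active):
    """Return hidden pre-activations z, activations h, and the prediction p."""
    z = list(net["b1"])
    for i in active:  # inputs are 0/1, so only active rows contribute
        for j, w in enumerate(net["w1"][i]):
            z[j] += w
    h = [max(0.0, v) for v in z]  # ReLU (assumption)
    out = net["b2"] + sum(hj * wj for hj, wj in zip(h, net["w2"]))
    return z, h, 1.0 / (1.0 + math.exp(-out))  # sigmoid output


def loss(net, active, label):
    return (forward(net, active)[2] - label) ** 2


def backward(net, active, label):
    """All gradients of the loss for one sample, via the chain rule."""
    z, h, p = forward(net, active)
    d_out = 2.0 * (p - label) * p * (1.0 - p)  # dL/d(out), through the sigmoid
    grad_w2 = [d_out * hj for hj in h]         # out depends on w2[j] via h[j]
    grad_b2 = d_out
    # ReLU passes the gradient through only where z[j] > 0.
    d_z = [d_out * net["w2"][j] if z[j] > 0.0 else 0.0 for j in range(len(z))]
    grad_b1 = list(d_z)
    # Rows of w1 for inactive inputs have zero gradient, so they are omitted.
    grad_w1 = {i: list(d_z) for i in active}
    return grad_w1, grad_b1, grad_w2, grad_b2


def sgd_step(net, active, label, lr=0.1):
    """One plain gradient-descent update on a single (position, label) sample."""
    grad_w1, grad_b1, grad_w2, grad_b2 = backward(net, active, label)
    for i, row in grad_w1.items():
        for j, g in enumerate(row):
            net["w1"][i][j] -= lr * g
    for j in range(len(grad_b1)):
        net["b1"][j] -= lr * grad_b1[j]
        net["w2"][j] -= lr * grad_w2[j]
    net["b2"] -= lr * grad_b2
```

The slow per-weight method is still useful as a check: on a tiny network, the numbers from `backward` should match finite differences of `loss` to several decimal places.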
Creator of Maxwell
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
Re: How do NNUEs self train?
Alright, so I've pushed what I have so far onto my dev branch:
https://github.com/eboatwright/Maxwell/ ... rc/nnue.rs
I'm gonna start by training a network from this data by Lichess: https://database.lichess.org/#evals
Although eventually I do want to fully self-train the final network
Creator of Maxwell
-
- Posts: 41
- Joined: Tue Jan 09, 2024 8:38 pm
- Full name: E Boatwright
Re: How do NNUEs self train?
Although after thinking it through some more, I think I might just end up learning PyTorch to train the network, all this derivative calculating and back-propagation is going waaaayyy over my head
Creator of Maxwell