Is it okay to train a neural net evaluation using data generated from another engine not your own, such as Stockfish or Leela?
Is it okay to tune a piece square table evaluation using data generated from another engine not your own, such as Stockfish or Leela?
Training vs Tuning using data from other engines
Moderators: hgm, Dann Corbit, Harvey Williamson
-
Madeleine Birchfield
- Posts: 512
- Joined: Tue Sep 29, 2020 4:29 pm
- Location: Dublin, Ireland
- Full name: Madeleine Birchfield
-
Daniel Shawul
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Training vs Tuning using data from other engines
Madeleine Birchfield wrote: Wed Sep 30, 2020 8:12 pm
Is it okay to train a neural net evaluation using data generated from another engine not your own, such as Stockfish or Leela?
Is it okay to tune a piece square table evaluation using data generated from another engine not your own, such as Stockfish or Leela?

I don't know why you start a new thread for this when you can address the question directly in the other thread.
This is not decided by a vote or mobilizing the mob.
It is simple really. In supervised learning, by definition, you learn from a teacher (a human, a computer, etc.).
So it doesn't matter whether you train a network from it or tune a piece-square table from it. It is the same thing.
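To make the "same thing" point concrete, here is a minimal sketch of the supervised objective, assuming a squared-error loss (one common choice, not something stated in this thread). The symbols are illustrative: x_i is a position, y_i is the score the teacher engine assigned to it, and f_theta is whatever evaluation is being fitted.

```latex
\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \bigl( f_{\theta}(x_i) - y_i \bigr)^{2}
```

Under this assumption, only the form of f_theta changes between the two cases: a piece-square table makes it a weighted sum of piece-on-square indicators, while a neural net makes it a deeper function. The teacher labels y_i play the same role in both.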
-
Madeleine Birchfield
- Posts: 512
- Joined: Tue Sep 29, 2020 4:29 pm
- Location: Dublin, Ireland
- Full name: Madeleine Birchfield
Re: Training vs Tuning using data from other engines
Daniel Shawul wrote: Wed Sep 30, 2020 8:19 pm
Madeleine Birchfield wrote: Wed Sep 30, 2020 8:12 pm
Is it okay to train a neural net evaluation using data generated from another engine not your own, such as Stockfish or Leela?
Is it okay to tune a piece square table evaluation using data generated from another engine not your own, such as Stockfish or Leela?
I don't know why you start a new thread for this when you can address the question directly in the other thread.
This is not decided by a vote or mobilizing the mob.
It is simple really. In supervised learning, by definition, you learn from a teacher (a human, a computer, etc.).
So it doesn't matter whether you train a network from it or tune a piece-square table from it. It is the same thing.

This is a different question. My original point, and the original discussion, was specifically about the ankan CUDA NN backend compared with the NNUE backend: both are neural network architectures, so the standards applied to one should be applied to the other.
Whether the standards for training neural networks should also be applied to tuning piece-square tables is, I feel, a much wider and different discussion, because the computer chess community as a whole seems to view tuning piece-square tables and training neural nets differently.
-
Daniel Shawul
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Training vs Tuning using data from other engines
Even a hand-crafted evaluation is a neural network, a perceptron (one neuron), so you are basically training a neural network in all cases.
HCE including PSQT = perceptron
NNUE = 3-layer dense net
NN = Multiple CNN layers
For a hand-crafted eval you have to select the important features you want to evaluate yourself, which is where the "art" comes in.
For NNUE, there is some art in selecting the inputs to your network, which could have a significant effect (because it is a shallow net).
A ResNet NN pretty much removes the "art" part through automatic feature selection. Using attack tables as inputs could accelerate
learning compared to piece locations, but eventually you probably get the same strength with naive inputs as well.
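The "PSQT = perceptron" line can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 12 x 64 feature layout, the function names, the learning rate, and the synthetic "teacher" (a hidden weight vector standing in for scores from another engine) are all hypothetical and not taken from any real engine.

```python
import random

NUM_PIECES = 12                  # assumed layout: 6 piece types x 2 colours
NUM_FEATURES = NUM_PIECES * 64   # one weight per (piece, square) pair


def feature_index(piece, square):
    """Map a (piece, square) pair to its slot in the weight vector."""
    return piece * 64 + square


def evaluate(weights, position):
    """Piece-square-table eval: a single linear neuron over one-hot inputs.

    `position` is a list of (piece, square) pairs; every active input is 1,
    so the dot product reduces to summing the selected weights.
    """
    return sum(weights[feature_index(p, s)] for p, s in position)


def sgd_step(weights, position, teacher_score, lr):
    """One gradient step on the squared error against the teacher's score."""
    error = evaluate(weights, position) - teacher_score
    for p, s in position:
        # d(0.5 * error^2)/dw = error * input, and the input is 1 here.
        weights[feature_index(p, s)] -= lr * error
    return error


def demo(num_samples=50, epochs=200, lr=0.05, seed=0):
    """Fit a student table to labels from a synthetic teacher.

    Returns the mean absolute error on the training samples.
    """
    rng = random.Random(seed)
    # Hypothetical teacher: stands in for another engine's evaluations.
    teacher = [rng.uniform(-1.0, 1.0) for _ in range(NUM_FEATURES)]
    samples = []
    for _ in range(num_samples):
        squares = rng.sample(range(64), 8)  # 8 pieces on distinct squares
        position = [(rng.randrange(NUM_PIECES), sq) for sq in squares]
        samples.append((position, evaluate(teacher, position)))

    student = [0.0] * NUM_FEATURES
    for _ in range(epochs):
        for position, score in samples:
            sgd_step(student, position, score, lr)

    total = sum(abs(evaluate(student, pos) - score) for pos, score in samples)
    return total / num_samples


if __name__ == "__main__":
    print("mean abs error vs teacher:", demo())
```

The tuning loop here is ordinary supervised learning: swapping `evaluate` for a multi-layer network would change the model and the gradient computation, but the teacher labels and the loss would be used in the same way.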