This might help you understand NNUE better

Graham Banks
Posts: 44999
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Post by Graham Banks »

[Peter - Texel] I mean on the lowest level it is easy to understand (for someone skilled in the art) what is happening (matrix multiplication, etc), but it is difficult to understand why the end result is so good. The NN could have learned chess concepts that are unknown to humans but which humans could understand if we had a way to extract the concepts from the NN.
Many people are trying to "understand" NNs, though, and I think science will make progress in this area in the future.

A high-level explanation of NNs: You "train" them by giving them a large amount of training examples. For chess that might be 1 billion positions and corresponding evaluation scores based on a fairly short search. The NN then "magically" learns to generalize from the training examples, so that it will learn things like "queen better than rook", "double pawns bad", and a thousand other things, some of which might be very complicated.

[Ciekce - Stormphrax] "fairly short" = 5k nodes or depth 7 or 8 in most engines
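To make the training idea above concrete, here is a minimal, purely illustrative Python sketch: each training example pairs a feature vector (encoding the position) with a target score from a shallow search, and the trainer nudges the weights to reduce the squared error. A linear evaluator stands in for a real multi-layer net; all names and numbers are hypothetical, not taken from any engine.

```python
# Hypothetical sketch of NNUE-style training: (features, target score)
# pairs, fitted by stochastic gradient descent on squared error.
# A linear model stands in for a real multi-layer network.

def sgd_step(weights, features, target, lr=0.01):
    """One gradient step minimising (prediction - target)^2."""
    prediction = sum(w * f for w, f in zip(weights, features))
    error = prediction - target
    return [w - lr * error * f for w, f in zip(weights, features)]

# Toy example: 4 binary features, target score +1.0 (say, a pawn up).
weights = [0.0, 0.0, 0.0, 0.0]
example = ([1.0, 0.0, 1.0, 1.0], 1.0)
for _ in range(1000):
    weights = sgd_step(weights, *example)
# After training, the model's prediction approaches the target.
```

A real trainer would of course loop over hundreds of millions of distinct positions with a non-linear net, but the fitting loop has this shape.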

[Ciekce - Stormphrax] neural network evaluation essentially involves activating certain neurons in the input layer (which just means that they have their actual values, rather than zero)
Then, for each layer, you multiply the previous layer's outputs by that layer's weights, add that layer's biases, and eventually arrive at a final value.

[Peter - Texel] Graham, in general a bigger net is better if you have enough training data for it. It will be slower, but good evaluation can be worth more than a deeper search. As an example consider the rook pawn and wrong bishop endgame. If your evaluation does not understand it you might need to search 50 ply to see the truth.

[Ciekce - Stormphrax] NNUE - it's Efficiently Updatable Neural Network, but the acronym is backwards

[Yinuo - Avalanche] Or as the dragon authors like to say, "neural networks updated efficiently"

[Ciekce - Stormphrax] the UE part is the magic of nnue

[Peter - Texel] Basically it just speeds up the evaluation, so it still produces the exact same evaluation. When I implemented the "UE" part in my engine it got around 45% faster in the initial position, but with less of a speedup later in the game.
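The point that the "UE" part changes only the speed, not the result, can be shown with a small sketch: when a piece moves, the first-layer accumulator is adjusted by subtracting the weights of the feature that disappeared and adding those of the feature that appeared, rather than recomputing everything. All names and sizes below are illustrative, not from any particular engine.

```python
# Sketch of efficient updates ("UE"): incrementally adjusting the
# first-layer accumulator gives exactly the same values as a full
# recomputation, just with far less work per move.

NUM_FEATURES = 8
HIDDEN = 4
# Arbitrary integer-valued weights for the demonstration.
WEIGHTS = [[(i + j) % 3 - 1.0 for j in range(HIDDEN)]
           for i in range(NUM_FEATURES)]

def full_refresh(active_features):
    """Recompute the accumulator from every active input feature."""
    acc = [0.0] * HIDDEN
    for f in active_features:
        for j in range(HIDDEN):
            acc[j] += WEIGHTS[f][j]
    return acc

def update(acc, removed, added):
    """Incremental update: subtract the vanished feature's weights,
    add the new feature's weights."""
    return [a - WEIGHTS[removed][j] + WEIGHTS[added][j]
            for j, a in enumerate(acc)]

before = {0, 2, 5}                       # features active before the move
acc = full_refresh(before)
acc = update(acc, removed=2, added=6)    # "piece moves": feature 2 -> 6
# acc now equals full_refresh({0, 5, 6}) exactly.
```

In a real engine the input layer has many thousands of features but only a handful change per move, which is where the speedup comes from.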

[Peter - Texel] I got like 200 elo from the NN part and 30 elo from the UE part. There might be more I can do to optimize the UE part, but it seems clear to me that the NN is much more important than the UE.

[Peter - Texel] The success of NNUE might in large part have been due to the net architecture rather than the UE.

[Ciekce - Stormphrax] The architecture refers to the layers and their sizes, essentially. The actual weights and biases are the network, rather than the architecture.
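The architecture/network distinction can be made concrete: the architecture is just the layer shapes, while a network is one particular set of weights and biases filling those shapes. The sizes and helper below are hypothetical, chosen only for illustration.

```python
import random

# The architecture: layer sizes only (hypothetical input/hidden/output).
architecture = (8, 4, 1)

def random_network(arch, seed):
    """A network is a concrete set of weights and biases for an
    architecture; different seeds give different networks with the
    same architecture."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(arch, arch[1:])
    ]

net_a = random_network(architecture, seed=1)
net_b = random_network(architecture, seed=2)
# net_a and net_b share an architecture but are different networks.
```

Training produces one such set of weights; changing the layer sizes would be a change of architecture.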
gbanksnz at gmail.com
Ciekce
Posts: 197
Joined: Sun Oct 30, 2022 5:26 pm
Full name: Conor Anstey

Re: This might help you understand NNUE better

Post by Ciekce »

it should be noted that this conversation was a high level explanation in response to a few questions by Graham, and not intended as a full overview - hence simplifications like the omission of activation functions :)

Re: This might help you understand NNUE better

Post by Graham Banks »

Ciekce wrote: Mon Jul 31, 2023 10:04 am it should be noted that this conversation was a high level explanation in response to a few questions by Graham, and not intended as a full overview - hence simplifications like the omission of activation functions :)
It was like, 'NNUE for Dummies'.