Hi guys
I'm trying to embed NNUE from CFish by Ronald de Man into my engine BBC.
Please don't hate me for that.
Given what a noob I am, I can hardly believe I will ever succeed at this.
Andy Grant once said that it's a matter of several hours to embed NNUE into your engine,
well, for me it's probably a matter of several lives...
Anyway, even if I don't succeed in embedding it, at least I want to learn how to apply it.
I'm staring at oldnnue.c https://github.com/syzygy1/Cfish/blob/m ... /oldnnue.c
What I've understood so far (sorry for my shallow level of understanding):
1. It uses the current position to pick the appropriate weights from the weights table
2. It uses processor-specific instructions to optimize performance where possible, and a plain fallback calculation otherwise
3. It uses CFish specific types for pieces/color etc.
The current implementation is TOO COMPLICATED for my understanding.
I would like to simplify it the following way:
1. Load weights
2. Return eval for current position (I have a global array of bitboards to represent the board position)
Do it without fancy processor-specific optimizations and drop literally everything it can work without, to obtain a bare minimum implementation, no matter how slow. Then I just want to test it by setting up a position, calling evaluate() and retrieving the score, like with a handcrafted eval.
Can I achieve this in some other way than being reborn in a new body with a new consciousness and spending years studying math at university (you can't even imagine how bad I am at math)?
Maybe some simplified implementation of NNUE exists?
Or at least some implementation that is engine agnostic?
I mean stand alone NNUE implementation so user can send a position as input and retrieve score as output.
Thanks in advance.
Hacking around CFish NNUE
-
- Posts: 775
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Hacking around CFish NNUE
Didactic chess engines:
https://www.chessprogramming.org/Maksim_Korzh
Chess programming YouTube channel:
https://www.youtube.com/channel/UCB9-pr ... KKqDgXhsMQ
-
- Posts: 775
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Hacking around CFish NNUE
Well ok, no reply is also an answer...
What I've managed to achieve so far:
1. Compiled nnue.c separately
2. Initialized weights from file (well, at least I think so...)
Now, in order to call nnue_evaluate(Position *pos), the only thing I lack is a Position object.
First I tried to initialize it from FEN, but I kept getting a segmentation fault...
But then I realized that there might be no need to place pieces on the board,
because the only fields of the Position object used in nnue.c are dirtyPiece and accumulator
So let me narrow my question from why life is unfair to code monkeys to the following:
1. Does anyone know what dirtyPiece and accumulator are?
2. HOW can I initialize them?
-
- Posts: 28010
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Hacking around CFish NNUE
Why are you bothering with code written by others? Without fancy CPU optimizations NNUE is pretty trivial, right? You just need 2*64*256 piece-square tables, 256 for each location of the white King, and 256 for each location of the black King. The 2*256 PST sums for the current King position are recalculated from scratch when you move a King, or incrementally updated when you move another piece. You then multiply each of these 512 values by a weight, add them and set the result to zero if it was negative, and do that 32 times (each time with a different set of weights). With the 32 results you repeat the multiply - sum - clip 32 times, to get again 32 results. These you just multiply and add (no clipping), to get the evaluation score.
I am sure writing code like
Code: Select all
/* first hidden layer: 32 neurons, each fed by all 512 PST sums */
for(i=0; i<32; i++) {
  int sum = 0;
  for(j=0; j<512; j++) {
    sum += weights1[i][j] * layer1[j];  /* multiply and add */
  }
  layer2[i] = max(0, sum);  /* clip: zero if negative (max = any integer maximum helper) */
}
is not really a challenge for anyone.
See https://www.chessprogramming.org/Stockfish_NNUE .
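The three steps described above (512 to 32, 32 to 32, 32 to 1) could be chained roughly as follows. This is only a minimal sketch of that description, not the Cfish code: the names `weights2`, `weights3` and `nnue_forward` are made up for illustration, and biases, the upper clamp and the integer scaling that a real quantized network applies are all left out.

```c
#include <assert.h>
#include <stdint.h>

enum { L1_SIZE = 512, L2_SIZE = 32, L3_SIZE = 32 };

/* Hypothetical storage; a real engine would fill these from the network file. */
static int16_t weights1[L2_SIZE][L1_SIZE];
static int16_t weights2[L3_SIZE][L2_SIZE];
static int16_t weights3[L3_SIZE];

/* layer1 holds the 2*256 PST sums for the current position. */
int32_t nnue_forward(const int32_t layer1[L1_SIZE])
{
    int32_t layer2[L2_SIZE], layer3[L3_SIZE], score = 0;
    assert(layer1);

    /* 512 -> 32: multiply, sum, set to zero if negative */
    for (int i = 0; i < L2_SIZE; i++) {
        int32_t sum = 0;
        for (int j = 0; j < L1_SIZE; j++)
            sum += weights1[i][j] * layer1[j];
        layer2[i] = sum > 0 ? sum : 0;
    }

    /* 32 -> 32: same operation with a different set of weights */
    for (int i = 0; i < L3_SIZE; i++) {
        int32_t sum = 0;
        for (int j = 0; j < L2_SIZE; j++)
            sum += weights2[i][j] * layer2[j];
        layer3[i] = sum > 0 ? sum : 0;
    }

    /* 32 -> 1: multiply and add, no clipping */
    for (int i = 0; i < L3_SIZE; i++)
        score += weights3[i] * layer3[i];
    return score;
}
```

The array sizes follow directly from the loop bounds: the first index counts the outputs of a layer and the second index counts its inputs.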
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Hacking around CFish NNUE
maksimKorzh wrote: ↑Thu Oct 15, 2020 4:38 pm
Well ok, no reply is also an answer...
What I've managed to achieve so far:
1. Compiled nnue.c separately
2. Initialized weights from file (well at least think so...)
Now in order to call nnue_evaluate(Position *pos) the only thing I have a lack of is position object.
First I was trying to initialize it from FEN, but getting segmentation fault all the time...
But then I've realized that probably there might not be a need of placing pieces on board
because the only fields of position object used in nnue.c are dirtyPiece and accumulator
So let me narrow my question from why life is unfair to code monkeys to the following:
1. Is anyone aware of what are dirtyPiece and accumulator
2. HOW can I initialize them?

Accumulator is something NNUE takes care of for you. You just have to allocate space for it on the stack.
DirtyPiece is only needed for incremental updating.
Maybe you can try first to do without incremental evaluation (i.e. no need to update dirtyPiece or accumulator in make).
It is only 14-18% slower. To disable incremental update comment out this line:
https://github.com/syzygy1/Cfish/blob/m ... ue.c#L1025
Then you don't have to worry about updating accumulator or dirtyPiece in make move.
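To make the difference concrete, here is a rough sketch of what "no incremental update" means for one side's accumulator: it is simply rebuilt from the list of active features on every evaluation. Everything here is an assumption for illustration (the names `ft_weights`, `ft_bias`, `refresh_accumulator`, `update_accumulator`, the toy feature count, and the presence of a bias term); it is not the Cfish code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { HALF = 256, N_FEATURES = 1024 };  /* toy feature count, not the real one */

/* Hypothetical first-layer tables; a real engine loads them from the net file. */
static int16_t ft_weights[N_FEATURES][HALF];
static int16_t ft_bias[HALF];

/* From scratch: start at the bias and add one weight column per active feature. */
void refresh_accumulator(int16_t acc[HALF], const int *active, int n_active)
{
    memcpy(acc, ft_bias, sizeof ft_bias);
    for (int k = 0; k < n_active; k++) {
        assert(active[k] >= 0 && active[k] < N_FEATURES);
        for (int j = 0; j < HALF; j++)
            acc[j] = (int16_t)(acc[j] + ft_weights[active[k]][j]);
    }
}

/* Incremental: a piece moved, so one feature disappears and another appears.
   This is the part that needs dirtyPiece-style bookkeeping in make move. */
void update_accumulator(int16_t acc[HALF], int removed, int added)
{
    for (int j = 0; j < HALF; j++)
        acc[j] = (int16_t)(acc[j] + ft_weights[added][j] - ft_weights[removed][j]);
}
```

Both routes give the same sums; the from-scratch version just does more additions per call, which is why it is the easier one to get working first.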
-
- Posts: 775
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Hacking around CFish NNUE
hgm wrote: ↑Thu Oct 15, 2020 6:04 pm
Why are you bothering with code written by others? Without fancy CPU optimizations NNUE is pretty trivial, right? You just need 2*64*256 piece-square tables, 256 for each location of the white King, and 256 for each location of the black King. The 2*256 PST sums for the current King position are recalculated from scratch when you move a King, or incrementally updated when you move another piece. You then multiply each of these 512 values by a weight, add them and set the result to zero if it was negative, and do that 32 times (each time with a different set of weights). With the 32 results you repeat the multiply - sum - clip 32 times, to get again 32 results. These you just multiply and add (no clipping), to get the evaluation score.
I am sure writing code like
Code: Select all
for(i=0; i<32; i++) {
  int sum = 0;
  for(j=0; j<512; j++) {
    sum += weights1[i][j] * layer1[j];
  }
  layer2[i] = max(0, sum);
}
is not really a challenge for anyone.
See https://www.chessprogramming.org/Stockfish_NNUE .

re: Without fancy CPU optimizations NNUE is pretty trivial, right?
- not to me, unfortunately. I feel your explanation is brilliant, but it's still rocket science to me
Can you please clarify the code:
Code: Select all
for(i=0; i<32; i++) {
  int sum = 0;
  for(j=0; j<512; j++) {
    sum += weights1[i][j] * layer1[j];
  }
  layer2[i] = max(0, sum);
}
1. How can I initialize weights1?
2. weights1 is a 2-dimensional array here; what sizes do I need for the 1st and 2nd dimensions when I define the array? // e.g. weights1[?][?]
3. Same question for layer1 and layer2 (I only understand that NNUE has 4 layers, but that's rocket science to me)
Could you please provide code that does the following (or give a link to an implementation):
1. Init everything needed from the "*.nnue" file with weights
2. Then, I guess, the code you've already provided
3. And then somehow magically obtain a score
I would greatly appreciate a comment on each line, like you did in microMax
P.S. I mean really - I'm too dumb and my mind is collapsing. I swear understanding the move generator of microMax and implementing it on my own was a piece of cake (I followed your website tutorial) compared to this rocket science. HGM, if you can, PLEASE just give me commented code so I can see WHAT to input (how on earth to input a board position) and get, say, a 0.20 score after d4 is played in the initial position. Sorry, but I can't learn from explanations, I'm literally going insane, but I understand when every line of code is commented like in your microMax. Btw this is the reason to dig into someone's code - that's the only way I can learn. All these rocket science CPW explanations are not an option for idiots.
-
- Posts: 775
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Hacking around CFish NNUE
Daniel Shawul wrote: ↑Thu Oct 15, 2020 6:56 pm
Accumulator is something NNUE takes care of for you. You just have to allocate space for it on the stack.
DirtyPiece is only needed for incremental updating.
Maybe you can try first to do without incremental evaluation (i.e. no need to update dirtyPiece or accumulator in make).
It is only 14-18% slower. To disable incremental update comment out this line:
https://github.com/syzygy1/Cfish/blob/m ... ue.c#L1025
Then you don't have to worry about updating accumulator or dirtyPiece in make move.

Thanks for your advice. I would sacrifice anything just to make output a score from a given FEN...
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Hacking around CFish NNUE
hgm wrote: ↑Thu Oct 15, 2020 6:04 pm
Why are you bothering with code written by others? Without fancy CPU optimizations NNUE is pretty trivial, right? You just need 2*64*256 piece-square tables, 256 for each location of the white King, and 256 for each location of the black King. The 2*256 PST sums for the current King position are recalculated from scratch when you move a King, or incrementally updated when you move another piece. You then multiply each of these 512 values by a weight, add them and set the result to zero if it was negative, and do that 32 times (each time with a different set of weights). With the 32 results you repeat the multiply - sum - clip 32 times, to get again 32 results. These you just multiply and add (no clipping), to get the evaluation score.
I am sure writing code like
Code: Select all
for(i=0; i<32; i++) {
  int sum = 0;
  for(j=0; j<512; j++) {
    sum += weights1[i][j] * layer1[j];
  }
  layer2[i] = max(0, sum);
}
is not really a challenge for anyone.
See https://www.chessprogramming.org/Stockfish_NNUE .

I wonder why auto-vectorization is not used instead of the manual SIMD code NNUE currently has. There is separate code for AVX2, SSE3, SSE2, SSE etc., which is kind of ugly. Your code above can be easily auto-vectorized by the compiler, so I wonder why this approach is not taken. I don't see any operation preventing auto-vectorization in a simple dense network. The NNUE code either doesn't have easily vectorizable "default code" or compilers do a really bad job at it, as it seems it is 3x slower without vectorization.
-
- Posts: 775
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Hacking around CFish NNUE
Daniel Shawul wrote: ↑Thu Oct 15, 2020 6:56 pm
Accumulator is something NNUE takes care of for you. You just have to allocate space for it on the stack.
DirtyPiece is only needed for incremental updating.
Maybe you can try first to do without incremental evaluation (i.e. no need to update dirtyPiece or accumulator in make).
It is only 14-18% slower. To disable incremental update comment out this line:
https://github.com/syzygy1/Cfish/blob/m ... ue.c#L1025
Then you don't have to worry about updating accumulator or dirtyPiece in make move.

Hold on a sec...
If I need neither dirtyPiece nor accumulator, then I don't need Position *pos at all? Is that correct?
But then I feel completely lost trying to understand HOW the board position is used as input to get a score from NNUE.
OMG, why is this so complicated (rhetorical question)
Why doesn't somebody smarter than me create a standalone NNUE program that takes a FEN as input and gives a score as output?
Is that possible? Maybe someone has done it already?
That would be the best source of learning for me.
-
- Posts: 775
- Joined: Sat Sep 08, 2018 5:37 pm
- Location: Ukraine
- Full name: Maksim Korzh
Re: Hacking around CFish NNUE
Yeah guys, to avoid torturing you any further with my dumbness, let me ask the question in a slightly different way.
In a perfect world I would like to get the following program:
1. Take FEN string as input
2. Return NNUE score as output
That's it.
Please don't tell me this is SLOW and doesn't make sense.
Just tell me - is that possible?
If so - what steps to take to create that program?
Or maybe someone has done it before?
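The first half of such a program, turning the FEN into something an evaluator can read, needs no engine at all. Below is a small sketch that parses only the piece-placement field of a FEN into a 64-entry array. The function name `parse_fen_placement` and the board convention (piece letters, '.' for empty, a1 = 0, h8 = 63) are choices made for this example, not anything taken from Cfish.

```c
#include <assert.h>
#include <string.h>

/* Fills board[sq] with a piece letter ('P', 'n', ...) or '.' for an empty square.
   Returns 1 if the placement field describes exactly 8 ranks of 8 files, else 0. */
int parse_fen_placement(const char *fen, char board[64])
{
    int rank = 7, file = 0;              /* FEN starts at rank 8, file a */
    assert(fen && board);
    memset(board, '.', 64);
    for (; *fen && *fen != ' '; fen++) {
        char c = *fen;
        if (c == '/') {                  /* next rank down */
            if (file != 8 || rank == 0) return 0;
            rank--;
            file = 0;
        } else if (c >= '1' && c <= '8') {   /* run of empty squares */
            file += c - '0';
            if (file > 8) return 0;
        } else if (strchr("PNBRQKpnbrqk", c)) {
            if (file > 7) return 0;
            board[rank * 8 + file++] = c;
        } else {
            return 0;                    /* unknown character */
        }
    }
    return rank == 0 && file == 8;
}
```

Side to move, castling rights and the other FEN fields are ignored here; a full evaluator would at least need the side to move to decide whose perspective comes first.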
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: Hacking around CFish NNUE
Think of Position* as containing your FEN and the Accumulator and DirtyPiece structures.
NNUE populates these structures using the function
Code: Select all
void half_kp_append_active_indices
Modify that to be based on your FEN rather than on the bitboard code, which assumes the engine uses bitboards exactly like Stockfish does.
Also don't forget to comment out the incremental update as I mentioned, otherwise it will touch parts of the code that are corrupt.
If you are frustrated, you can wait for me to add NNUE to my library that already does EGTB and NN probe
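As a rough illustration of what a board-array replacement for that function might look like: for one king, every other non-king piece on the board contributes one active feature made of (king square, piece kind, piece square). The index formula below is a simplified layout invented for this sketch. It is not the layout of the real network file, which uses its own offsets and mirrors squares for the black perspective, so the Cfish source remains the reference for the actual numbers. The name `collect_features` and the board convention (piece letters, '.' for empty, a1 = 0) are also assumptions.

```c
#include <assert.h>
#include <string.h>

/* Writes the active feature indices for the king given by `king` ('K' or 'k')
   into out[] (room for 32 entries is enough) and returns how many were written. */
int collect_features(const char board[64], char king, int *out)
{
    static const char kinds[] = "PNBRQpnbrq";   /* 10 non-king piece kinds */
    int king_sq = -1, n = 0;

    for (int sq = 0; sq < 64; sq++)
        if (board[sq] == king) king_sq = sq;
    assert(king_sq >= 0);                       /* the position must contain that king */

    for (int sq = 0; sq < 64; sq++) {
        const char *p = board[sq] != '\0' ? strchr(kinds, board[sq]) : NULL;
        if (p == NULL) continue;                /* empty square or a king */
        /* hypothetical layout: 640 slots per king square, 64 per piece kind */
        out[n++] = king_sq * 640 + (int)(p - kinds) * 64 + sq;
    }
    return n;
}
```

Called once with 'K' and once with 'k', this gives the two index lists whose weight columns are summed into the two halves of the accumulator.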