Ferdy wrote: ↑Thu May 20, 2021 9:35 am
Desperado wrote: ↑Wed May 19, 2021 9:56 pm
Hello everybody.
I have already read the thread, but I still need to ask again.
1. What range of scores will the NNUE provide? I want to be sure there are no scores outside my mate bounds.
Depends on how the NNUE is generated. Another way to check is by testing it with mate-in-1 and mated-in-1 positions.
2. I understood that scaling is not necessary, except when reference values in your own engine have a different base.
Now, what base does an NNUE score of 100 correspond to? 100 cp, so to speak?
Depends on how the nnue is created. You can also optimize its scale.
Of course I tried some test matches and there is an Elo boost. But something feels wrong.
Some (totally won) positions like "8/4k3/5pp1/P3n3/8/8/R6K/4q3 w - - 0 83" are lost on time because there is no progress.
The engine moves from one winning position to another. (It reminds me of delayed mate scores when the distance to mate is not handled correctly in the hash tables.)
This behaviour does not exist when I use my standard evaluation.
Try to scale it with fifty-move counter.
Hello Ferdy,
Thanks for your answer; unfortunately I was not able to reply earlier. I wanted a simple start to get familiar with NNUE.
1. So, I checked the repository of Daniel Shawul and compiled the egbb sources including the NNUE framework.
a. The results were the dll/lib files.
b. I wrote a quick and dirty solution (posted before) to get another quick result.
2. Searching the net, I found the Stockfish site
https://tests.stockfishchess.org/nns
a. I have downloaded some (default) nets.
b. I don't know many details about how the nets were created.
That is my setup so far, and in general it seems to work fine. Replacing my evaluation improves my engine by roughly 200-250 Elo.
Before writing the post I did some more research on the net to get more information. Because I took a Stockfish net, I thought I would take a look
at some Stockfish sources, but I am not familiar with them. Anyway, I found what I was interested in.
The material scores used in Stockfish 13 are:
Code:
PawnValueMg = 126, PawnValueEg = 208,
KnightValueMg = 781, KnightValueEg = 854,
BishopValueMg = 825, BishopValueEg = 915,
RookValueMg = 1276, RookValueEg = 1380,
QueenValueMg = 2538, QueenValueEg = 2682,
So my guess is that the net was trained on Stockfish games and there might be some kind of correlation with these values.
Another (more important) point for me is that the piece/pawn relation is very different from my engine (e.g. 781 to 126 for mg, and not something like 325 to 100). Whatever my solution will be, I will need some scaling to get a better fit with my search value parameters.
I will need to think about what else can be affected by the ideas mentioned.
Stockfish uses scaling too, but I think it has to be handled differently because of the different relations between the material values.
A simple scaling by a factor of x = 2.5 would fit the pieces but not the pawns (in my engine).
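To make the mismatch concrete, here is a minimal sketch of the arithmetic. `toEngineScale` is a hypothetical helper (it is not part of Stockfish or the egbb library), and the factor 2.5 is written as the integer ratio 2/5 so everything stays in integer arithmetic.

```cpp
// Stockfish 13 middlegame values, as listed above.
constexpr int PawnValueMg   = 126;
constexpr int KnightValueMg = 781;

// Hypothetical helper: divide a Stockfish-scale value by 2.5 (multiply by 2/5).
constexpr int toEngineScale(int sfValue) {
    return sfValue * 2 / 5;
}

// The knight lands near a classical 325, but the pawn ends up at about half of 100.
static_assert(toEngineScale(KnightValueMg) == 312, "knight maps close to 325");
static_assert(toEngineScale(PawnValueMg) == 50, "pawn maps to 50, not 100");
```

So a single constant factor cannot bring both the pieces and the pawns into line with a 100/325 scheme.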
Code:
// Scale and shift NNUE for compatibility with search and classical evaluation
auto adjusted_NNUE = [&](){
    int mat = pos.non_pawn_material() + 2 * PawnValueMg * pos.count<PAWN>();
    return NNUE::evaluate(pos) * (641 + mat / 32 - 4 * pos.rule50_count()) / 1024 + Tempo;
};
To be honest, I do not understand the scaling with the 50-move counter, at least not at first glance.
I think I get the idea of stopping the engine from shuffling pieces around the board forever, but how does it help to scale down every output of the net?
Don't you just produce smaller scores, while the decisions in the tree stay the same? I mean, if I compare two scores like 40 and 60
in a node and later compare 20 and 30 because I scaled them down, what is the difference?
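One way to look at this question, if I read the formula correctly, is that the factor is not the same for every position being compared: `rule50_count()` differs between leaves, so lines that capture or push a pawn (counter reset to 0) keep more of their score than lines that only shuffle. The sketch below uses the Stockfish formula with `Tempo` left out, since it is a constant offset. `scaledEval` and the raw scores 600 and 550 are made up for illustration; `mat = 5351` is what the formula gives for the position quoted earlier (rook, knight, queen and three pawns).

```cpp
// Scaling term from the Stockfish 13 snippet, without the constant Tempo.
// raw    : unscaled network output (illustrative numbers only)
// mat    : non_pawn_material() + 2 * PawnValueMg * pawn count
// rule50 : halfmove clock since the last capture or pawn move
constexpr int scaledEval(int raw, int mat, int rule50) {
    return raw * (641 + mat / 32 - 4 * rule50) / 1024;
}

// Rook + knight + queen (1276 + 781 + 2538) plus three pawns at 2 * 126 each.
constexpr int mat = 4595 + 2 * 126 * 3;  // 5351

// Same counter on both leaves: the order of 40 and 60 is preserved.
static_assert(scaledEval(40, mat, 10) < scaledEval(60, mat, 10), "order kept");

// Different counters: the shuffling line (raw 600, counter 40) now scores
// below the line that made progress (raw 550, counter 0).
static_assert(scaledEval(600, mat, 40) == 379, "shuffling line");
static_assert(scaledEval(550, mat, 0) == 433, "line with progress");
```

With equal counters nothing changes in the comparison, as in the 40/60 versus 20/30 example; the ordering only flips when the counters differ.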
Regards
P.S.:
I have a hard time understanding the behavior of my engine.
In some way the values influence the search behavior, although I have noticed that this almost never happens in single-threaded mode but far too often when the engine uses multiple threads.
My first approaches to investigating this further will be to bring the scaling of the search parameters into line with the evaluation, or to think through the interaction between hash table, search and evaluation (since multiple threads have significantly more influence on the hash table, and the behavior is much more pronounced there).