NNUE scoring (egbb lib)

Desperado · Post by **Desperado** » Wed May 19, 2021 9:56 pm

Hello everybody.

I have already read the thread, but i still need to ask again.

1. What range of scores will the NNUE provide? I want to be sure there are no scores outside my mate bounds.

2. I understood that scaling is not necessary, except when reference values in your own engine have a different base.
Now, what base does an NNUE score of 100 correspond to? 100cp, so to say?

My first implementation of the NNUE evaluation looks like this:

Code: Select all

#if USE_NNUE
int Eval::nnue(thread_t *pt, pos_t *pos)
{
    // piece definitions
    enum {
        wking=1, wqueen, wrook, wbishop, wknight, wpawn,
        bking, bqueen, brook, bbishop, bknight, bpawn
    };

    // piece mapping
    static const int map_piece[ID16] ={
        0,0,wpawn,bpawn,wknight,bknight,wbishop,bbishop,
        wrook,brook,wqueen,bqueen,wking,bking
    };

    // position info
    int piece[33], square[33], id = 2, src;

    // kings
    piece[0] = wking; square[0] = Bit::lsb_id(pos->bb[WK]);
    piece[1] = bking; square[1] = Bit::lsb_id(pos->bb[BK]);

    // scan board
    uint64_t tmpsrc =  Pos::occupied(pos) ^ (pos->bb[WK] | pos->bb[BK]);
    while(tmpsrc)
    {
        src = Bit::pop_lsb(&tmpsrc);

        piece[id]  = map_piece[pos->sq[src]];
        square[id] = src;
        id++;
    }

    piece[id] = 0; // end of list

    return nnue_evaluate(pos->stm, piece, square);
}
#endif

Will this be the correct replacement for my standard eval function in the pov context?

Code: Select all

int Eval::full(thread_t* pt,pos_t* pos)
{
    // ... //
    
    // tapered evaluation
    int phase = min(Bit::popcnt(Pos::minors(pos)) * 1
                    + Bit::popcnt(Pos::rooks(pos)) * 2
                    + Bit::popcnt(Pos::queens(pos)) * 4, 24);

    int score = (mg * phase + eg * (24 - phase)) / 24;

    // pov evaluation
    return pos->stm == WHITE ? score + tempo : -score + tempo;
}

Of course i tried some test matches and there is an Elo boost. But something feels wrong.
Some (totally won) positions like "8/4k3/5pp1/P3n3/8/8/R6K/4q3 w - - 0 83" are lost on time because there is no progress.
The engine moves from one winning position to another. (It reminds me of delayed mate scores when the distance to mate is not handled correctly in the hash tables.)

This behaviour does not exist when i use my standard evaluation.

Many thanks in advance.

Ferdy · Post by **Ferdy** » Thu May 20, 2021 9:35 am

Desperado wrote: ↑Wed May 19, 2021 9:56 pm Hello everybody.

I did already read the thread but i still need to ask again.

1. What range of scores will the NNUE provide? I want to be sure there are no scores outside my mate bounds.

It depends on how the net was generated. One way to check is to test it on mate-in-1 and mated-in-1 positions.
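
Independent of the net, the search can be protected by clamping the network output before it is used. A minimal sketch, assuming hypothetical engine constants `MATE_SCORE` and `MAX_PLY` (neither comes from the egbb library; substitute your engine's own values) and a C++17 compiler for `std::clamp`:

```cpp
#include <algorithm>

// Hypothetical engine constants; replace with your own values.
constexpr int MATE_SCORE = 32000;  // score for "mate now"
constexpr int MAX_PLY    = 128;    // deepest ply the search can reach

// Any score at or beyond this bound is read by the search as "mate in N".
constexpr int MATE_BOUND = MATE_SCORE - MAX_PLY;

// Keep a static evaluation strictly inside the non-mate range.
inline int clamp_eval(int raw)
{
    return std::clamp(raw, -(MATE_BOUND - 1), MATE_BOUND - 1);
}
```

With these example constants, any raw value beyond ±31871 is cut back to that limit, so a static score can never be confused with a mate score in the hash table.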

2. I understood that scaling is not necessary, except when reference values in your own engine have a different base.
Now, what base does an NNUE score of 100 correspond to? 100cp, so to say?

It depends on how the net was created. You can also optimize its scale.

Of course i tried some test matches and there is an Elo boost. But something feels wrong.
Some (totally won) positions like "8/4k3/5pp1/P3n3/8/8/R6K/4q3 w - - 0 83" are lost on time because there is no progress.
The engine moves from one winning position to another. (It reminds me of delayed mate scores when the distance to mate is not handled correctly in the hash tables.)

This behaviour does not exist when i use my standard evaluation.

Try scaling it with the fifty-move counter.

Desperado · Post by **Desperado** » Fri May 21, 2021 10:27 pm

Hello Ferdy,

thanks for your answer, unfortunately i was not able to reply earlier. I thought of a simple start to get in touch with NNUE.

1. So, i checked the repository of Daniel Shawul and compiled the egbb sources including the NNUE framework.
a. The results were the dll/lib files.
b. I wrote a quick and dirty solution (posted before) to get another quick result.
2. Searching the net, i found the Stockfish site https://tests.stockfishchess.org/nns
a. I have downloaded some (default) nets.
b. I don't know many details about how the nets were created.

That is my setup so far and in general it seems to work fine. Replacing my evaluation improves my engine by roughly 200-250 Elo.

Before writing the post i did some more research on the net to get more information. Because i took a Stockfish net, i thought i would take a look
at the Stockfish sources, but i am not familiar with them. Anyway, i found what i was interested in.

The material scores used in Stockfish 13 are:

Code: Select all

  PawnValueMg   = 126,   PawnValueEg   = 208,
  KnightValueMg = 781,   KnightValueEg = 854,
  BishopValueMg = 825,   BishopValueEg = 915,
  RookValueMg   = 1276,  RookValueEg   = 1380,
  QueenValueMg  = 2538,  QueenValueEg  = 2682,

So, my guess is that the net was trained on Stockfish games and there might be some kind of correlation to these values.
Another (more) important point for me is that the relative piece/pawn value is very different from my engine (e.g. 781 vs. 126 for mg and not something like 325 vs. 100). Whatever my solution will be, i will need some scaling to get a better fit with my search parameters.
I will need to think about what else can be affected by the mentioned ideas.

Stockfish uses scaling too, but i think it has to be handled differently because of the different ratios of the material values.
A simple scaling by a factor of x = 2.5 would fit for the pieces but not for the pawns (in my engine).

Code: Select all

        // Scale and shift NNUE for compatibility with search and classical evaluation
      auto  adjusted_NNUE = [&](){
         int mat = pos.non_pawn_material() + 2 * PawnValueMg * pos.count<PAWN>();
         return NNUE::evaluate(pos) * (641 + mat / 32 - 4 * pos.rule50_count()) / 1024 + Tempo;
      };
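
To get a feel for the size of this factor, the multiplier can be computed in isolation. The following is only a sketch: `nnue_scale_1024` and `scale_nnue` are hypothetical helper names, Tempo is left out, and it assumes that `non_pawn_material()` sums the middlegame values quoted above.

```cpp
// Multiplier (in units of 1/1024) taken from the adjusted_NNUE lambda above.
// mat    : non-pawn material + 2 * PawnValueMg * number of pawns
// rule50 : half-move clock of the position
inline int nnue_scale_1024(int mat, int rule50)
{
    return 641 + mat / 32 - 4 * rule50;
}

// Apply the multiplier to a raw network score (Tempo omitted).
inline int scale_nnue(int raw, int mat, int rule50)
{
    return raw * nnue_scale_1024(mat, rule50) / 1024;
}

// Material of the starting position with the Stockfish 13 Mg values:
// non-pawn: 2 * (2*781 + 2*825 + 2*1276 + 2538) = 16604
// pawns   : 2 * 126 * 16                        =  4032
constexpr int START_MAT = 16604 + 4032;  // 20636
```

Under these assumptions the factor is 1285/1024 (about 1.25) in the starting position and 641/1024 (about 0.63) with no material left, and every half-move without a capture or pawn move lowers it by 4/1024. So it is not one constant factor applied to all positions.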

To be honest, i do not understand the scaling with the 50-move counter, at least not at first.
I think i got the idea, to stop moving pieces around the board forever, but how does it help to scale down every output of the net?
Don't you just produce smaller scores while the decisions in the tree stay the same? I mean, if i have two scores like 40 and 60 that are compared
in a node and later i compare 20 and 30 because i scaled them down, what is the difference?

Regards

P.S.:

I have a hard time understanding the behavior of my engine.
Somehow the values influence the search behavior, although I have noticed that the problem almost never shows up in single-threaded mode but shows up too often when the engine uses multiple threads.
My first approach to investigating this will be either to bring the scaling of the search parameters in line with the evaluation, or to think through the interplay of hash table, search and evaluation (multiple threads have significantly more influence on the hash tables, and the behavior is much more pronounced with them).

Desperado · Post by **Desperado** » Sat May 22, 2021 12:24 pm

Ok, i was able to identify the problem and i found a solution.

The point is that the NNUE evaluation affects the NPS (a drop of about 50-60%).
In my engine i poll the clock every 64K nodes when using a time-limited search. At half the speed, 64K nodes take about twice as long, so the clock is checked only half as often.
That makes my engine sensitive to speed, especially in fast games (< 10s per game with increment). In my implementation the effect
is amplified with multiple threads because only the main thread handles the time management.

Because i wanted to look into the polling topic some day anyway, i had a good reason to do it now.

My findings were:

1. Reading the clock does not take much time by itself.
2. My polling function is slow because it does other things, like checking the console input and so on...

My result is:

1. If i am in a time-limited search, my main thread always updates the current time and checks the time conditions.

An implementation detail (that i probably do not need) is that i only check the time conditions when the last check was at least X ms ago,
so i keep a timestamp of the last check.

2. I still use the polling (including checking the time as before) to check console input, for example (and other stuff).

As a side effect i can now manage time with an accuracy of 1ms, which of course makes me very happy. (Tested 2ms and 3ms so far.)
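
The two-level scheme described above could be sketched roughly as follows. All names (`TimeControl`, `time_is_up`, `should_poll_input`) are hypothetical, and it is an assumption that the cheap check runs once per node in the main thread; the current time is passed in as a parameter so the logic can be tested without a real clock.

```cpp
#include <chrono>
#include <cstdint>

// Milliseconds from a monotonic clock.
inline std::int64_t now_ms()
{
    using namespace std::chrono;
    return duration_cast<milliseconds>(
        steady_clock::now().time_since_epoch()).count();
}

struct TimeControl {
    std::int64_t start_ms;       // when the search started
    std::int64_t budget_ms;      // time allotted to this move
    std::int64_t min_gap_ms;     // minimum gap between two checks (the "X ms")
    std::int64_t last_check_ms;  // timestamp of the previous check

    // Cheap check for the main thread; returns true when the budget is used up.
    bool time_is_up(std::int64_t now)
    {
        if (now - last_check_ms < min_gap_ms)
            return false;                 // checked too recently
        last_check_ms = now;
        return now - start_ms >= budget_ms;
    }
};

// Slow work (console input etc.) stays on the coarse node-count poll.
inline bool should_poll_input(std::uint64_t nodes)
{
    return (nodes & 0xFFFF) == 0;         // every 64K nodes
}
```

In the search loop the main thread would call `tc.time_is_up(now_ms())` and only fall back to the expensive polling function when `should_poll_input(nodes)` is true, which decouples the time accuracy from the node rate.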

Nevertheless, the evaluation topics, especially scaling, remain interesting.

Ferdy · Post by **Ferdy** » Sat May 22, 2021 9:04 pm

Desperado wrote: ↑Fri May 21, 2021 10:27 pm To be honest, i do not understand the scaling with the 50-move counter, at least not at first.
I think i got the idea, to stop moving pieces around the board forever, but how does it help to scale down every output of the net?
Don't you just produce smaller scores while the decisions in the tree stay the same? I mean, if i have two scores like 40 and 60 that are compared
in a node and later i compare 20 and 30 because i scaled them down, what is the difference?

The eval can show a winning score without knowing that the 50-move draw rule exists. Scaling by the counter is a natural thing to do even without NNUE.
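
The scaling is not neutral because the counter differs between positions in the same tree: a pawn move or a capture resets it, while a shuffling move increases it. Two leaves with the same raw score are therefore scaled by different amounts. A minimal sketch with a simple linear damping (the function name is hypothetical, and the exact formula is only one possible choice):

```cpp
// Damp an evaluation linearly with the half-move clock (0..100).
inline int damp_by_rule50(int eval, int rule50)
{
    return eval * (100 - rule50) / 100;
}
```

A raw score of 60 stays 60 in a leaf reached by a pawn push (counter 0) but becomes 36 in a leaf reached by shuffling with the counter at 40, so the search now prefers the line that makes progress.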

I have a hard time understanding the behavior of my engine.
Somehow the values influence the search behavior, although I have noticed that the problem almost never shows up in single-threaded mode but shows up too often when the engine uses multiple threads.
My first approach to investigating this will be either to bring the scaling of the search parameters in line with the evaluation, or to think through the interplay of hash table, search and evaluation (multiple threads have significantly more influence on the hash tables, and the behavior is much more pronounced with them).

Right, all search parameters that depend on the eval have to be optimized again.

I previously tried optimizing 3 search parameters and 1 NNUE eval parameter using the Optuna-based tuner listed below, and the result was promising. It is expensive: each trial needs around 200 or more games at a high TC to get the objective value (more games lower the uncertainty), and you need 100 or more trials.

Test result at TC 15s+50ms, after 30 trials at 200 games/trial only.

Code: Select all

Score of opt vs default: 84 - 68 - 248  [0.520] 400
...      opt playing White: 53 - 27 - 120  [0.565] 200
...      opt playing Black: 31 - 41 - 128  [0.475] 200
...      White vs Black: 94 - 58 - 248  [0.545] 400
Elo difference: 13.9 +/- 21.0, LOS: 90.3 %, DrawRatio: 62.0 %

You can try these frameworks to optimize your search and eval parameters.
https://github.com/kiudee/chess-tuning-tools
https://github.com/fsmosca/Optuna-Game-Parameter-Tuner
https://github.com/fsmosca/Lakas

I created the last 2.

pedrox · Post by **pedrox** » Sun May 23, 2021 11:10 am

As said, if you are going to use NNUE in a simple engine (TSCP type) that only uses the evaluation at the leaves of the tree, then it is not necessary to do any tuning or scaling of the evaluation. If the engine uses the evaluation at each node of the search for pruning (razoring, null move, static null move, futility ...), then you have 2 options: either you adjust each margin for each of the pruning types, or you adjust the evaluation to fit your pruning. I am using the second option to do less tuning.
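
The two options can be illustrated with a static null move (reverse futility) test. This is only a sketch: the function name and the margin value are hypothetical and not taken from any particular engine.

```cpp
// Static null move test: prune when the static eval exceeds beta
// by a depth-dependent margin.
inline bool static_null_prune(int static_eval, int beta, int depth,
                              int margin_per_ply)
{
    return static_eval - margin_per_ply * depth >= beta;
}
```

If the net returns values on roughly twice the scale the margin was tuned for, option 1 means retuning `margin_per_ply` (and every other margin) for the new scale, while option 2 means rescaling the evaluation once before any pruning code sees it and leaving the margins alone.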

But when it comes to scaling, I keep it simple. If you check the static evaluation of a Stockfish NNUE network, you can see that it values, for example, a knight or bishop at 700-800 and a rook at maybe 1200. Most pieces are worth about twice what we are used to. And not only are the material values doubled, the positional values are doubled as well.

So you can do something as simple as:

eval = evalnnue / 2;

If you want you can tweak it a bit more and do something like:

eval = evalnnue * 100 / scale;

where scale could be a value between, for example, 150 and 250, and is easily tuned.

In my case I used 208 to match the stockfish pawn value (100 for me), although it is not necessary and with this my engine gained maybe 70 Elo points.
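
The one-parameter scaling can be written as a small helper. A sketch only; the name `rescale_nnue` is hypothetical and the default of 208 is just the value mentioned above:

```cpp
// Rescale a raw network score to the engine's own centipawn base.
// `scale` is the raw value that should map to 100 cp; 208 is the value
// reported above, with 150..250 as a plausible tuning range.
inline int rescale_nnue(int raw, int scale = 208)
{
    return raw * 100 / scale;
}
```

With 208, a raw 208 becomes 100 and a raw 781 becomes 375. Since `scale` is a single integer, it is a convenient parameter to hand to a tuner.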

It is not necessary to include a tempo term in the eval, as the network already knows about it. What is necessary is to use the 50-move rule to force progress (just as you would do without NNUE).

eval = (eval* (100 - rule50) / 100);

For my engine I created an NNUE network from scratch: first I trained it on game results from PGN games, not on evaluations from Stockfish or another engine, and then in a second step I retrained it on its own evaluations from depth-5 searches. I thought that in this case I might not need to scale the evaluation to fit the pruning; however, I still use 208 and gained about 35 Elo points with it. So these nets tend to produce higher values regardless of how you create them. Possibly each net has its own appropriate value for scale.

Desperado · Post by **Desperado** » Sun May 23, 2021 11:42 am

Hello Pedro,

thanks for your feedback. Scaling down with something simple like "eval = evalnnue / 2" seems to be a very natural thing.
At my current implementation stage i take the score as it is, so i know it works in my framework and can be optimized at a later point.
250 Elo (single core) just by plugging in a neural net isn't bad at all.

The point with the simple scaling is that the piece/pawn ratio is very different from most engines, especially in the middlegame (shown earlier in this thread). Anyway, training my own net at some point will solve that without extra work and is on my todo list. Most important is that the framework works now and that i was able to identify the problem caused by the performance loss.

From a technical perspective, everything now works as it should.

Desperado · Post by **Desperado** » Mon May 24, 2021 1:15 pm

Hello again.

Because i am working on egtbs (syzygy implementation), i thought
i could easily use the Scorpio bitbases. So i downloaded them, and with the tablebases there were two dlls.

egbbdll.dll

egbbdll64.dll

My own compile shows version 4.3 (with NNUE support) and the egbbdll64.dll shows version 4.1 (when loading).
My own compile crashes when probing the 5-men bitbases, and the mentioned 64-bit dll crashes too.

Does somebody know where i can get the "original" compile of version 4.3? I was not able to find it in his repo.
I think it would be a nice thing to have egbb and NNUE functionality available in one dll. That might be a good reason to use bitbases
instead of syzygy. At least i would like to do some comparisons, but it does not work so far.

Thx in advance.

Ferdy · Post by **Ferdy** » Mon May 24, 2021 6:07 pm

pedrox wrote: ↑Sun May 23, 2021 11:10 am As said, if you are going to use NNUE in a simple engine (TSCP type) that only uses the evaluation at the leaves of the tree, then it is not necessary to do any tuning or scaling of the evaluation. If the engine uses the evaluation at each node of the search for pruning (razoring, null move, static null move, futility ...), then you have 2 options: either you adjust each margin for each of the pruning types, or you adjust the evaluation to fit your pruning. I am using the second option to do less tuning.

Add a third, optimize both the search and eval params simultaneously.

Desperado · Post by **Desperado** » Mon May 24, 2021 11:26 pm

Ok, my own compile works now. It simply was my bad.

Do you see the difference between

Code: Select all

load_egbb((char *)"C:\\_chess\\egbb\\egbb/", 16, 1);

and

Code: Select all

load_egbb((char *)"C:\\_chess\\egbb\\egbb/", 16 * MB1, 1);

But i also did not find a released egbbdll64.dll in version 4.3,
neither in the repository https://github.com/dshawul/Scorpio/releases nor at the website https://sites.google.com/site/dshawul/home.
