Orion 0.7 : NNUE experiment

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

alex67a
Posts: 50
Joined: Mon Sep 10, 2018 10:15 am
Location: Denmark
Full name: Alexander Spence

Re: Orion 0.7 : NNUE experiment

Post by alex67a »

Doesn't work on Win 7 64-bit with the Arena GUI :cry:
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

alex67a wrote: Mon Sep 14, 2020 10:25 am Doesn't work on Win 7 64-bit with the Arena GUI :cry:
I'm so sorry :(

1) Have you downloaded the network file and put it in the same directory as the binaries?

2) Several binaries are provided to cope with different CPU architectures and instruction sets (en passant, note that only 64-bit CPUs are supported by Orion). Please try 'orion64-v0.7.nnue.exe' first and let me know if it works.
alex67a
Posts: 50
Joined: Mon Sep 10, 2018 10:15 am
Location: Denmark
Full name: Alexander Spence

Re: Orion 0.7 : NNUE experiment

Post by alex67a »

It works now.
The problem was that I had renamed the net; with a different name it doesn't work.

thx
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

alex67a wrote: Mon Sep 14, 2020 11:19 am It works now.
The problem was that I had renamed the net; with a different name it doesn't work.
Excellent! Yes, I'm sorry, but the name of the net is "hardcoded" (the NNUE version remains an experiment).
Anyway, thank you for your interest :)
alex67a
Posts: 50
Joined: Mon Sep 10, 2018 10:15 am
Location: Denmark
Full name: Alexander Spence

Re: Orion 0.7 : NNUE experiment

Post by alex67a »

Match: Orion - Orion NNUE (winner)
GUI: Arena
Time: 30 sec per move

Result: +9 =1 -0

Very good work! :wink:
Gabor Szots
Posts: 1362
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: Orion 0.7 : NNUE experiment

Post by Gabor Szots »

I could not resist the temptation. A test is running, and it is going to reach at least 3100.
Gabor Szots
CCRL testing group
Sylwy
Posts: 4468
Joined: Fri Apr 21, 2006 4:19 pm
Location: IASI - the historical capital of MOLDOVA
Full name: SilvianR

Re: Orion 0.7 : NNUE experiment

Post by Sylwy »

Gabor Szots wrote: Sun Sep 20, 2020 12:03 pm I could not resist the temptation. A test is running, and it is going to reach at least 3100.
Right, Gabor! Even a little more in my test (4'+2"):

[Image: tournament crosstable]

Cheng 4.39 64-bit has 2943 Elo (CCRL Blitz rating list).
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

David Carteau wrote: Mon Sep 14, 2020 8:58 am (...)
I'm now trying another approach, which is terribly slow but seems (for now) to work: I built a first net able to approximate the expected evaluation scores of 20 positions. When the approximation was good enough (average error per score lower than or equal to 0.20, i.e. 20 cp), I increased the number of positions to evaluate. And so on... It seems to work, but I need to find another solution, because after two days I have only reached... 80 positions (with PBIL) and 112 positions (with SGD)!
(...)
Yesterday I reached 2304 positions (with SGD), which is terribly sloooooooooooooooooooooow...!
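
(To give an idea, here is a toy, self-contained Python sketch of that progressive scheme; a simple linear model stands in for the real net, and every name and number is purely illustrative, not my actual code.)

Code: Select all

import random

# Toy sketch of the progressive scheme: train on the first n positions
# until the average absolute error is at most 0.20 pawns (20 cp),
# then add one more position, and so on.

def sgd_step(w, data, lr=0.05):
    for x, target in data:
        err = w[0] * x + w[1] - target
        w[0] -= lr * err * x     # gradient of 0.5*err^2 w.r.t. the weight
        w[1] -= lr * err         # ... and w.r.t. the bias

def avg_error(w, data):
    return sum(abs(w[0] * x + w[1] - t) for x, t in data) / len(data)

random.seed(0)
positions = [(x, 0.7 * x + 0.1) for x in
             (random.uniform(-3, 3) for _ in range(200))]   # toy "evaluations"
w, n = [0.0, 0.0], 20            # start with 20 positions, as described above
while n <= len(positions):
    while avg_error(w, positions[:n]) > 0.20:
        sgd_step(w, positions[:n])
    n += 1                       # approximation good enough: add one position
print("reached", n - 1, "positions, final weights:", w)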

I tried to read the literature and watch videos on SGD and back-propagation. It seems that what I implemented is correct (partial derivatives, chain rule, etc.). I played with learning rates and also found a way to speed up learning by separately adjusting the weights of the first layer. But, globally, it remains slow. I think the problem comes from the architecture of the NNUE nets, and in particular the "fork" on the left side, which I do not seem to handle properly.
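
To show what I mean by that last trick, here is a tiny, generic NumPy sketch of hand-written backprop (chain rule on a two-layer net) where the first layer gets its own, larger learning rate; it is an illustration only, not my trainer:

Code: Select all

import numpy as np

# Hand-written SGD/backprop for a tiny 2-layer net (generic sketch):
# the first layer gets its own, larger learning rate.

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.1, (16, 8)), np.zeros(16)   # first (input) layer
W2, b2 = rng.normal(0, 0.1, (1, 16)), np.zeros(1)    # output layer
lr1, lr2 = 0.05, 0.01                                # separate learning rates

def relu(x):
    return np.maximum(x, 0.0)

for step in range(1000):
    x = rng.normal(0, 1, 8)
    target = np.array([x.sum() * 0.1])               # toy target

    # forward pass
    h_pre = W1 @ x + b1
    h = relu(h_pre)
    y = W2 @ h + b2

    # backward pass (chain rule on the squared error 0.5 * (y - target)^2)
    dy = y - target                                  # dL/dy
    dW2 = np.outer(dy, h)
    dh = W2.T @ dy
    dh_pre = dh * (h_pre > 0)                        # back through the ReLU
    dW1 = np.outer(dh_pre, x)

    # SGD update, first layer with its own (larger) learning rate
    W1 -= lr1 * dW1; b1 -= lr1 * dh_pre
    W2 -= lr2 * dW2; b2 -= lr2 * dy

    if step % 250 == 0:
        print(step, float(0.5 * dy @ dy))            # loss should decrease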

Since yesterday, I'm trying in parallel another approach which looks promising, but... let's see !
David Carteau
Posts: 121
Joined: Sat May 24, 2014 9:09 am
Location: France
Full name: David Carteau

Re: Orion 0.7 : NNUE experiment

Post by David Carteau »

Yeah! Yeah! and... Yeah!!

As reported in another post, I had been struggling for the last few weeks with my neural network trainer. I did not understand why things went so wrong.

I first tried a "manual" implementation of weight tuning, with genetic algorithms and then stochastic gradient descent (much more complex than the PBIL approach...). Then, I decided to learn and experiment with the "normal" way to train neural networks: using standard frameworks.

As I'm now comfortable with Python, I first took a look at Keras. As I did not find how to concatenate layers (which is required for the NNUE architecture), I switched to PyTorch.
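
For those wondering what that concatenation looks like, here is a minimal, illustrative PyTorch sketch of an NNUE-style forward pass (dimensions roughly HalfKP/Stockfish-12-like, but all sizes and names are just for the example): the two perspective halves share the feature transformer and are merged with torch.cat.

Code: Select all

import torch
import torch.nn as nn

# Minimal NNUE-style model (illustrative dimensions): the two halves of the
# first layer share weights and are concatenated with torch.cat -- the
# "fork" merge that is awkward to express in plain Keras.

class NNUE(nn.Module):
    def __init__(self, n_features=41024, acc=256):
        super().__init__()
        self.ft = nn.Linear(n_features, acc)   # shared feature transformer
        self.l1 = nn.Linear(2 * acc, 32)
        self.l2 = nn.Linear(32, 32)
        self.out = nn.Linear(32, 1)

    def forward(self, us, them):
        # us/them: feature vectors from each side's perspective
        a = torch.cat([self.ft(us), self.ft(them)], dim=1)   # the concatenation
        a = torch.clamp(a, 0, 1)                             # clipped ReLU
        a = torch.clamp(self.l1(a), 0, 1)
        a = torch.clamp(self.l2(a), 0, 1)
        return self.out(a)

net = NNUE()
us, them = torch.zeros(1, 41024), torch.zeros(1, 41024)
print(net(us, them))   # one dummy forward pass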

Despite the fact that using these frameworks is (in the end) quite easy, I failed to train strong networks.

At one point, I wondered if my training data was large enough (115 million unique positions), so I decided to enrich it (it is now composed of 360 million unique positions). I then wondered if the loss function used (sum of squared errors) favored "high scores" (in absolute value) too much, meaning that the optimisation tried to reduce the error on these "extreme" values (when the board is unbalanced) to the detriment of more balanced positions.

Around this time, I saw Vivien's post announcing Minic 3 and discovered the "Seer" engine. I looked at the training code and saw that the loss function was based on a sigmoid. I wondered whether this was linked to the fact that training is performed against game results (0-1, 1/2-1/2, 1-0). I did not spend time on that, because I then understood that using such a function could help reduce my problem of "favored high scores".
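
To illustrate the idea (a generic sketch, not Seer's exact code): squash both the predicted and the target scores through a sigmoid before taking the squared error, so the same 100 cp mistake costs much more near equality than in a completely won position. The SCALE constant is illustrative:

Code: Select all

import torch

# Generic sigmoid-squashed loss: mapping centipawn scores to a pseudo win
# probability compresses extreme values, so errors on balanced positions
# are no longer dominated by errors on unbalanced ones.

SCALE = 400.0   # illustrative centipawn-to-sigmoid scale

def sigmoid_mse(pred_cp, target_cp):
    p = torch.sigmoid(pred_cp / SCALE)     # ~win probability of the prediction
    t = torch.sigmoid(target_cp / SCALE)   # ~win probability of the target
    return torch.mean((p - t) ** 2)

# Same 100 cp error, very different cost depending on where it happens:
print(sigmoid_mse(torch.tensor([100.]), torch.tensor([0.])))     # near equality
print(sigmoid_mse(torch.tensor([1000.]), torch.tensor([900.])))  # already won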

This failed, again.

And then... this morning... I discovered the "two-month-old" bug. The type of bug that is so small and so... stupid. While refactoring my "experimental" NNUE code (the one used for the 0.7 NNUE version of Orion), at some point I swapped two parameters in a function call: the color and the position of the last captured piece on the board... :?

And now... it works! I'm currently running a tournament with a "young" network (only two days old, the training is still ongoing), and it seems to reach ~2850 Elo. The current C implementation of the evaluation in Orion is quite straightforward and not optimised (the training produces floats, so I now need to decide whether to use intrinsics for float dot products, or to convert the floats to integers and reuse my previous experimental code, although that may lose some precision due to quantization).
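
To make that trade-off concrete, here is a small illustrative sketch (not my actual scheme) of converting trained float weights to int16 with a fixed-point scale, and of the rounding error that shows up in a dot product:

Code: Select all

import numpy as np

# Illustrative float-to-int16 quantization of trained weights: integers
# allow fast SIMD dot products, at the cost of a small rounding error.

SCALE = 64   # illustrative fixed-point scale (weight 1.0 -> 64)

w_float = np.random.default_rng(0).normal(0, 0.5, 256).astype(np.float32)
w_int = np.clip(np.round(w_float * SCALE), -32768, 32767).astype(np.int16)

x = np.ones(256, dtype=np.float32)        # dummy input
exact = float(w_float @ x)                # float dot product
approx = int(w_int.astype(np.int32) @ x.astype(np.int32)) / SCALE

print(f"exact={exact:.4f}  quantized={approx:.4f}  error={abs(exact-approx):.4f}")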

So where am I ?
- I have a neural network trainer that can actually train neural networks, with good results!
- the trainer reuses the NNUE architecture (the same one used in the Orion NNUE experiment, in Stockfish 12, and now in other engines)
- the training is performed on positions encountered in CCRL games (40/2 and 40/15 time controls), using nn-82215d0fd0df.nnue evaluations as targets (not game results)

Where do I want to go (ideally) ?
- try other architectures (I already tried some, actually, but as they all failed, probably due to my bug, I have to restart all these experiments...)
- try to see how to perform the training on game results instead of the evaluations of an existing NNUE network, to cut the link with Stockfish's evaluation

Until now, the idea was simply to successfully train a neural network reusing Stockfish's NNUE architecture. Now, I want to build and train my "own" neural network, with the sole goal of improving the strength of the current "official" Orion.

The road is still long!
alex67a
Posts: 50
Joined: Mon Sep 10, 2018 10:15 am
Location: Denmark
Full name: Alexander Spence

Re: Orion 0.7 : NNUE experiment

Post by alex67a »

Good luck, David :wink: