The next revolution in computer chess?

Rebel · Post by **Rebel** » Thu Jul 23, 2020 5:13 pm

I am inclined to believe so.

AB engines going NN.

I wrote a short page just to give the new technique more attention.

http://rebel13.nl/download/stockfish-nnue.html

schack · Post by **schack** » Thu Jul 23, 2020 5:24 pm

Looks good. You may want to give direct links to the individual bin files, though?

jmartus · Post by **jmartus** » Thu Jul 23, 2020 5:45 pm

Thank you for making the process a lot easier; I wasn't sure where to find the exe.

mclane · Post by **mclane** » Thu Jul 23, 2020 8:30 pm

As it seems the human programmer gets more and more pruned away.

Time to strike back.

matejst · Post by **matejst** » Thu Jul 23, 2020 9:58 pm

Here, we should give Jonathan Rosenthal some credit: he implemented an NN in an AB engine a year ago, before all this NNUE buzz. I think he is still one step ahead, combining different nets for particular parts of the evaluation. Jonathan can probably explain it better than I can, but I think he combines nets with handcrafted elements in the evaluation function.

Of course, all of this remained without the deserved attention because Winter's search is far from SF's.

dkappe · Post by **dkappe** » Thu Jul 23, 2020 10:19 pm

mclane wrote: ↑Thu Jul 23, 2020 8:30 pm As it seems the human programmer gets more and more pruned away.

Time to strike back.

I think the picture is more nuanced. Right now a hand tuned eval function at some search depth is approximated by the nnue. More data == better approximation. Up until now only stockfish’s eval has been used, most commonly at depth 8. But the training software will accept text data and convert it into its training format. So it’s possible to generate training data with other engines.

I’ve been working on generating nets with other engines and will be releasing some of them shortly. One thing that’s occurred to me is that old, slow engines with good evals might benefit quite a bit from conversion to nnue. The new competition might be finding better evals for training a nnue. So, dust off all the brilliant ideas you shelved because they were impractical and would reduce your nps by an order of magnitude. Their time may have come.

Rebel · Post by **Rebel** » Thu Jul 23, 2020 11:08 pm

schack wrote: ↑Thu Jul 23, 2020 5:24 pm Looks good. You may want to give direct links to the individual bin files, though?

The Sergio site daily offers one or two new nets, I found that more important than direct downloads.

chrisw · Post by **chrisw** » Thu Jul 23, 2020 11:31 pm

dkappe wrote: ↑Thu Jul 23, 2020 10:19 pm
mclane wrote: ↑Thu Jul 23, 2020 8:30 pm As it seems the human programmer gets more and more pruned away.

Time to strike back.
I think the picture is more nuanced. Right now a hand tuned eval function at some search depth is approximated by the nnue. More data == better approximation. Up until now only stockfish’s eval has been used, most commonly at depth 8. But the training software will accept text data and convert it into its training format. So it’s possible to generate training data with other engines.

I’ve been working on generating nets with other engines and will be releasing some of them shortly. One thing that’s occurred to me is that old, slow engines with good evals might benefit quite a bit from conversion to nnue. The new competition might be finding better evals for training a nnue. So, dust off all the brilliant ideas you shelved because they were impractical and would reduce your nps by an order of magnitude. Their time may have come.

Okay, I will play contrarian devil’s advocate here. You won’t find a better evaluation for training than the actual game result, and for every training position you know that already. But in practice training uses a meld of game result and, let’s say, SF eval, to train on. Apart from “because it works”, why? Serious question.

Next, what are the grounds for the assumption that using something other than SF (you argue a different knowledge base) will generate different/better? You could argue that a “more speculative” engine will get you something to train on “closer to the result”, but you already know the result, so why not just increase the game result weight in the training data?

It does seem intuitively obvious that training on a particular style should produce that kind of style engine. But, is that actually so? I’m not so sure.

dkappe · Post by **dkappe** » Fri Jul 24, 2020 12:44 am

chrisw wrote: ↑Thu Jul 23, 2020 11:31 pm
Okay, I will play contrarian devil’s advocate here. You won’t find a better evaluation for training than the actual game result, and for every training position you know that already. But in practice training uses a meld of game result and, let’s say, SF eval, to train on. Apart from “because it works”, why? Serious question.

Next, what are the grounds for the assumption that using something other than SF (you argue a different knowledge base) will generate different/better? You could argue that a “more speculative” engine will get you something to train on “closer to the result”, but you already know the result, so why not just increase the game result weight in the training data?

It does seem intuitively obvious that training on a particular style should produce that kind of style engine. But, is that actually so? I’m not so sure.

Most of my experience training chess neural networks has been on the leela chess zero project. An NNUE is a very different beast: a much smaller, simpler network that, through clever design and programming, can run very fast on CPU. I’m new to training NNUEs, so most of what I say is speculation, but informed speculation.

On the point of different data == different play (better can be determined by playing games), since a nnue approximates a real valued function over chess positions, a different function, like Toga II’s eval, will produce different data and a different approximated function. Then the question will be, does the stockfish search influence style (through razoring, etc.)? No idea. You’ll be able to find out yourself in a few days.
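A toy sketch of the "different function, different approximation" point, assuming nothing about the real NNUE trainer: the same small linear model is fitted by least squares to two hypothetical teacher evals (`teacher_a` and `teacher_b` are invented for illustration and are not Stockfish's or Toga II's eval). The features, a material count and a mobility count, are also assumptions.

```python
def fit_linear(samples, targets):
    """Least-squares fit of target ~ w0*x0 + w1*x1 via the 2x2 normal equations."""
    s00 = sum(x0 * x0 for x0, _ in samples)
    s01 = sum(x0 * x1 for x0, x1 in samples)
    s11 = sum(x1 * x1 for _, x1 in samples)
    t0 = sum(x0 * t for (x0, _), t in zip(samples, targets))
    t1 = sum(x1 * t for (_, x1), t in zip(samples, targets))
    det = s00 * s11 - s01 * s01
    return ((t0 * s11 - t1 * s01) / det, (s00 * t1 - s01 * t0) / det)


# Hypothetical positions described by (material difference, mobility difference).
SAMPLES = [(1, 0), (0, 1), (2, 3), (-1, 2), (1, -2)]


def teacher_a(material, mobility):
    # Hypothetical eval that barely values mobility (centipawns).
    return 100 * material + 2 * mobility


def teacher_b(material, mobility):
    # Hypothetical eval that values mobility five times as much.
    return 100 * material + 10 * mobility


weights_a = fit_linear(SAMPLES, [teacher_a(m, mob) for m, mob in SAMPLES])
weights_b = fit_linear(SAMPLES, [teacher_b(m, mob) for m, mob in SAMPLES])
print(weights_a, weights_b)
```

The same positions and the same model end up with different weights purely because the labels came from a different teacher. Whether that carries over to a recognisable playing style once the search is involved is exactly the open question.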

On the point of training on score vs result (in the leela world we call it Q and Z), in leela chess we originally trained just on Z (result), then started mixing in score (win percentage, or Q). While it did seem to help speed up the training, there is no consensus on whether the end result is stronger for it.

Now most of the nnue nets have copied the approach from shogi: first train exclusively on Q from a large amount of relatively cheap data, then sharpen the net on a smaller amount of expensive data with 70% Q and 30% Z. Now part of the challenge with Z is that in order to generate diverse positions, the game generation plays games with the occasional random move thrown in, especially king moves. As you can imagine, the more random moves are thrown in, the more suspect Z becomes. There are similar artifacts in leela’s training games.
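The 70% Q / 30% Z mix can be sketched as a per-position training target. This is a minimal illustration, not the trainer's actual code: the function names, the logistic conversion from centipawns to an expected score, and the `scale` constant of 400 are all assumptions.

```python
import math


def cp_to_expected_score(cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to an expected score in (0, 1).

    The logistic shape and the scale of 400 are assumptions for illustration.
    """
    return 1.0 / (1.0 + math.exp(-cp / scale))


def blended_target(q: float, z: float, q_weight: float = 0.7) -> float:
    """Mix search score q and game result z, both as expected scores in [0, 1].

    q_weight=0.7 gives the 70% Q / 30% Z blend; q_weight=1.0 trains on Q only.
    """
    if not 0.0 <= q_weight <= 1.0:
        raise ValueError("q_weight must be in [0, 1]")
    return q_weight * q + (1.0 - q_weight) * z


# A position the search scores as dead equal, taken from a game that was won.
q = cp_to_expected_score(0.0)   # 0.5
z = 1.0                         # win = 1.0, draw = 0.5, loss = 0.0
target = blended_target(q, z)
print(target)
```

Raising the weight on Z, as chrisw suggests, amounts to lowering `q_weight`; the catch described above is that Z gets noisier the more random moves the game generation injects.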

We’ll know more in a few weeks.

Raphexon · Post by **Raphexon** » Fri Jul 24, 2020 1:18 am

chrisw wrote: ↑Thu Jul 23, 2020 11:31 pm
dkappe wrote: ↑Thu Jul 23, 2020 10:19 pm
mclane wrote: ↑Thu Jul 23, 2020 8:30 pm As it seems the human programmer gets more and more pruned away.

Time to strike back.
I think the picture is more nuanced. Right now a hand tuned eval function at some search depth is approximated by the nnue. More data == better approximation. Up until now only stockfish’s eval has been used, most commonly at depth 8. But the training software will accept text data and convert it into its training format. So it’s possible to generate training data with other engines.

I’ve been working on generating nets with other engines and will be releasing some of them shortly. One thing that’s occurred to me is that old, slow engines with good evals might benefit quite a bit from conversion to nnue. The new competition might be finding better evals for training a nnue. So, dust off all the brilliant ideas you shelved because they were impractical and would reduce your nps by an order of magnitude. Their time may have come.
Okay, I will play contrarian devil’s advocate here. You won’t find a better evaluation for training than the actual game result, and for every training position you know that already. But in practice training uses a meld of game result and, let’s say, SF eval, to train on. Apart from “because it works”, why? Serious question.

Next, what are the grounds for the assumption that using something other than SF (you argue a different knowledge base) will generate different/better? You could argue that a “more speculative” engine will get you something to train on “closer to the result”, but you already know the result, so why not just increase the game result weight in the training data?

It does seem intuitively obvious that training on a particular style should produce that kind of style engine. But, is that actually so? I’m not so sure.

Because it won't. At most, the NN will play in an anti-[WhatIt'sTrainedOnStyle] way.
Antifish versus Stockfish was fireworks, but Antifish didn't play like Stockfish.

But it's still an interesting experiment either way.
