The next revolution in computer chess?

Rebel
Posts: 5810
Joined: Thu Aug 18, 2011 10:04 am

The next revolution in computer chess?

Post by Rebel » Thu Jul 23, 2020 3:13 pm

I am inclined to believe so.

AB engines going NN.

I wrote a short page just to give the new technique more attention.

http://rebel13.nl/download/stockfish-nnue.html
90% of coding is debugging, the other 10% is writing bugs.

schack
Posts: 155
Joined: Thu May 27, 2010 1:32 am

Re: The next revolution in computer chess?

Post by schack » Thu Jul 23, 2020 3:24 pm

Looks good. You may want to give direct links to the individual bin files, though?

jmartus
Posts: 241
Joined: Sun May 16, 2010 12:50 am

Re: The next revolution in computer chess?

Post by jmartus » Thu Jul 23, 2020 3:45 pm

Thank you for making the process a lot easier; I wasn't sure where to find the exe.

mclane
Posts: 18388
Joined: Thu Mar 09, 2006 5:40 pm
Location: US of Europe, germany
Full name: Thorsten Czub

Re: The next revolution in computer chess?

Post by mclane » Thu Jul 23, 2020 6:30 pm

As it seems the human programmer gets more and more pruned away.

Time to strike back.
What seems like a fairy tale today may be reality tomorrow.
Here we have a fairy tale of the day after tomorrow....

matejst
Posts: 228
Joined: Mon May 14, 2007 6:20 pm
Full name: Boban Stanojević

Re: The next revolution in computer chess?

Post by matejst » Thu Jul 23, 2020 7:58 pm

Here, we should give Jonathan Rosenthal some credit: he implemented an NN in an AB engine a year ago, before all this NNUE buzz. I think he is still one step ahead, combining different nets for particular parts of the evaluation. Jonathan can probably explain it better than I can, but I think he combines nets with handcrafted elements in the evaluation function.

Of course, all of this has not received the attention it deserves, because Winter's search is far behind SF's.

dkappe
Posts: 857
Joined: Tue Aug 21, 2018 5:52 pm
Full name: Dietrich Kappe

Re: The next revolution in computer chess?

Post by dkappe » Thu Jul 23, 2020 8:19 pm

mclane wrote:
Thu Jul 23, 2020 6:30 pm
As it seems the human programmer gets more and more pruned away.

Time to strike back.
I think the picture is more nuanced. Right now a hand-tuned eval function at some search depth is approximated by the NNUE. More data == better approximation. Up until now only Stockfish’s eval has been used, most commonly at depth 8. But the training software will accept text data and convert it into its training format. So it’s possible to generate training data with other engines.
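
A minimal sketch of the data-generation idea described above, in Python. Everything here is illustrative: `material_eval_cp` is a crude stand-in for a real engine's search score (it is not Stockfish's eval and does no search), and the semicolon-separated record layout is a made-up format, not the one the actual training software reads.

```python
# Hypothetical piece values in centipawns; a real engine's eval is far richer.
PIECE_VALUES = {"p": 100, "n": 300, "b": 300, "r": 500, "q": 900, "k": 0}


def material_eval_cp(fen: str) -> int:
    """Stand-in evaluator: material balance in centipawns, from White's side.

    Only the piece-placement field of the FEN (the part before the first
    space) is inspected. Uppercase letters are White pieces, lowercase Black.
    """
    placement = fen.split()[0]
    score = 0
    for ch in placement:
        if ch.isalpha():
            value = PIECE_VALUES[ch.lower()]
            score += value if ch.isupper() else -value
    return score


def make_record(fen: str, result: float, evaluate=material_eval_cp) -> str:
    """Format one text training record as 'FEN;score;result'.

    `result` is the game outcome from White's side (1.0, 0.5 or 0.0).
    `evaluate` is any callable mapping a FEN string to a centipawn score,
    so a different engine can be plugged in without touching the format.
    """
    return f"{fen};{evaluate(fen)};{result}"


if __name__ == "__main__":
    positions = [
        ("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1", 0.5),
        ("rnb1kbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1", 1.0),
    ]
    for fen, result in positions:
        print(make_record(fen, result))
```

Swapping `material_eval_cp` for a call into a different engine is the point of the exercise: the net learns to approximate whatever function labelled the positions.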

I’ve been working on generating nets with other engines and will be releasing some of them shortly. One thing that’s occurred to me is that old, slow engines with good evals might benefit quite a bit from conversion to NNUE. The new competition might be finding better evals for training an NNUE. So, dust off all the brilliant ideas you shelved because they were impractical and would reduce your nps by an order of magnitude. Their time may have come.

Rebel
Posts: 5810
Joined: Thu Aug 18, 2011 10:04 am

Re: The next revolution in computer chess?

Post by Rebel » Thu Jul 23, 2020 9:08 pm

schack wrote:
Thu Jul 23, 2020 3:24 pm
Looks good. You may want to give direct links to the individual bin files, though?
The Sergio site offers one or two new nets daily; I found that more important than direct downloads.

chrisw
Posts: 3874
Joined: Tue Apr 03, 2012 2:28 pm

Re: The next revolution in computer chess?

Post by chrisw » Thu Jul 23, 2020 9:31 pm

dkappe wrote:
Thu Jul 23, 2020 8:19 pm
I think the picture is more nuanced. Right now a hand-tuned eval function at some search depth is approximated by the NNUE. More data == better approximation. Up until now only Stockfish’s eval has been used, most commonly at depth 8. But the training software will accept text data and convert it into its training format. So it’s possible to generate training data with other engines.

I’ve been working on generating nets with other engines and will be releasing some of them shortly. One thing that’s occurred to me is that old, slow engines with good evals might benefit quite a bit from conversion to NNUE. The new competition might be finding better evals for training an NNUE. So, dust off all the brilliant ideas you shelved because they were impractical and would reduce your nps by an order of magnitude. Their time may have come.
Okay, I will play contrarian devil’s advocate here. You won’t find a better evaluation for training than the actual game result, and for every training position you know that already. But in practice, training uses a meld of game result and, let’s say, SF eval. Apart from “because it works”, why? Serious question.

Next, what are the grounds for the assumption that using something other than SF (you argue a different knowledge base) will generate something different or better? You could argue that a “more speculative” engine will get you something to train on “closer to the result”, but you already know the result, so why not just increase the game result weight in the training data?

It does seem intuitively obvious that training on a particular style should produce an engine with that style. But is that actually so? I’m not so sure.

dkappe
Posts: 857
Joined: Tue Aug 21, 2018 5:52 pm
Full name: Dietrich Kappe

Re: The next revolution in computer chess?

Post by dkappe » Thu Jul 23, 2020 10:44 pm

chrisw wrote:
Thu Jul 23, 2020 9:31 pm

Okay, I will play contrarian devil’s advocate here. You won’t find a better evaluation for training than the actual game result, and for every training position you know that already. But in practice, training uses a meld of game result and, let’s say, SF eval. Apart from “because it works”, why? Serious question.

Next, what are the grounds for the assumption that using something other than SF (you argue a different knowledge base) will generate something different or better? You could argue that a “more speculative” engine will get you something to train on “closer to the result”, but you already know the result, so why not just increase the game result weight in the training data?

It does seem intuitively obvious that training on a particular style should produce an engine with that style. But is that actually so? I’m not so sure.
Most of my experience training chess neural networks has been on the Leela Chess Zero project. An NNUE is a very different beast: a much smaller, simpler network that, by dint of clever design and programming, can run very fast on CPU. I’m new to training NNUEs, so most of what I say is speculation, but informed speculation.

On the point of different data == different play (whether it is better can be determined by playing games): since an NNUE approximates a real-valued function over chess positions, a different function, like Toga II’s eval, will produce different data and a different approximated function. Then the question is whether the Stockfish search influences style (through razoring, etc.). No idea. You’ll be able to find out yourself in a few days.

On the point of training on score vs. result (in the Leela world we call them Q and Z): in Leela chess we originally trained just on Z (result), then started mixing in score (win percentage, or Q). While it did seem to help speed up the training, there is no consensus on whether the end result is stronger for it.

Most of the NNUE nets have copied the approach from shogi: first train exclusively on Q from a large amount of relatively cheap data, then sharpen the net on a smaller amount of expensive data with 70% Q and 30% Z. Part of the challenge with Z is that, in order to generate diverse positions, the game generation plays games with the occasional random move thrown in, especially king moves. As you can imagine, the more random moves are thrown in, the more suspect Z becomes. There are similar artifacts in Leela’s training games.
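
The 70% Q / 30% Z mix mentioned above can be sketched as a blended training target. This is an illustration only, not the trainer's actual code: the logistic conversion from centipawns to an expected score, its `scale=400.0` constant, the clamping limit, and the function names are all assumptions.

```python
import math


def score_to_expected(score_cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to an expected game score in [0, 1].

    Uses a logistic curve. The scale constant is an assumed value chosen
    for illustration, not one taken from any particular trainer.
    """
    # Clamp extreme scores (e.g. mate scores) so math.exp cannot overflow.
    clamped = max(-10000.0, min(10000.0, score_cp))
    return 1.0 / (1.0 + math.exp(-clamped / scale))


def blended_target(score_cp: float, result: float, q_weight: float = 0.7) -> float:
    """Blend search score (Q) and game result (Z) into one training target.

    `result` is the game outcome for the side the score refers to
    (1.0 win, 0.5 draw, 0.0 loss). `q_weight` is the share given to Q;
    the remainder goes to Z. The default of 0.7 mirrors the 70/30 split.
    """
    if not 0.0 <= q_weight <= 1.0:
        raise ValueError("q_weight must be between 0 and 1")
    q = score_to_expected(score_cp)
    return q_weight * q + (1.0 - q_weight) * result


if __name__ == "__main__":
    # Same position score, three different game outcomes.
    for z in (1.0, 0.5, 0.0):
        print(z, blended_target(150, z))
```

With `q_weight=1.0` this reduces to training purely on Q, as in the first phase described above; lowering it gives the game result more influence on the target.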

We’ll know more in a few weeks.

Raphexon
Posts: 369
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: The next revolution in computer chess?

Post by Raphexon » Thu Jul 23, 2020 11:18 pm

chrisw wrote:
Thu Jul 23, 2020 9:31 pm
Okay, I will play contrarian devil’s advocate here. You won’t find a better evaluation for training than the actual game result, and for every training position you know that already. But in practice, training uses a meld of game result and, let’s say, SF eval. Apart from “because it works”, why? Serious question.

Next, what are the grounds for the assumption that using something other than SF (you argue a different knowledge base) will generate something different or better? You could argue that a “more speculative” engine will get you something to train on “closer to the result”, but you already know the result, so why not just increase the game result weight in the training data?

It does seem intuitively obvious that training on a particular style should produce an engine with that style. But is that actually so? I’m not so sure.
Because it won't. At most, the NN will play in an anti-[WhatIt'sTrainedOnStyle] way.
Antifish against Stockfish was fireworks, but it didn't play like Stockfish.

But it's still an interesting experiment either way.
