Transhuman Chess with NN and RL...

smatovic
Posts: 3020
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Transhuman Chess with NN and RL...

Post by smatovic »

Heyho, some 2 cents on this...

Some people argue that the art of writing a chess engine lies in the evaluation function. A programmer gets into the expert knowledge of the domain of chess and encodes this via evaluation terms in his engine. We had the division between chess advisor and chess programmer, and with speedy computers our search algorithms were able to reach super-human level chess and outperform any human. We developed automatic tuning methods for the values of our evaluation functions, but now, with Neural Networks and Reinforcement Learning present, I wish to point out that we have entered another kind of level; I call it trans-human level chess. If we look at the game of Go this seems pretty obvious: I recall one master naming the play of A0 "Go from another dimension".
A super-human level engine still relies on handcrafted evaluation terms that humans come up with (and that then get tuned), but a Neural Network is able to encode evaluation terms humans simply do not come up with, to 'see' relations and patterns we cannot see, which are beyond our scope, trans-human; and the Reinforcement Learning technique discovers lines which are as yet uncommon for humans, trans-human. As mentioned, pretty obvious for Go, less obvious for chess, but still applicable.
NNs replacing the evaluation function is just one part of the game; people will come up with NN-based pruning, move selection, reduction and extension. What is left is the search algorithm, and we already saw the successful mix of NNs with MCTS and of classic eval with MCTS, so I am pretty sure we will see different kinds of mixtures of already known (search) techniques and upcoming NN techniques. Summing the above up: the switch is now from encoding the expert knowledge of chess in evaluation terms to encoding the knowledge into NNs and using them in a search algorithm. That is what the paradigm shift since A0, Lc0 and recently NNUE is about, and that is the shift to what I call trans-human chess.
NNs are also called 'black boxes' cos we cannot decode what the layers of weights represent in a human-readable form, so I see here some room for the classic approach: can we decode the black box and express the knowledge via handcrafted evaluation terms in our common programming languages? Currently NNs outperform human expert systems in many domains, this is not chess or Go specific, but maybe the time for the question of reasoning will come, a time to decode the black boxes, or maybe the black box will decode itself, yet another level; time will tell.
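
To make the "knowledge goes into the evaluator, the search stays" idea concrete, here is a minimal sketch of a negamax alpha-beta search that takes the evaluator as a parameter. Everything in it is an assumption for illustration: the toy tree, the node names and both score tables are made up, and `network_eval` is only a lookup table standing in for a trained network.

```python
from typing import Callable, Dict, List

# Toy game tree: each node maps to its children; leaves have no entry.
# The tree and both evaluators below are made up for illustration.
TREE: Dict[str, List[str]] = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
}

Evaluator = Callable[[str], float]  # score from the side to move's view


def negamax(node: str, depth: int, alpha: float, beta: float,
            evaluate: Evaluator) -> float:
    """Plain negamax with alpha-beta pruning; the evaluator is a parameter."""
    children = TREE.get(node, [])
    if depth == 0 or not children:
        return evaluate(node)
    best = float("-inf")
    for child in children:
        score = -negamax(child, depth - 1, -beta, -alpha, evaluate)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: the opponent will avoid this line
    return best


# Stand-in for a handcrafted evaluation: a fixed table of leaf scores.
HCE_TABLE = {"a1": 3.0, "a2": 5.0, "b1": -2.0, "b2": 9.0}


def handcrafted_eval(node: str) -> float:
    return HCE_TABLE.get(node, 0.0)


# Stand-in for a network: same interface, different numbers.
NN_TABLE = {"a1": 1.0, "a2": 4.0, "b1": 6.0, "b2": 7.0}


def network_eval(node: str) -> float:
    return NN_TABLE.get(node, 0.0)


INF = float("inf")
hce_score = negamax("root", 2, -INF, INF, handcrafted_eval)
nn_score = negamax("root", 2, -INF, INF, network_eval)
```

The search code is identical in both calls; only the evaluator differs. On this toy tree the root score is 3.0 with the handcrafted table and 6.0 with the stand-in network table.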

--
Srdja
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Transhuman Chess with NN and RL...

Post by Gerd Isenberg »

Nice conclusion on the recent developments leading to transhuman chess. The question of reasoning is a huge problem, in particular in fields such as autonomous driving and the like.
jefk
Posts: 838
Joined: Sun Jul 25, 2010 10:07 pm
Location: the Netherlands
Full name: Jef Kaan

Re: Transhuman Chess with NN and RL...

Post by jefk »

Like Gerd, I think Mr. smatovic made a good point regarding the new
sorts of search methods. To be clear, we now have some 3 methods
(yes, it's a bit apples and oranges, coz I mix search and eval,
but I want to present a broad overview of current developments,
for my 2 cts worth):

1) alpha-beta (still going strong) + some extra tricks (null move, extensions,
whatever, all the super tricks which were invented in earlier SF)
2) MCTS (e.g. Komodo MCTS)
3) NN (Leela/Lc0 using MCTS, but SF NNUE apparently not (yet))

Now... as we know, MCTS isn't good (at least not perfect) at tactics,
whereas alpha-beta is by definition perfect (assuming the eval is good)
but not so deep. Improve it with an NN, like SF NNUE, and you
get a strong(er) engine. However, still with the horizon effect,
no planning strategy, etc. So... how to combine MCTS and alpha-beta?

Personally I have no idea (yet). One method sometimes discussed
here is to simply use some CPU threads for the alpha-beta (with NN)
and some other threads for the MCTS. Another method is (was?)
combining 2 engines, but the algorithmic problem then remains
the same: how to select the best thread (or engine), depending
on the position. IMHO you can't simply go by the resulting eval
and then gamble on one of the two; there should be an overall
judgement algorithm (with possibly further search) to
really get the best of both worlds (just a suggestion).
Some work for some (of the best) programmers maybe
:)
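
One possible reading of such an "overall judgement algorithm", as a rough sketch only: when the two searchers disagree, neither self-reported score is trusted, and both candidate moves are re-checked by the same verifier so that the numbers are comparable. The names `Proposal`, `arbitrate` and `verify`, as well as the moves and scores, are hypothetical and not taken from any engine.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Proposal:
    move: str
    score: float  # centipawns, from the side to move's view


# verify(move) -> score after a deeper check of that single move.
# In a real setup this would be a further search; here it is a placeholder.
Verifier = Callable[[str], float]


def arbitrate(ab: Proposal, mcts: Proposal, verify: Verifier) -> str:
    """Pick a move from two searchers; re-check both when they disagree."""
    if ab.move == mcts.move:
        return ab.move
    # Disagreement: the two scores come from different searchers and are not
    # directly comparable, so both candidates go through the same verifier.
    ab_checked = verify(ab.move)
    mcts_checked = verify(mcts.move)
    return ab.move if ab_checked >= mcts_checked else mcts.move


# Made-up verification results for a single position.
deeper_scores: Dict[str, float] = {"Nf3": 20.0, "g4": 65.0}

choice = arbitrate(
    Proposal("Nf3", 35.0),   # alpha-beta's candidate and its own score
    Proposal("g4", 10.0),    # MCTS's candidate and its own score
    deeper_scores.__getitem__,
)
```

Here the alpha-beta searcher reported the higher score, yet the arbiter ends up choosing the MCTS move because the common verifier rates it higher. How much extra search the verifier should get is exactly the open question.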
PS of course this is only if you want to research into the really
best chess possible (on which we will hit a limit in a few years, I
guess); as you may (or may not) know I also have an interest in
human chess, playing styles, and openings (etc.). But from
the absolute top chess programs we mortal humans can still
sometimes learn how to improve our own chess.
(Personally I'm still attracted to the idea of having
a sharpness/imbalance indicator for positions, which,
depending on your style (and/or that of your opponent), a human
chess player may find useful in choosing an opening
or middle game strategy.)
PS2 chessdb.cn now only seems to display xiangqi,
but it got boring anyway, coz most 'best' lines
were leading to equal positions anyway. They can improve
with 1) using SF NNUE, and 2) using 3 parameters in a valued
approach:
a) end eval (of the opening position)
b) statistics (from a huge base, 2300+ comp and corresp games included)
c) a 'sharpness' indicator (yes, it existed in some Fritz versions;
how they did it I dunno, but nowadays the chess.com
post mortem analysis also (verbally) indicates if a game was 'sharp'
or 'wild').
Most of my games on chess.com seem to usually be 'wild',
but that's another story (LOL)
:)
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Transhuman Chess with NN and RL...

Post by Madeleine Birchfield »

smatovic wrote: Fri Oct 30, 2020 9:01 am Some people argue that the art of writing a chess engine lies in the evaluation function. A programmer gets into the expert knowledge of the domain of chess and encodes this via evaluation terms in his engine. We had the division between chess advisor and chess programmer, and with speedy computers our search algorithms were able to reach super-human level chess and outperform any human.
smatovic wrote: Fri Oct 30, 2020 9:01 am NNs replacing the evaluation function is just one part of the game, people will come up with NN based pruning, move selection, reduction and extension. What is left is the search algorithm, and we already saw the successful mix of NNs with MCTS and classic eval with MCTS, so I am pretty sure we will see different kinds of mixtures of already known (search) techniques and upcoming NN techniques.
Handcrafted evaluations still have a place in computer chess, if only as an accelerant in neural network training. Training a neural network on a very strong handcrafted evaluation, such as that of Stockfish or Komodo, will result in a much stronger network than training the same neural network on a poor handcrafted evaluation, or from zero or random. This could be seen with Minic, whose Nascent Nutrient net, trained on its own evaluation function, is much weaker than the Napping Nexus net, trained on Stockfish's evaluation function. And because training networks on Stockfish's evaluation is somewhat frowned upon in this community, this would motivate people to improve their own handcrafted evaluation function.
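
A minimal sketch of what "training a network on a handcrafted evaluation" means, under heavy assumptions: the teacher is a made-up material-only evaluation, the student is a single linear layer, and the features are material differences. A real NNUE net is far larger, non-linear and uses different input features; the point here is only that the labels come from the teacher, so the student cannot learn more than the teacher knows.

```python
import random

# Hypothetical teacher: a tiny handcrafted evaluation over material
# differences (pawn, knight, bishop, rook), in pawn units.
# The piece values are illustrative, not taken from any engine.
TEACHER_WEIGHTS = [1.0, 3.0, 3.25, 5.0]


def handcrafted_eval(features):
    return sum(w * x for w, x in zip(TEACHER_WEIGHTS, features))


rng = random.Random(0)  # fixed seed so the run is repeatable
positions = [[rng.randint(-2, 2) for _ in range(4)] for _ in range(200)]
targets = [handcrafted_eval(p) for p in positions]  # labels from the teacher

# Student: one linear layer, trained by full-batch gradient descent on the
# mean squared error against the teacher's scores.
student = [0.0] * 4
learning_rate = 0.05
for _ in range(500):
    grad = [0.0] * 4
    for features, target in zip(positions, targets):
        error = sum(w * x for w, x in zip(student, features)) - target
        for i, x in enumerate(features):
            grad[i] += error * x / len(positions)
    student = [w - learning_rate * g for w, g in zip(student, grad)]
```

After training, `student` is numerically close to `TEACHER_WEIGHTS`. Swapping in a weaker teacher gives a correspondingly weaker student, which is the effect described above for the two Minic nets.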
smatovic
Posts: 3020
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Transhuman Chess with NN and RL...

Post by smatovic »

DeepMind switched from Supervised Learning on human games in AlphaGo to pure
Reinforcement Learning in A0 cos it was superior. In computer chess we already
have good HCE and a much smaller game tree complexity, and the NNUE networks
are currently smaller than the Lc0 ones, so the gain from the switch to RL in
NNUE might not be as big as in Go, but I guess that at some point it will pay
off. The alternative is that HCE and NN become a kind of binary star, both
advancing with each other.

--
Srdja
Madeleine Birchfield
Posts: 512
Joined: Tue Sep 29, 2020 4:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Transhuman Chess with NN and RL...

Post by Madeleine Birchfield »

The other thing is that the neural networks in recent alpha-beta engines are different from those in Leela or AlphaZero: they are trained with supervised learning to predict what the handcrafted evaluation will be in a certain position, rather than the win/draw/loss percentages of the game.
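
The two kinds of training target can be put side by side in a small sketch. The logistic mapping from centipawns to an expected score and its `scale` constant are assumptions chosen for illustration, not values from Stockfish, Leela or any other engine, and the example position is made up.

```python
import math


def expected_score_from_cp(cp: float, scale: float = 400.0) -> float:
    """Map a centipawn score to an expected score in [0, 1].

    The logistic shape and the scale constant are assumptions for this sketch.
    """
    return 1.0 / (1.0 + math.exp(-cp / scale))


def outcome_to_score(result: str) -> float:
    """Game result from the side to move's view as a training label."""
    return {"win": 1.0, "draw": 0.5, "loss": 0.0}[result]


# One made-up training position: the evaluation says +120 cp for the side to
# move, but the game it came from was eventually lost.
eval_label = expected_score_from_cp(120.0)  # target when learning the eval
outcome_label = outcome_to_score("loss")    # target when learning the result
```

A net trained on `eval_label` learns to imitate the evaluation, including its mistakes; a net trained on `outcome_label` learns from what actually happened in the game, at the price of much noisier labels per position.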
smatovic
Posts: 3020
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Transhuman Chess with NN and RL...

Post by smatovic »

Another point: for some things RL makes no sense. If you have 7-man endgame
tablebases available, you can use those as a shortcut.
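
As a sketch of that shortcut, with everything hypothetical: positions are written as plain strings of piece letters, `TABLEBASE` is a two-entry dictionary standing in for a real tablebase, and `learned_eval` is whatever learned estimate is used elsewhere. No real probing library is used here.

```python
from typing import Callable, Dict

MAX_TB_PIECES = 7  # tablebases cover positions with at most this many men

# Hypothetical stand-in for a tablebase: position -> exact expected score
# for the side to move (1.0 win, 0.5 draw, 0.0 loss).
TABLEBASE: Dict[str, float] = {"KQk": 1.0, "Kk": 0.5}


def evaluate(position: str, learned_eval: Callable[[str], float]) -> float:
    """Return the exact tablebase value if available, else the learned one."""
    if len(position) <= MAX_TB_PIECES:  # one letter per piece in this sketch
        exact = TABLEBASE.get(position)
        if exact is not None:
            return exact
    return learned_eval(position)


def dummy_learned_eval(position: str) -> float:
    return 0.7  # placeholder for a network's estimate


endgame_value = evaluate("KQk", dummy_learned_eval)
middlegame_value = evaluate("KQRRBBNNkqrrbbnn", dummy_learned_eval)
```

For the position found in the table the exact value is used and the learned estimate is never consulted; with more than seven men the call falls through to the learned evaluation.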

--
Srdja