So, how many of you are working on neural networks for chess?

Discussion of chess software programming and technical issues.

Moderators: hgm, Harvey Williamson, bob


So, how many of you are working on neural networks for chess?

Poll ended at Mon Mar 04, 2019 6:03 pm

Yep - i am in
0
No votes
No - nothing for me
2
100%
Hm - still waiting for...
0
No votes
Total votes: 2

Daniel Shawul
Posts: 3707
Joined: Tue Mar 14, 2006 10:34 am
Location: Ethiopia
Contact:

Re: So, how many of you are working on neural networks for chess?

Post by Daniel Shawul » Sat Feb 02, 2019 3:15 pm

jorose wrote:
Sat Feb 02, 2019 1:38 pm
Daniel Shawul wrote:
Sat Feb 02, 2019 6:00 am
jdart wrote:
Sat Feb 02, 2019 2:41 am
I think NN engines, even on higher-end commercial graphics cards, still blunder at a pretty high rate.

I think there is probably room for a hybrid approach where maybe the NN is suggesting moves to a deep searcher, but I am not aware of anyone pursuing that.

Personally I have a pretty long to-do list for my non-NN engine and I am not planning to drop everything and start building a NN.

--Jon
My NN engine never makes a blunder, because I spend 20% of the time doing a multipv search to calculate scores for the root moves,
and then combine them with the MCTS scores as 0.2 * ABscore + 0.8 * MCTSscore. That way the selection of moves at the root is biased
by the alpha-beta prior score, and that makes it avoid almost all blunders.
How is ScorpioNN coming along?
I have added policy networks now, which seem to have helped quite a bit.
They seem to add a lot of positional knowledge, besides making my inference faster (I used to do a qsearch for the policy before).

Also, I have added support for Leela nets now that I am doing policy, so with Leela's ID 32742 network
it should be able to beat Stockfish sometimes :)

Daniel
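The root-move blending quoted above can be sketched in a few lines. This is an illustrative toy with hypothetical names, not the actual ScorpioNN code; it only shows the shape of the 0.2 * ABscore + 0.8 * MCTSscore combination.

```python
# Hypothetical sketch of blending alpha-beta and MCTS root scores.
# Not the real ScorpioNN implementation; names are made up.

AB_WEIGHT = 0.2
MCTS_WEIGHT = 0.8

def blended_root_scores(ab_scores, mcts_scores):
    """ab_scores, mcts_scores: dicts mapping move -> score in [0, 1]."""
    return {
        move: AB_WEIGHT * ab_scores[move] + MCTS_WEIGHT * mcts_scores[move]
        for move in ab_scores
    }

def pick_root_move(ab_scores, mcts_scores):
    """Select the root move with the best combined score."""
    scores = blended_root_scores(ab_scores, mcts_scores)
    return max(scores, key=scores.get)
```

The alpha-beta term acts as a prior: a move the searcher has proven tactically bad gets dragged down even if MCTS visits rated it well.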

jorose
Posts: 240
Joined: Thu Jan 22, 2015 2:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: So, how many of you are working on neural networks for chess?

Post by jorose » Sat Feb 02, 2019 3:38 pm

Daniel Shawul wrote:
Sat Feb 02, 2019 3:15 pm
jorose wrote:
Sat Feb 02, 2019 1:38 pm
Daniel Shawul wrote:
Sat Feb 02, 2019 6:00 am
jdart wrote:
Sat Feb 02, 2019 2:41 am
I think NN engines, even on higher-end commercial graphics cards, still blunder at a pretty high rate.

I think there is probably room for a hybrid approach where maybe the NN is suggesting moves to a deep searcher, but I am not aware of anyone pursuing that.

Personally I have a pretty long to-do list for my non-NN engine and I am not planning to drop everything and start building a NN.

--Jon
My NN engine never makes a blunder, because I spend 20% of the time doing a multipv search to calculate scores for the root moves,
and then combine them with the MCTS scores as 0.2 * ABscore + 0.8 * MCTSscore. That way the selection of moves at the root is biased
by the alpha-beta prior score, and that makes it avoid almost all blunders.
How is ScorpioNN coming along?
I have added policy networks now, which seem to have helped quite a bit.
They seem to add a lot of positional knowledge, besides making my inference faster (I used to do a qsearch for the policy before).

Also, I have added support for Leela nets now that I am doing policy, so with Leela's ID 32742 network
it should be able to beat Stockfish sometimes :)

Daniel
Awesome! I am really looking forward to seeing how well your approach works!
-Jonathan

jackd
Posts: 25
Joined: Mon Dec 10, 2018 1:45 pm
Full name: jack d.

Re: So, how many of you are working on neural networks for chess?

Post by jackd » Sat Feb 02, 2019 6:52 pm

brianr wrote:
Sat Feb 02, 2019 3:11 am
jorose wrote:
Fri Feb 01, 2019 11:50 pm
One of the main reasons I am not working on an NN based engine is that I feel I don't have the resources. I think very few people have the resources to seriously train and test a NN based engine.
Reinforcement learning based NN engines are even worse in that regard.

I don't understand how it matters whether Leela or SF is stronger. It depends on the conditions anyway.
My resources include a two-year-old GTX 1070 GPU. With that I was able to train (supervised learning) a 10x128 Leela net from the "standard CCRL" dataset (link below). It took about a week, and the net is competitive with Crafty on between 2 and 4 CPUs, so about 2,900 Elo. It is not as serious as the larger Leela project, but learning enough to do it has been quite rewarding.

http://blog.lczero.org/2018/09/a-standard-dataset.html
Could you summarize how Leela trains from PGNs?

DustyMonkey
Posts: 43
Joined: Wed Feb 19, 2014 9:11 pm

Re: So, how many of you are working on neural networks for chess?

Post by DustyMonkey » Sat Feb 02, 2019 7:14 pm

jdart wrote:
Sat Feb 02, 2019 2:41 am
I think NN engines, even on higher-end commercial graphics cards, still blunder at a pretty high rate.
It's not the NN evaluation that is the problem. It's the Monte Carlo tree search, and that cannot be fixed with tweaks to the evaluation or to the MCTS itself.

The solution is to stop using MCTS. It is a provably inferior tree search algorithm. The fact that it is well suited to parallelization has not removed, and will not remove, its fundamental flaw, and this is while it is already given orders of magnitude more resources than anything else.

jackd
Posts: 25
Joined: Mon Dec 10, 2018 1:45 pm
Full name: jack d.

Re: So, how many of you are working on neural networks for chess?

Post by jackd » Sat Feb 02, 2019 7:28 pm

DustyMonkey wrote:
Sat Feb 02, 2019 7:14 pm
jdart wrote:
Sat Feb 02, 2019 2:41 am
I think NN engines, even on higher-end commercial graphics cards, still blunder at a pretty high rate.
It's not the NN evaluation that is the problem. It's the Monte Carlo tree search, and that cannot be fixed with tweaks to the evaluation or to the MCTS itself.

The solution is to stop using MCTS. It is a provably inferior tree search algorithm. The fact that it is well suited to parallelization has not removed, and will not remove, its fundamental flaw, and this is while it is already given orders of magnitude more resources than anything else.
MCTS, being best-first, spends more time on the best-looking moves at the expense of bad-looking ones, similar to humans and unlike alpha-beta. This may pay off when you have a very strong static eval and/or a policy network.
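A minimal UCT-style selection step shows where this best-first behavior comes from. This is an illustrative sketch, not taken from any particular engine: the child with the best exploitation-plus-exploration score gets the next visit, so promising moves soak up most of the simulations.

```python
import math

def uct_select(children, c_uct=1.4):
    """Pick a child index by UCT. children: list of (visits, total_value)."""
    parent_visits = sum(v for v, _ in children) or 1

    def score(child):
        visits, total_value = child
        if visits == 0:
            return float("inf")           # always try unvisited moves once
        q = total_value / visits          # exploitation: mean value so far
        u = c_uct * math.sqrt(math.log(parent_visits) / visits)
        return q + u                      # exploration bonus shrinks with visits

    return max(range(len(children)), key=lambda i: score(children[i]))
```

Because the exploration bonus decays as a move accumulates visits, a clearly better move keeps winning the selection and the search tree grows deepest under it.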

jorose
Posts: 240
Joined: Thu Jan 22, 2015 2:21 pm
Location: Zurich, Switzerland
Full name: Jonathan Rosenthal

Re: So, how many of you are working on neural networks for chess?

Post by jorose » Sat Feb 02, 2019 9:14 pm

jackd wrote:
Sat Feb 02, 2019 7:28 pm
DustyMonkey wrote:
Sat Feb 02, 2019 7:14 pm
jdart wrote:
Sat Feb 02, 2019 2:41 am
I think NN engines, even on higher-end commercial graphics cards, still blunder at a pretty high rate.
It's not the NN evaluation that is the problem. It's the Monte Carlo tree search, and that cannot be fixed with tweaks to the evaluation or to the MCTS itself.

The solution is to stop using MCTS. It is a provably inferior tree search algorithm. The fact that it is well suited to parallelization has not removed, and will not remove, its fundamental flaw, and this is while it is already given orders of magnitude more resources than anything else.
MCTS, being best-first, spends more time on the best-looking moves at the expense of bad-looking ones, similar to humans and unlike alpha-beta. This may pay off when you have a very strong static eval and/or a policy network.
"Provably inferior"? Could you elaborate this? I am aware of a proof that it converges on a mini-max optimal solution and I could imagine that it might not be the most efficient at reaching this, but there is a big difference between those two options.

"MCTS, being best-first, spends more time on the best looking moves, at the expense of bad looking ones, similar to humans and unlike alpha-beta." Except that Alpha-Beta definitely also does this. Even without the heavy pruning and reductions found in all modern top 100 programs, simply by virtue of aspiration bounds getting tighter for moves later in the move list and refutations simplifying search, an AB engine will spend far more time on a variation it percieves as good compared to one not expected to be good.

I wouldn't say either search algorithm is similar to what humans do and I'd argue both have their strengths and weaknesses.
-Jonathan

grahamj
Posts: 35
Joined: Thu Oct 11, 2018 12:26 pm
Full name: Graham Jones

Re: So, how many of you are working on neural networks for chess?

Post by grahamj » Sat Feb 02, 2019 9:50 pm

DustyMonkey wrote:
Sat Feb 02, 2019 7:14 pm

It's not the NN evaluation that is the problem. It's the Monte Carlo tree search, and that cannot be fixed with tweaks to the evaluation or to the MCTS itself.

The solution is to stop using MCTS. It is a provably inferior tree search algorithm. The fact that it is well suited to parallelization has not removed, and will not remove, its fundamental flaw, and this is while it is already given orders of magnitude more resources than anything else.
The tree search algorithm used by AlphaZero and LeelaZero is deterministic and not well suited to parallelization. I don't know of anyone using a true Monte Carlo algorithm for chess.
Graham Jones, www.indriid.com

jackd
Posts: 25
Joined: Mon Dec 10, 2018 1:45 pm
Full name: jack d.

Re: So, how many of you are working on neural networks for chess?

Post by jackd » Sat Feb 02, 2019 10:01 pm

jorose wrote:
Sat Feb 02, 2019 9:14 pm
jackd wrote:
Sat Feb 02, 2019 7:28 pm
DustyMonkey wrote:
Sat Feb 02, 2019 7:14 pm
jdart wrote:
Sat Feb 02, 2019 2:41 am
I think NN engines, even on higher-end commercial graphics cards, still blunder at a pretty high rate.
It's not the NN evaluation that is the problem. It's the Monte Carlo tree search, and that cannot be fixed with tweaks to the evaluation or to the MCTS itself.

The solution is to stop using MCTS. It is a provably inferior tree search algorithm. The fact that it is well suited to parallelization has not removed, and will not remove, its fundamental flaw, and this is while it is already given orders of magnitude more resources than anything else.
MCTS, being best-first, spends more time on the best-looking moves at the expense of bad-looking ones, similar to humans and unlike alpha-beta. This may pay off when you have a very strong static eval and/or a policy network.
"Provably inferior"? Could you elaborate this? I am aware of a proof that it converges on a mini-max optimal solution and I could imagine that it might not be the most efficient at reaching this, but there is a big difference between those two options.

"MCTS, being best-first, spends more time on the best looking moves, at the expense of bad looking ones, similar to humans and unlike alpha-beta." Except that Alpha-Beta definitely also does this. Even without the heavy pruning and reductions found in all modern top 100 programs, simply by virtue of aspiration bounds getting tighter for moves later in the move list and refutations simplifying search, an AB engine will spend far more time on a variation it percieves as good compared to one not expected to be good.

I wouldn't say either search algorithm is similar to what humans do and I'd argue both have their strengths and weaknesses.
In PVS, all moves are searched to the same depth, no matter how good they appear. Any move but the first can be dismissed early by a zero-window search, but only if that search proves it is no better than the PV at the same depth.
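This zero-window scheme can be sketched in a toy PVS routine. It is illustrative only (no move ordering, transposition table, reductions, or quiescence, which every real engine adds); the game tree is passed in via callbacks so the search logic stands alone.

```python
# Toy Principal Variation Search (negamax form). Every move is searched
# to the same depth; moves after the first get a zero-window probe first,
# and only a probe that fails high triggers a full-window re-search.

def pvs(pos, depth, alpha, beta, evaluate, moves_of, make):
    if depth == 0:
        return evaluate(pos)
    first = True
    for move in moves_of(pos):
        child = make(pos, move)
        if first:
            score = -pvs(child, depth - 1, -beta, -alpha, evaluate, moves_of, make)
            first = False
        else:
            # Zero-window probe: can this move beat alpha at all?
            score = -pvs(child, depth - 1, -alpha - 1, -alpha, evaluate, moves_of, make)
            if alpha < score < beta:
                # Probe failed high: re-search with the full window.
                score = -pvs(child, depth - 1, -beta, -alpha, evaluate, moves_of, make)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # cutoff: this line is refuted, stop searching siblings
    return alpha
```

The probe with window (alpha, alpha+1) is cheap because almost every node in it cuts off immediately; that is exactly the "provably no better than the PV" test described above.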

jackd
Posts: 25
Joined: Mon Dec 10, 2018 1:45 pm
Full name: jack d.

Re: So, how many of you are working on neural networks for chess?

Post by jackd » Sat Feb 02, 2019 10:19 pm

grahamj wrote:
Sat Feb 02, 2019 9:50 pm
DustyMonkey wrote:
Sat Feb 02, 2019 7:14 pm

It's not the NN evaluation that is the problem. It's the Monte Carlo tree search, and that cannot be fixed with tweaks to the evaluation or to the MCTS itself.

The solution is to stop using MCTS. It is a provably inferior tree search algorithm. The fact that it is well suited to parallelization has not removed, and will not remove, its fundamental flaw, and this is while it is already given orders of magnitude more resources than anything else.
The tree search algorithm used by AlphaZero and LeelaZero is deterministic and not well suited to parallelization. I don't know of anyone using a true Monte Carlo algorithm for chess.
MCTS consists of doing the following while there is time left: perform a rollout, then update the tree. Alpha-beta and PVS, unlike minimax and MCTS, must search one child entirely before continuing to the next, at every level of the tree. This is especially bad for parallelization, whereas there is little harm in doing multiple rollouts at the same time.
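That loop has a very small shape, sketched below as a toy (a real engine would add a "virtual loss" so that several rollouts can run concurrently without piling onto the same leaf). The node representation and callback names here are made up for illustration.

```python
import time

def mcts(root, select_leaf, evaluate_leaf, time_budget=0.05):
    """Toy MCTS driver. root: dict with 'visits' and 'value_sum' keys.
    select_leaf(root) -> (path, leaf); evaluate_leaf(leaf) -> value."""
    deadline = time.monotonic() + time_budget
    while time.monotonic() < deadline:
        path, leaf = select_leaf(root)       # selection (+ expansion)
        value = evaluate_leaf(leaf)          # rollout or NN evaluation
        for node in reversed(path):          # backpropagation
            node["visits"] += 1
            node["value_sum"] += value
            value = -value                   # flip perspective each ply up
    return root
```

Each iteration touches only one root-to-leaf path, which is why parallel workers interfere far less than parallel alpha-beta searchers, whose cutoffs depend on siblings finishing first.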

brianr
Posts: 347
Joined: Thu Mar 09, 2006 2:01 pm

Re: So, how many of you are working on neural networks for chess?

Post by brianr » Sun Feb 03, 2019 12:45 am

jackd wrote:
Sat Feb 02, 2019 6:52 pm
Could you summarize how Leela trains from PGNs?
In terms of the actual net training process, Leela learns pretty much the same way from PGNs. The major practical difference is that the PGN game positions must first be converted into the input bit planes that are fed to the Leela net training code (Python and TensorFlow). Fortunately, the sample dataset (11GB compressed) includes both the PGNs and the bit-plane files (2MM, IIRC, one file per game), so it is much easier. The early versions of the SL code that converts PGN games to the bit-plane format were a bit quirky, although I think there are some forks that might have fixed that. With the sample dataset that part can be skipped.
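For a flavor of what "converting positions into input bit planes" means, here is a deliberately simplified toy encoding. It is NOT Leela's real input format (which uses 112 planes including position history and metadata); this just one-hot encodes piece placement into 12 planes of 8x8, one per (color, piece-type).

```python
# Toy bit-plane encoder, for illustration only. Leela's actual format
# has 112 planes (history, castling, rule-50 counters, etc.).

PIECES = "PNBRQKpnbrqk"  # white pieces uppercase, black lowercase

def to_planes(board):
    """board: dict mapping (rank, file) -> piece letter from PIECES.
    Returns 12 planes of 8x8, each 1 where that piece type sits."""
    planes = [[[0] * 8 for _ in range(8)] for _ in PIECES]
    for (rank, file), piece in board.items():
        planes[PIECES.index(piece)][rank][file] = 1
    return planes
```

The SL converter's job is essentially this, replayed move by move through every PGN game, plus writing the planes out in the exact binary layout the TensorFlow training code expects.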

Generally, learning from self-play games is reinforcement learning (RL), and learning from PGNs is supervised learning (SL). Which PGN games are selected, and the order in which they are fed to the training process, can have a significant impact on the resulting net's strength. SL can train a net faster, but there is some debate about whether the ultimate result is as strong as with RL. The massive crowd-sourced Leela effort is primarily there to provide self-play games for RL, plus some match testing. Unfortunately, early on that was also called "training", which is not the same thing as using TensorFlow to actually train the weights in the net. Quite a few folks are also experimenting with SL.

Post Reply