First public release of Expositor


lithander
Posts: 915
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: First public release of Expositor

Post by lithander »

connor_mcmonigle wrote: Thu Feb 24, 2022 4:14 pm You seem to have some misconception that all engines using neural networks for position evaluation are necessarily homogeneous.
That's a fair point. I honestly wasn't aware that there are radically different architectures all lumped together under the same strange backronym.

How would you suggest someone without a machine learning background get a better picture of the current state of that "space" and where the unexplored frontiers are? Any good starting points besides reading the NNUE wiki page?
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
op12no2
Posts: 553
Joined: Tue Feb 04, 2014 12:25 pm
Location: Gower, Wales
Full name: Colin Jenkins

Re: First public release of Expositor

Post by op12no2 »

@sopel does a great job of maintaining a document here: https://github.com/glinscott/nnue-pytor ... cs/nnue.md and the Openbench, Leela, and SF Discords are great for seeing what people are thinking and trying. There is a rich space of ideas and experiments out there; far richer than HCE has ever been, I would hazard. But obviously there is nothing definitive, as it's still a developing concept.
connor_mcmonigle
Posts: 544
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: First public release of Expositor

Post by connor_mcmonigle »

mvanthoor wrote: Thu Feb 24, 2022 4:38 pm ...
I agree that it's unfortunate TalkChess is in such a sorry state that most engine developers aren't willing to discuss ideas here anymore. Sopel has compiled this nice document which I believe gives a good overview: https://github.com/glinscott/nnue-pytor ... cs/nnue.md. Most chess engine programming discussion does occur on Discord at the moment.
mvanthoor wrote: Thu Feb 24, 2022 4:43 pm ...
AFAIK, Maksim indeed tacked SF12's evaluation on top of BBC; and also AFAIK, SF12 uses NNUE... or at least, a neural network. I understand that Expositor uses its own code and architecture, and that is commendable. Still, it uses third-party data-sets and Stockfish evaluations to create the neural network it uses. This effectively puts Stockfish knowledge into Expositor's network.
You wrote that Maksim's results are the bare minimum one could expect when using a neural network which couldn't be further from the truth given Maksim copied Stockfish 12's highly optimized evaluation function verbatim (using the same weights/inference code).

I agree that it's less than ideal that Expositor's network is trained on SF evaluations and I'd have preferred if the author had developed his own independent dataset. It's clear from the README the author is of the same opinion. Regardless, comparing Expositor to BBC borders on ridiculous given the above. Expositor uses distinct training code, inference code and a distinct topology whilst BBC copies SF 12's evaluation function verbatim.
connor_mcmonigle
Posts: 544
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: First public release of Expositor

Post by connor_mcmonigle »

op12no2 wrote: Thu Feb 24, 2022 5:02 pm @sopel does a great job of maintaining a document here: https://github.com/glinscott/nnue-pytor ... cs/nnue.md and the Openbench, Leela, and SF Discords are great for seeing what people are thinking and trying. There is a rich space of ideas and experiments out there; far richer than HCE has ever been, I would hazard. But obviously there is nothing definitive, as it's still a developing concept.
+1
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: First public release of Expositor

Post by mvanthoor »

connor_mcmonigle wrote: Thu Feb 24, 2022 5:07 pm I agree that it's unfortunate TalkChess is in such a sorry state that most engine developers aren't willing to discuss ideas here anymore. Sopel has compiled this nice document which I believe gives a good overview: https://github.com/glinscott/nnue-pytor ... cs/nnue.md. Most chess engine programming discussion does occur on Discord at the moment.
If most discussions are now on Discord, I'll probably and unfortunately never see them because AFAIK, they're not archived anywhere.
connor_mcmonigle wrote: Thu Feb 24, 2022 5:07 pm You wrote that Maksim's results are the bare minimum one could expect when using a neural network which couldn't be further from the truth given Maksim copied Stockfish 12's highly optimized evaluation function verbatim (using the same weights/inference code).
You keep saying "evaluation function." Doesn't Stockfish 12's evaluation use a neural network already?

And, why would his results not be the minimum? He used an evaluation from a different engine, tacked on top of an engine that, using its own evaluation, managed only 2100 Elo. So he replaced his evaluation with SF's, without touching the search functionality, which was just basic alpha-beta, TT, null move, and maybe one or two other pruning techniques. Therefore I would expect that if you use a more advanced base (more search techniques, etc...) and a neural network optimized and trained for that base, the results could be even better than +850 Elo.
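(To make "without touching the search functionality" concrete, here is a generic negamax alpha-beta sketch in which the evaluation is just a function parameter. The types and methods are hypothetical stand-ins, not BBC's actual code.)

```rust
// Generic sketch of swapping evaluations under an unchanged alpha-beta
// search: the search only calls `eval` at the leaves, so grafting in a
// different evaluation (PSTs, full HCE, a network) touches nothing else.
// `Position` and `legal_moves` are hypothetical stand-ins.

struct Position;
impl Position {
    fn legal_moves(&self) -> Vec<Position> { unimplemented!() }
}

fn alpha_beta(
    pos: &Position,
    depth: u32,
    mut alpha: i32,
    beta: i32,
    eval: &dyn Fn(&Position) -> i32, // the only evaluation hook
) -> i32 {
    if depth == 0 {
        return eval(pos); // swap this function, keep the search
    }
    for child in pos.legal_moves() {
        let score = -alpha_beta(&child, depth - 1, -beta, -alpha, eval);
        if score >= beta {
            return beta; // fail-hard cutoff
        }
        alpha = alpha.max(score);
    }
    alpha
}
```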
I agree that it's less than ideal that Expositor's network is trained on SF evaluations and I'd have preferred if the author had developed his own independent dataset. It's clear from the README the author is of the same opinion. Regardless, comparing Expositor to BBC borders on ridiculous given the above. Expositor uses distinct training code, inference code and a distinct topology whilst BBC copies SF 12's evaluation function verbatim.
True enough. In that regard, Expositor is more than a few steps ahead, because it at least uses its own code and architecture. However, to be able to generate your own data-set and train a neural network on it using your own engine's evaluation, you would need an engine with an HCE first, even if it is only a single set of self-written PSTs on top of a bare alpha-beta search (like the very first version of my engine).

Your first data-set and neural network won't be strong, but it may be stronger than the PST-only HCE. Then you can replace the PST evaluation with this network and get into the "add search pruning feature / generate data-set / retune" loop, but that will be LOTS of work to train your engine that way.

Therefore I prefer to first implement a complete HCE (and even there I fake it a bit, because I bootstrap the tuning of the tapered eval on a third-party data-set) and then make the jump to neural networks in, hopefully, a handful of training/tuning sessions.
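(For concreteness, a PST-only tapered evaluation of the kind described above looks roughly like this. The layout and constants are illustrative, not Rustic's actual code.)

```rust
// Minimal sketch of a PST-only tapered evaluation: each piece type has a
// middlegame and an endgame square table, and the two scores are blended
// by a phase value derived from remaining material.

const MAX_PHASE: i32 = 24; // e.g. 4*minor(1) + 4*rook(2) + 2*queen(4)

struct PstEval {
    mg: [[i32; 64]; 6], // [piece_type][square] middlegame values
    eg: [[i32; 64]; 6], // [piece_type][square] endgame values
}

impl PstEval {
    /// `pieces` lists (piece_type, square, is_white) for every piece.
    /// Returns a score from white's point of view.
    fn evaluate(&self, pieces: &[(usize, usize, bool)], phase: i32) -> i32 {
        let (mut mg, mut eg) = (0i32, 0i32);
        for &(pt, sq, white) in pieces {
            // Mirror the square for black so one table serves both sides.
            let sq = if white { sq } else { sq ^ 56 };
            let sign = if white { 1 } else { -1 };
            mg += sign * self.mg[pt][sq];
            eg += sign * self.eg[pt][sq];
        }
        // Taper: full phase = pure middlegame, zero phase = pure endgame.
        let phase = phase.clamp(0, MAX_PHASE);
        (mg * phase + eg * (MAX_PHASE - phase)) / MAX_PHASE
    }
}
```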
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: First public release of Expositor

Post by dkappe »

connor_mcmonigle wrote: Thu Feb 24, 2022 5:07 pm I agree that it's less than ideal that Expositor's network is trained on SF evaluations and I'd have preferred if the author had developed his own independent dataset. It's clear from the README the author is of the same opinion. Regardless, comparing Expositor to BBC borders on ridiculous given the above. Expositor uses distinct training code, inference code and a distinct topology whilst BBC copies SF 12's evaluation function verbatim.
BBC is a didactic engine in the spirit of VICE, accompanied by a series of YouTube videos explaining chess programming concepts. The author tries a number of experiments, like “what if we graft the SF eval onto BBC?” or “what if we graft SF’s NNUE onto BBC?” If its only purpose were to release the strongest engine possible, then I might agree that these experiments are intellectually void, but its purpose is educational instead, as was his walkthrough of my own didactic mcts/nn engine, a0lite.

So much hate and anger.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".
connor_mcmonigle
Posts: 544
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: First public release of Expositor

Post by connor_mcmonigle »

mvanthoor wrote: Thu Feb 24, 2022 5:28 pm ...
connor_mcmonigle wrote: Thu Feb 24, 2022 5:07 pm You wrote that Maksim's results are the bare minimum one could expect when using a neural network which couldn't be further from the truth given Maksim copied Stockfish 12's highly optimized evaluation function verbatim (using the same weights/inference code).

You keep saying "evaluation function." Doesn't Stockfish 12's evaluation use a neural network already?
...
Correct.
mvanthoor wrote: Thu Feb 24, 2022 5:28 pm ...
And, why would his results not be the minimum? He used an evaluation from a different engine, tacked on top of an engine that, using its own evaluation, managed only 2100 Elo. So he replaced his evaluation with SF's, without touching the search functionality, which was just basic alpha-beta, TT, null move, and maybe one or two other pruning techniques. Therefore I would expect that if you use a more advanced base (more search techniques, etc...) and a neural network optimized and trained for that base, the results could be even better than +850 Elo.
...
It's very unclear to me why you think this is the case. If I copied SF11's evaluation function (prior to NNUE) into a 2100 Elo engine, would it be reasonable to think that's the bare minimum result one could expect from using a hand-crafted evaluation function for position evaluation? Obviously not, as SF11's evaluation function is basically state of the art among hand-crafted evaluation functions. +850 Elo is more like the maximum Elo one could expect to gain (if the network were grafted onto a stronger engine with more advanced search features, the Elo gain would be smaller due to Elo compression).
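(For reference, the logistic expected-score formula behind this compression point; this is the standard Elo model, nothing engine-specific, in a small Rust illustration.)

```rust
// Standard logistic Elo model: expected score of a player rated
// `delta` Elo above the opponent.
fn expected_score(delta: f64) -> f64 {
    1.0 / (1.0 + 10f64.powf(-delta / 400.0))
}

fn main() {
    // At +850 Elo the new version already scores ~99% against the old
    // one, so the measured difference is nearly saturated:
    for delta in [0.0, 200.0, 400.0, 850.0] {
        println!("+{:>4} Elo -> expected score {:.3}", delta, expected_score(delta));
    }
    // prints roughly 0.500, 0.760, 0.909, 0.993
}
```

That saturation is a rough intuition for why the same evaluation change, grafted onto a stronger base, shows up as a smaller measured gain.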
mvanthoor wrote: Thu Feb 24, 2022 5:28 pm ...
True enough. In that regard, Expositor is more than a few steps ahead, because it at least uses its own code and architecture. However, to be able to generate your own data-set and train a neural network on it using your own engine's evaluation, you would need an engine with an HCE first, even if it is only a single set of self-written PSTs on top of a bare alpha-beta search (like the very first version of my engine).

Your first data-set and neural network won't be strong, but it may be stronger than the PST-only HCE. Then you can replace the PST evaluation with this network and get into the "add search pruning feature / generate data-set / retune" loop, but that will be LOTS of work to train your engine that way.

Therefore I prefer to first implement a complete HCE (and even there I fake it a bit, because I bootstrap the tuning of the tapered eval on a third-party data-set) and then make the jump to neural networks in, hopefully, a handful of training/tuning sessions.
A strong HCE is not required to produce training data for a neural network. SlowChess, for example, uses self-play training starting from a randomly initialized network. Seer uses a retrograde learning approach: first a network is trained on EGTB WDL values, and the results are then backed up to higher piece-count positions by way of self-play playouts from N -> N-1 man positions.
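(A sketch of the general self-play bootstrapping loop, for readers unfamiliar with it. Every type and function here is a hypothetical stand-in, not code from SlowChess, Seer, or Expositor.)

```rust
// Self-play bootstrapping: start from a randomly initialized network,
// generate labeled positions by self-play, retrain, and repeat.

struct Network;  // randomly initialized evaluation network
struct Position; // a chess position
struct Sample { pos: Position, outcome: f32 } // outcome in {0.0, 0.5, 1.0}

fn play_game(net: &Network) -> Vec<Sample> {
    // Play an engine-vs-engine game with `net` as the evaluator and
    // label every visited position with the final game result.
    unimplemented!()
}

fn train(net: &mut Network, data: &[Sample]) {
    // Gradient descent on (prediction - outcome)^2 or similar.
    unimplemented!()
}

fn main() {
    let mut net = Network; // random weights: plays near-randomly at first
    for generation in 0..10 {
        let mut data = Vec::new();
        for _ in 0..10_000 {
            data.extend(play_game(&net));
        }
        train(&mut net, &data);
        println!("generation {generation}: retrained on {} samples", data.len());
    }
}
```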
Last edited by connor_mcmonigle on Thu Feb 24, 2022 6:03 pm, edited 2 times in total.
Guenther
Posts: 4718
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: First public release of Expositor

Post by Guenther »

dkappe wrote: Thu Feb 24, 2022 5:34 pm
connor_mcmonigle wrote: Thu Feb 24, 2022 5:07 pm I agree that it's less than ideal that Expositor's network is trained on SF evaluations and I'd have preferred if the author had developed his own independent dataset. It's clear from the README the author is of the same opinion. Regardless, comparing Expositor to BBC borders on ridiculous given the above. Expositor uses distinct training code, inference code and a distinct topology whilst BBC copies SF 12's evaluation function verbatim.
BBC is a didactic engine in the spirit of VICE, accompanied by a series of YouTube videos explaining chess programming concepts. The author tries a number of experiments, like “what if we graft the SF eval onto BBC?” or “what if we graft SF’s NNUE onto BBC?” If its only purpose were to release the strongest engine possible, then I might agree that these experiments are intellectually void, but its purpose is educational instead, as was his walkthrough of my own didactic mcts/nn engine, a0lite.

So much hate and anger.
Where do you see all this 'hate and anger'? I find your posts quite bizarre...
https://rwbc-chess.de

[Trolls don't exist...]
connor_mcmonigle
Posts: 544
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: First public release of Expositor

Post by connor_mcmonigle »

dkappe wrote: Thu Feb 24, 2022 5:34 pm
connor_mcmonigle wrote: Thu Feb 24, 2022 5:07 pm I agree that it's less than ideal that Expositor's network is trained on SF evaluations and I'd have preferred if the author had developed his own independent dataset. It's clear from the README the author is of the same opinion. Regardless, comparing Expositor to BBC borders on ridiculous given the above. Expositor uses distinct training code, inference code and a distinct topology whilst BBC copies SF 12's evaluation function verbatim.
BBC is a didactic engine in the spirit of VICE, accompanied by a series of YouTube videos explaining chess programming concepts. The author tries a number of experiments, like “what if we graft the SF eval onto BBC?” or “what if we graft SF’s NNUE onto BBC?” If its only purpose were to release the strongest engine possible, then I might agree that these experiments are intellectually void, but its purpose is educational instead, as was his walkthrough of my own didactic mcts/nn engine, a0lite.

So much hate and anger.
I've no issue with either Maksim or BBC. I think BBC is a great didactic engine. Where is this proposed hate and anger?
My point was that comparing BBC and Expositor doesn't make a lot of sense.
expositor
Posts: 60
Joined: Sat Dec 11, 2021 5:03 am
Full name: expositor

Re: First public release of Expositor

Post by expositor »

I would also personally welcome it if you didn't stay anonymous. (Though I didn't try hard to check.)
My name is Kade ^_^

While I do value my privacy, this should be visible on the right-hand side of my posts (unless I've misconfigured something), and you can actually see my name in the help and license information printed by Expositor's `help` and `license` commands.
I don't know how I feel about that. If chess programming were only about climbing the Elo ladder as fast as possible, then there'd be no better way than to look at other open source engines' search implementations and to add support for NNUEs asap. NNUEs are especially great for those of us programmers who don't really know enough about chess to write a hand-crafted evaluation. But on the other hand, in what way is the world (even if it's only the chess programming world) a better, richer place for there being yet another NNUE engine on the "market"? And aren't your options for exploring your own evaluation ideas very limited once you have the NNUE evaluation working, because whatever you can come up with is probably going to end up worse?

This is not meant as a personal attack. It's more like a conflict of interests (wanting to have a strong engine vs something personal and unique to tinker with) that I find hard to resolve myself, too.
No worries! I didn't take it as a personal attack. It actually made me laugh a bit, because it's a bit ironic: there were many times I complained to @jtwright about strength being the be-all and end-all, and that we don't understand the effects of most techniques nearly as well as we should. (Why do singular extensions work so well for some but not others? What effect does, say, reverse futility pruning have on move ordering at cut nodes? at PV nodes? or does RFP interact with, say, multicut in a predictable way? Quantitatively, how much does this offset the gains from RFP? What classes of mistakes are associated with different formulas for LMR, and how strong or universal is that correlation? and many more.)

I'm still interested in those questions, which you can see in Expositor's `stat` and `trace` commands. But for myself, I realized that answering those questions was a lifetime endeavour, and that if I waited until I truly understood techniques – to the point I could confidently predict the effect a technique would have on strength, node count at depth, or any other statistic – before I used them, I'd never use them, and I'd never actually have a strong engine. And I did want to have a strong engine! because I wanted to be involved in TCEC, for example. I could just do so concurrently with the other things I was interested in.
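(For anyone unfamiliar with one of the techniques named above: reverse futility pruning in its simplest generic form looks something like the following. The margin and depth limit are illustrative, not Expositor's actual values.)

```rust
// Generic reverse futility pruning (also called static null move pruning):
// if the static eval beats beta by a depth-scaled margin at shallow depth,
// assume the node will fail high and return early without searching.
fn reverse_futility_prune(static_eval: i32, beta: i32, depth: i32) -> bool {
    const MARGIN_PER_PLY: i32 = 100; // roughly a pawn per remaining ply
    depth <= 6 && static_eval - MARGIN_PER_PLY * depth >= beta
}
```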
In what way is the world better by there being yet another "HCE engine on the market"?
Basically, writing chess engines has been a thing for 60 years now. Most techniques in use have been known since the early 70's. There is no need to write your own chess engine except wanting to write your own chess engine.
This is essentially my take. I decided to write an ultra-sparse shallow network (to use a more descriptive term than "nnue") for Expositor because I knew hardly anything about neural networks in general and wanted to learn. Since that was my goal, I personally have no qualms with using the technique.

So I read the original nnue paper, talked with @jtwright, read the Wikipedia article on backprop, sat down with pen and paper and worked out the math so I'd understand it, read this article and a bunch of Q&As on the machine learning Stack Exchange... one day I was skimming this pdf and came across the diagram on page 201. I had to stare at it for a long while before the idea of bank switching finally clicked. It was then, finally, that I wrote the first version of Expositor's network and trainer.

I'm pretty sure that Expositor's current network architecture is actually nearly the same as Koi's (although not as wide, and with an extra hidden layer). This is somewhat coincidental: cross-coupling the upper and lower banks between the input and first layer was something I had wanted to do but didn't think was possible, and seeing Koi do it anyway (or at least thinking that Koi did it) is what triggered the aha moment (also the "I've been an idiot" moment). Perhaps that deserves an acknowledgment in the readme.
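(As a generic illustration of this kind of ultra-sparse shallow network: two incrementally updated perspective banks feeding a small dense network. The sizes and layout here are illustrative, not Expositor's or Koi's actual architecture.)

```rust
// Generic sketch of an "ultra-sparse shallow network" first layer: two
// perspective banks (side to move / side not to move) are maintained
// incrementally, clipped, and concatenated before the dense layers.

const FEATURES: usize = 768; // 2 colors x 6 piece types x 64 squares
const HIDDEN: usize = 256;

struct Accumulator {
    own: [f32; HIDDEN], // bank from the side to move's perspective
    opp: [f32; HIDDEN], // bank from the opponent's perspective
}

struct FirstLayer {
    weights: [[f32; HIDDEN]; FEATURES],
    bias: [f32; HIDDEN],
}

impl FirstLayer {
    // Because inputs are 0/1 and only a few change per move, the layer is
    // updated incrementally: add or subtract one weight row per changed
    // feature instead of recomputing the full matrix-vector product.
    fn add_feature(&self, acc: &mut [f32; HIDDEN], feature: usize) {
        for i in 0..HIDDEN {
            acc[i] += self.weights[feature][i];
        }
    }
    fn remove_feature(&self, acc: &mut [f32; HIDDEN], feature: usize) {
        for i in 0..HIDDEN {
            acc[i] -= self.weights[feature][i];
        }
    }
}

fn clipped_relu(x: f32) -> f32 {
    x.clamp(0.0, 1.0)
}

// The two banks are concatenated (own first, opp second), so the rest of
// the network always sees the position from the mover's point of view.
fn first_layer_output(acc: &Accumulator) -> [f32; 2 * HIDDEN] {
    let mut out = [0.0f32; 2 * HIDDEN];
    for i in 0..HIDDEN {
        out[i] = clipped_relu(acc.own[i]);
        out[HIDDEN + i] = clipped_relu(acc.opp[i]);
    }
    out
}
```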
I hope there will also be a version later with NNs not trained on SF eval. The readme says this should happen sooner or later.
I agree that it's less than ideal that Expositor's network is trained on SF evaluations and I'd have preferred if the author had developed his own independent dataset. It's clear from the README the author is of the same opinion.
Yep! I personally have no qualms with training from Stockfish data, but I agree it's less than ideal.

The reason I started with Stockfish scoring is that I wanted something as close to the ground truth as I could find – that way there'd be one less confounding factor while I experimented with network architecture, network size, choice of gradient descent algorithm, &c. Intentionally giving Expositor a unique personality has (up to this point) been a secondary priority.
Regardless, I appreciate the author's transparency and Expositor seems an interesting engine. I'm excited to see how the author improves it going forwards. Congrats.
Thanks, Connor!
Let me say welcome and congratulations.
Thanks, dkappe!