First public release of Expositor

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

expositor
Posts: 59
Joined: Sat Dec 11, 2021 5:03 am
Full name: Kade

First public release of Expositor

Post by expositor »

Hello everyone!

I'm really excited to share the first public release of Expositor, which you can find here on GitHub.

I probably ought to give some background, so what follows is an introduction to myself and a history of the engine.

I'm close friends with the author of Mantissa and we both began chess programming at the same time, a bit over a year ago. For me it's been a personal hobby – besides talking with @jtwright and sporadically running a Lichess account, I've not really been involved with the computer chess community. But my goals and reasons for chess programming have slowly evolved over time, and I decided somewhat recently that I finally felt ready to engage with more people.

My first engine (retroactively named "Expositor 0") was a little program written in vanilla C. This was before I knew anything about computer chess, so the interface wasn't UCI but ASCII art in the terminal. It only valued material and mobility, searched 3 or 4 ply deep (up to 12 ply in a faux quiescence search), and didn't understand all of the rules, but usually beat me unless I was patient and careful (I'm not very good at chess ^_^). Expo 0 was somewhere in the neighborhood of 1000 to 1400 Elo on Lichess, I'd guess.

I then got a little more serious and wrote Expositor 1 (also in C). She was multithreaded, used variable-shift perfect hashing for move generation, spoke UCI, printed lots of debugging information and formatted tables, and routinely got to nominal depths of 7 or 8. I'm proud of the way her multithreaded search worked: threads started from children of the root node, not the root itself, and so they operated mostly independently; they would communicate cutoffs to each other by reaching into the α/β stacks of other threads while they were running and rewriting the α/β values (all this guarded by locks, of course). The idea is ludicrous, but totally worked. Expo 1 played in the 1600 to 1700 range.
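
In sketch form, the cutoff-sharing looked something like this (a hypothetical Rust reconstruction for illustration – Expo 1 herself was C, and this is not her actual code):

Code: Select all

// Hypothetical sketch, not Expo 1's real code: each worker thread owns a
// lock-guarded stack of (alpha, beta) windows, and a thread that proves a
// bound at the root reaches into its siblings' stacks and tightens theirs.
use std::sync::{Arc, Mutex};
use std::thread;

struct Window { alpha: i32, beta: i32 }

fn main() {
    let stacks: Vec<Arc<Mutex<Vec<Window>>>> = (0..4)
        .map(|_| Arc::new(Mutex::new(vec![Window { alpha: -30_000, beta: 30_000 }])))
        .collect();

    let workers: Vec<_> = (0..4).map(|id| {
        let stacks = stacks.clone();
        thread::spawn(move || {
            // ... search one child of the root ...
            let score = 57; // pretend this thread proved score >= 57 at the root
            for (other, stack) in stacks.iter().enumerate() {
                if other == id { continue; }
                let mut stack = stack.lock().unwrap();
                let root = &mut stack[0];
                if score > root.alpha {
                    root.alpha = score; // rewrite the sibling's window in place;
                }                       // once alpha >= beta, the sibling cuts off
                                        // the next time it reads its own window
            }
        })
    }).collect();
    for w in workers { w.join().unwrap(); }
}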

Then I abandoned Expo 1 and began a different project. I wanted to explore some ideas related to the error distributions of evaluation functions and how those distributions change throughout minimax trees, and to do that I needed to write some tools. I called this collection of tools "Admonitor" and it was, overall, much better written than Expo 1 (and this time in Rust). Admonitor included a simple search, so naturally, I got curious and began playing him against his sister, Expo 1. Pretty soon, I was writing code for specific endgames and trying to improve move ordering and half a dozen other things, and Admonitor turned into something of a puzzle engine. Although he could beat Expo 1 handily, I was adamant about avoiding reductions and refused to consider pruning, which I felt violated the spirit of the project – in my mind, Admonitor was still a research tool that happened to be able to play chess, not a chess engine. That was a pretty severe restriction, and so Admonitor's rating capped out at 1900 or so.

Eventually, I relented and conceded the fact that I wanted to write a proper chess engine, and a strong one at that, and thus Expo 2 was born. Although I still wanted to derive thorough justification for some techniques (more justification than "it gains Elo") and avoid reading other engines' source, I realized that to get where I wanted I needed to be more humble and acquaint myself with the state of the art. @jtwright helped a lot with this – he's fantastically good at reading code and very quick, and a lot of my implementations came out of conversations with him. I owe him a lot of thanks, and also all of you: it's incredible what the chess programming community has accomplished and I am immensely glad that it values libre software and sharing ideas.

Anyway, that's the story. I'm happy to answer questions if anyone has any.
User avatar
lithander
Posts: 881
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: First public release of Expositor

Post by lithander »

I enjoyed the read but have to admit that the first part piqued my interest more than the later part, while you were probably expecting it to be the other way round. (E.g. reaching into the α/β stacks of other threads...?!)
expositor wrote: Thu Feb 24, 2022 8:30 am Eventually, I relented and conceded the fact that I wanted to write a proper chess engine, and a strong one at that, and thus Expo 2 was born. Although I still wanted to derive thorough justification for some techniques (more justification than "it gains Elo") and avoid reading other engines' source, I realized that to get where I wanted I needed to be more humble and acquaint myself with the state of the art.
I don't know how I feel about that. If chess programming were only about climbing the Elo ladder as fast as possible, then there's no better way than to look at other open source engines' search implementations and to add support for NNUEs asap. Especially for us programmers who don't really know enough about chess to write a hand-crafted evaluation, NNUEs are great. But on the other hand, in what way is the world (even if it's only the chess programming world) a better, richer place by there being yet another NNUE engine on the "market"? And aren't your own options to explore your own evaluation ideas very limited once you have the NNUE evaluation working, because whatever you can come up with is probably going to end up worse?

This is not meant as a personal attack. It's more like a conflict of interests (wanting to have a strong engine vs something personal and unique to tinker with) that I find hard to resolve myself, too.
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
Gabor Szots
Posts: 1366
Joined: Sat Jul 21, 2018 7:43 am
Location: Szentendre, Hungary
Full name: Gabor Szots

Re: First public release of Expositor

Post by Gabor Szots »

lithander wrote: Thu Feb 24, 2022 3:25 pm I enjoyed the read but have to admit that the first part piqued my interest more than the later part, while you were probably expecting it to be the other way round. (E.g. reaching into the α/β stacks of other threads...?!)
expositor wrote: Thu Feb 24, 2022 8:30 am Eventually, I relented and conceded the fact that I wanted to write a proper chess engine, and a strong one at that, and thus Expo 2 was born. Although I still wanted to derive thorough justification for some techniques (more justification than "it gains Elo") and avoid reading other engines' source, I realized that to get where I wanted I needed to be more humble and acquaint myself with the state of the art.
I don't know how I feel about that. If chess programming were only about climbing the Elo ladder as fast as possible, then there's no better way than to look at other open source engines' search implementations and to add support for NNUEs asap. Especially for us programmers who don't really know enough about chess to write a hand-crafted evaluation, NNUEs are great. But on the other hand, in what way is the world (even if it's only the chess programming world) a better, richer place by there being yet another NNUE engine on the "market"? And aren't your own options to explore your own evaluation ideas very limited once you have the NNUE evaluation working, because whatever you can come up with is probably going to end up worse?

This is not meant as a personal attack. It's more like a conflict of interests (wanting to have a strong engine vs something personal and unique to tinker with) that I find hard to resolve myself, too.
I share your feelings, Thomas.
Gabor Szots
CCRL testing group
User avatar
Guenther
Posts: 4622
Joined: Wed Oct 01, 2008 6:33 am
Location: Regensburg, Germany
Full name: Guenther Simon

Re: First public release of Expositor

Post by Guenther »

I hope there will also be a version later with NNs not trained on SF eval.

The readme says this should happen sooner or later:

Code: Select all

HCE Bootstrapping
The neural network is currently trained from positions scored with Stockfish.
I'd like to write an evaluator that replicates the personality of early versions of Expositor,
train a network from positions scored by Expositor using that evaluator, then train another
network from positions scored using the previous network, and so on.
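Read literally, that is an iterative self-retraining loop, roughly like this (a hypothetical sketch – every name here is made up, none of it is Expositor's actual code):

Code: Select all

// Hypothetical sketch of the bootstrapping loop described above; all
// types and functions are made-up stand-ins, not Expositor's real code.
#[derive(Clone)]
struct Net; // stand-in for a trained network
struct Pos; // stand-in for a chess position

fn hce_eval(_p: &Pos) -> i32 { 0 }                 // generation 0: "personality" evaluator
fn sample_positions() -> Vec<Pos> { vec![Pos] }    // positions to be scored
fn train(_labeled: &[(Pos, i32)]) -> Net { Net }   // fit a network to the labels
impl Net { fn eval(&self, _p: &Pos) -> i32 { 0 } }

fn bootstrap(generations: usize) -> Net {
    let mut score: Box<dyn Fn(&Pos) -> i32> = Box::new(hce_eval);
    let mut net = Net;
    for _ in 0..generations {
        let labeled: Vec<(Pos, i32)> = sample_positions()
            .into_iter()
            .map(|p| { let s = score(&p); (p, s) }) // score with the current evaluator
            .collect();
        net = train(&labeled);                      // train the next network on those labels
        let n = net.clone();
        score = Box::new(move |p| n.eval(p));       // it becomes the next generation's scorer
    }
    net
}

fn main() { let _final_net = bootstrap(3); }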
I would also personally welcome it if you didn't stay anonymous. (I didn't try hard to check, though.)
OTOH, I am very positive about your transparency regarding your program.
https://rwbc-chess.de

trollwatch:
Talkchess nowadays is a joke - it is full of trolls/idiots/people stuck in the pleistocene > 80% of the posts fall into this category...
connor_mcmonigle
Posts: 543
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: First public release of Expositor

Post by connor_mcmonigle »

lithander wrote: Thu Feb 24, 2022 3:25 pm ...
I don't know how I feel about that. If chess programming were only about climbing the Elo ladder as fast as possible, then there's no better way than to look at other open source engines' search implementations and to add support for NNUEs. Especially for us programmers who don't really know enough about chess to write a hand-crafted evaluation, NNUEs are great. But on the other hand, in what way is the world (even if it's only the chess programming world) a better, richer place by there being yet another NNUE engine on the "market"? And aren't your own options to explore your own evaluation ideas very limited once you have the NNUE evaluation working, because whatever you can come up with is probably going to end up worse?

This is not meant as a personal attack. It's more like a conflict of interests (wanting to have a strong engine vs something personal and unique to tinker with) that I find hard to resolve myself, too.
In what way is the world better by there being yet another "HCE engine on the market"? The reality is that the "HCE" approach to evaluation (a linear model over "handcrafted features") is effectively antiquated and further exploration of such approaches simply isn't worthwhile.

You seem to have some misconception that all engines using neural networks for position evaluation are necessarily homogeneous. The same "handcrafted" features used in HCEs can just as easily be used as input to a neural network, and this approach is actually quite competitive, as demonstrated by Winter.

At the present moment, the most popular architecture (for those not just copying the SF 12/13 architecture) seems to be symmetric 1- or 2-layer networks over PSQT features, made invariant under (vertical mirroring * side-to-move flip) by way of the "half concept". That is a very reasonable starting point, much as "PSQT only" evaluation functions were a reasonable starting point for those interested in writing a "handcrafted" evaluation function. However, this wasn't always the case: the use of the "half concept" for such PSQT networks was only recently popularized, with white-relative networks over PSQT features historically enjoying greater popularity.

This architecture, in turn, is no doubt far from optimal and I'm excited to see what further improvements are uncovered (the incorporation of mobility features and piece-specific subnetworks are both exciting avenues for future exploration, in my opinion). In fact, the author even mentions plans for such future experiments in the README.
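
To make the "half concept" concrete, here is a rough sketch of side-relative PSQT features feeding a one-layer network (my own illustration with made-up sizes, not taken from any engine):

Code: Select all

// Illustrative sketch of "half" (side-relative) PSQT features feeding a
// tiny one-layer network; sizes and layout are invented for clarity.
const HIDDEN: usize = 16;
const FEATURES: usize = 768; // (2 owners x 6 piece types) x 64 squares

struct Net {
    w1: [[f32; HIDDEN]; FEATURES], // one row of first-layer weights per feature
    b1: [f32; HIDDEN],
    w2: [f32; 2 * HIDDEN],         // output layer over [our half, their half]
}

// A (piece, square, owner) feature from one side's point of view: squares
// are vertically mirrored for black and "owner" is relative (us/them),
// which is what makes the evaluation invariant under
// (vertical mirror * side-to-move flip).
fn feature(piece: usize, sq: usize, owned_by_us: bool, white_pov: bool) -> usize {
    let sq = if white_pov { sq } else { sq ^ 56 }; // mirror the ranks for black
    let owner = if owned_by_us { 0 } else { 1 };
    (owner * 6 + piece) * 64 + sq
}

// Score from the side to move's perspective, given each half's active features.
fn evaluate(net: &Net, stm_feats: &[usize], opp_feats: &[usize]) -> f32 {
    let mut halves = [[0f32; HIDDEN]; 2];
    for (half, feats) in halves.iter_mut().zip([stm_feats, opp_feats]) {
        *half = net.b1;
        for &f in feats {
            for i in 0..HIDDEN { half[i] += net.w1[f][i]; } // accumulate active rows
        }
        for v in half.iter_mut() { *v = v.max(0.0); }       // ReLU
    }
    (0..HIDDEN).map(|i| net.w2[i] * halves[0][i] + net.w2[HIDDEN + i] * halves[1][i]).sum()
}

fn main() {
    let net = Net { w1: [[0.0; HIDDEN]; FEATURES], b1: [0.0; HIDDEN], w2: [0.0; 2 * HIDDEN] };
    // e.g. the side to move's own king (piece index 5) on e1 (square 4):
    let f = feature(5, 4, true, true);
    println!("{}", evaluate(&net, &[f], &[]));
}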

Regardless, I appreciate the author's transparency, and Expositor seems an interesting engine. I'm excited to see how the author improves it going forward. Congrats!
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: First public release of Expositor

Post by mvanthoor »

lithander wrote: Thu Feb 24, 2022 3:25 pm I don't know how I feel about that. If chess programming were only about climbing the Elo ladder as fast as possible, then there's no better way than to look at other open source engines' search implementations and to add support for NNUEs. Especially for us programmers who don't really know enough about chess to write a hand-crafted evaluation, NNUEs are great. But on the other hand, in what way is the world (even if it's only the chess programming world) a better, richer place by there being yet another NNUE engine on the "market"? And aren't your own options to explore your own evaluation ideas very limited once you have the NNUE evaluation working, because whatever you can come up with is probably going to end up worse?
I agree with this sentiment. However, there is a BUT... see below.

Maksim Korzh (writer of BBC and various other tutorial engines) has already proven that roughly 2950 Elo is basically the minimum you could expect. After he finished his engine BBC (roughly 2100 Elo), he replaced the hand-crafted evaluation with Stockfish's NNUE and hit 2950 Elo. (The engine doesn't seem to be in the CCRL-list anymore; other engines that used Stockfish's NNUE also seem to have disappeared.)

This means that adding NNUE to a very basic engine will easily net you 850 Elo. The development version of my current engine is at 2160 Elo, but this engine has far fewer features than BBC. Still, adding NNUE could boost this engine to just over 3000 Elo, and I haven't even added any pruning yet. Another example is the new Rebel 14, which adds NNUE on top of Fruit (2685 Elo for version 2.1) and now sits at 3250 Elo, a gain of roughly 560 Elo.

So if you want a strong engine, going the NNUE way is the 'easiest' way of obtaining it, especially when using third-party datasets. Some people have called the newer engines "NNUE players" instead of chess engines, because they "play an evaluation/NNUE CD" on top of a move generator and an alpha-beta algorithm, and have lost interest in chess programming because of this.

Therefore I set myself the following goals with my engine:
1. Add a tapered evaluation to my engine (done in the current version)
2. Write a tuner and tune this evaluation on an existing data-set (in the process of doing this)
3. Add pruning and more search and evaluation features to the engine
4. At some point, generate my own data-set and retune the HCE
5. By iterating 3 and 4, I hope to reach 3000+ Elo on the CCRL-list, on ONE thread
6. Add Lazy SMP
7. Only then generate a data-set for training NNUEs
8. Write my own NNUE from scratch and add it to the engine

BUT... isn't using a third-party dataset for NNUE the same as using a third-party dataset for tapered HCE?

To some extent, it is.

I could have swapped steps 1-3 around as well. To create the engine completely without third-party data-sets (which would be the 'cleanest' solution IMHO), I would have needed to keep using my self-written PSTs from version 1 of the engine and then make it stronger with more features and evaluation terms. Then I'd need to get it up to around 2500-2600 Elo, generate a dataset, and add a tapered evaluation on the basis of that dataset, hoping the result would be stronger. Then add more features... regenerate a dataset, retune, add features, generate a dataset, retune...

That would be a process that will take YEARS to complete, and the engine is already taking too long because I only write code for it on some evenings and at the weekend. Thus I chose to bootstrap the tapered evaluation with a third-party dataset.

The difference is where you "bootstrap" the engine. You can bootstrap it at the lower end, using third-party PSTs or third-party datasets to get the first version of a (tapered) evaluation somewhat optimized, so you don't have to do ALL the work from the very beginning, regenerating and retuning over and over again. (Or worse, tinkering with the PST values yourself in the very first version of your HCE.) You can also bootstrap it at the high end by immediately including NNUE and having a huge third-party dataset scored by Stockfish.

So I can understand why people immediately go for an NNUE engine with a third-party dataset, because it gets you a very strong engine from the start.

That brings me to one of the points Lithander mentioned: "How does such an engine make the world better?"

Basically, writing chess engines has been a thing for 60 years now. Most techniques in use have been known since the early 70's. There is no NEED to write your own chess engine... except....

wanting to write your own chess engine

So that is the reason why I opted to write a version without a tapered evaluation, with my own PSTs, first, so I could go through the process of writing the basic engine from scratch. From there onward, I chose the lowest bootstrap point that uses as little third-party material as possible (in my case, only one third-party data-set, as my engine is not strong enough yet to generate a meaningful data-set on its own) but saves the most work (generate/retune, generate/retune, ad nauseam).

That is my (personal) problem with going for NNUE + third-party data-set + Stockfish scoring from the beginning: you deprive yourself of writing a _chess engine_; instead you write the move generator and search function, and leave the "chess" part to the tuner, data-set, and Stockfish.

In the end my engine will probably (some day) include NNUE, but it will never lose its hand-crafted evaluation. It'll probably have both.
lithander wrote: Thu Feb 24, 2022 3:25 pm I enjoyed the read but have to admit that the first part piqued my interest more than the later part, while you were probably expecting it to be the other way round. (E.g. reaching into the α/β stacks of other threads...?!)
It seems interesting, but I can't fathom how this would work. Most engines nowadays use Lazy SMP, which means that each thread just does its own thing and uses the TT for "communication" purposes. If one thread finds a cutoff, it'll be in the TT, and so other threads have access to it. You don't have to reach into one thread from another. (It would also become a massive web of threads, because each thread would need a list of all other running threads.)
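
In sketch form, that TT-based "communication" amounts to something like this (hypothetical code, not any engine's real implementation – a real engine would use a fixed-size lockless table rather than a mutexed HashMap):

Code: Select all

// Bare-bones illustration of Lazy SMP threads "communicating" only
// through a shared transposition table: every thread searches
// independently and merely probes/stores here.
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

#[derive(Clone, Copy)]
struct TtEntry { depth: u8, score: i32 }

fn main() {
    let tt: Arc<Mutex<HashMap<u64, TtEntry>>> = Arc::new(Mutex::new(HashMap::new()));

    let workers: Vec<_> = (0..4).map(|id| {
        let tt = Arc::clone(&tt);
        thread::spawn(move || {
            let zobrist_key = 0xDEAD_BEEFu64; // pretend hash of some position
            // probe: another thread may already have searched this position deeper
            if let Some(e) = tt.lock().unwrap().get(&zobrist_key).copied() {
                let _ = e.score; // reuse the stored score/bound, possibly cut off
            }
            // ... search the position ...
            let result = TtEntry { depth: 10, score: 42 + id };
            // store: now every other thread can see this result
            tt.lock().unwrap().insert(zobrist_key, result);
        })
    }).collect();
    for w in workers { w.join().unwrap(); }
}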
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
connor_mcmonigle
Posts: 543
Joined: Sun Sep 06, 2020 4:40 am
Full name: Connor McMonigle

Re: First public release of Expositor

Post by connor_mcmonigle »

mvanthoor wrote: Thu Feb 24, 2022 4:22 pm ...
Maksim Korzh (writer of BBC and various other tutorial engines) has already proven that roughly 2950 Elo is basically the minimum you could expect. After he finished his engine BBC (roughly 2100 Elo), he replaced the hand-crafted evaluation with Stockfish's NNUE and hit 2950 Elo. (The engine doesn't seem to be in the CCRL-list anymore; other engines that used Stockfish's NNUE also seem to have disappeared.)

This means that adding NNUE to a very basic engine will easily net you 850 Elo. The development version of my current engine is at 2160 Elo, but this engine has far fewer features than BBC. Still, adding NNUE could boost this engine to just over 3000 Elo, and I haven't even added any pruning yet. Another example is the new Rebel 14, which adds NNUE on top of Fruit (2685 Elo for version 2.1) and now sits at 3250 Elo, a gain of roughly 560 Elo.
...
You seem confused about what's meant by "NNUE" (I really hate that term as it seems to be confusing everyone on this forum :P). Maksim Korzh proved that grafting* Stockfish 12's evaluation function verbatim onto his weak BBC engine gained some 850 Elo (*by way of Daniel Shawul's probing library). That's not what the author of Expositor has done, and such grafting is comparatively trivial as well as intellectually void. Expositor has its own unique training code, inference code, and a distinct architecture: https://github.com/expo-dev/expositor/b ... rc/nnue.rs.
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: First public release of Expositor

Post by mvanthoor »

connor_mcmonigle wrote: Thu Feb 24, 2022 4:14 pm In what way is the world better by there being yet another "HCE engine on the market"? The reality is that the "HCE" (linear model over "handcrafted features") approach to evaluation functions is effectively antiquated and further exploration of such approaches simply isn't worthwhile.
Maybe it's not a field of chess engine programming that will yield revolutionary results anymore, but it does give one a thorough understanding of what needs to go into a chess engine's evaluation. If you start with an NNUE evaluation from scratch, your engine is virtually a black box.

Personally, I much prefer to write my own HCE first and then swap over to NNUE, so that I at least know where the data and the resulting network come from and completely understand those origins.
At the present moment, the most popular architecture...

However, this wasn't always the case: the use of the "half concept" for such PSQT networks was only recently popularized, with white-relative networks over PSQT features historically enjoying greater popularity.

This architecture, in turn, is no doubt far from optimal and I'm excited to see what further improvements are uncovered...
So tell me... where can I read about all of this, current and past developments? Where is all the information?

I asked some questions here in the past, but not very many. The reason is that I was able to find all the information online somewhere, sometimes even in texts written 25 years ago.

The Koivisto author(s) once said that "forums such as Talkchess" are not used for high-end engine development anymore. So what is? Do I have to go and sit in IRC or Discord 24/7, in a 2022 version of the "be there at the right time or miss out" computer chess gatherings of the '70s to '90s? I refuse to do that. I have neither the inclination nor the time. When I get to a certain topic, I want to study it at that moment. Old forum posts, the CPW-wiki, even Stack Overflow and news posts from the '90s can help with that. An IRC or Discord discussion from last week that may not even exist anymore doesn't help.

So when I finally get around to implementing an NNUE, most of the information may already be gone if no-one has bothered to collect it somewhere in a comprehensive fashion. It's actually the reason why I'm writing stuff on my own site, trying to explain each concept used in my engine, with example code directly from the engine itself. I intend the site to be complete enough that, if I (or someone else) want to write another "classical" engine from scratch, all the subjects can be found there, explained in detail (what, why, how), accompanied by a heavily commented, working engine.

I'd love for something like that to exist for NNUE, but AFAIK, it doesn't. (It did, for a few engines in the past, but those sites are now gone or incomplete. There are some YouTube series with that same information though.)
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
User avatar
mvanthoor
Posts: 1784
Joined: Wed Jul 03, 2019 4:42 pm
Location: Netherlands
Full name: Marcel Vanthoor

Re: First public release of Expositor

Post by mvanthoor »

connor_mcmonigle wrote: Thu Feb 24, 2022 4:35 pm You seem confused about what's meant by "NNUE" (I really hate that term as it seems to be confusing everyone on this forum :P).
So enlighten me. Stating that someone is wrong is easy.
Maksim Korzh proved that grafting* Stockfish 12's evaluation function verbatim onto his weak BBC engine gained some 850 Elo (*by way of Daniel Shawul's probing library). That's not what the author of Expositor has done, and such grafting is comparatively trivial as well as intellectually void. Expositor has its own unique training code, inference code, and a distinct architecture: https://github.com/expo-dev/expositor/b ... rc/nnue.rs.
AFAIK, Maksim indeed tacked SF12's evaluation on top of BBC; and also AFAIK, SF12 uses NNUE... or at least, a neural network. I understand that Expositor uses its own code and architecture, and that is commendable. Still, it uses third-party data-sets and Stockfish evaluations to train the neural network it uses. This effectively puts Stockfish knowledge into Expositor's network.
Author of Rustic, an engine written in Rust.
Releases | Code | Docs | Progress | CCRL
dkappe
Posts: 1632
Joined: Tue Aug 21, 2018 7:52 pm
Full name: Dietrich Kappe

Re: First public release of Expositor

Post by dkappe »

Wow, an engine announcement and already a pointless food fight.

Let me say welcome and congratulations.
Fat Titz by Stockfish, the engine with the bodaciously big net. Remember: size matters. If you want to learn more about this engine just google for "Fat Titz".