A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

AndrewGrant
Posts: 871
Joined: Tue Apr 19, 2016 4:08 am
Location: U.S.A
Full name: Andrew Grant

A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by AndrewGrant » Tue Sep 29, 2020 7:33 am

Two years and some months ago, Alpha Zero dropped their initial paper, claiming to have thrashed Stockfish by a damning margin. People were quick to take a side. After the dust settled, I think most agreed that Alpha Zero's conditions were inane, and the result was not indicative of a shift in the times. At the same time, Leela entered the scene. Over the last two years Leela has stayed close to Stockfish in strength, but has increasingly reduced the hardware required to do so.

I was never a fan of the Leela project's appearance on the scene. I thought, and still think, that GPUs vs CPUs is an unfair comparison. Of course, time passed and the Leela team managed to reach similar strength, but without the use of GPUs -- a total repudiation of my original stance. Anyway, more so than Leela, I was concerned about a new wave of engines, all built upon the work of the Leela project. I expected to see a dozen engines, all with only slight variance and nuance added. Blessedly, this never came to be. Leela and Allie are, to me, twins. The other NNs out there don't bear the same relation to the Leela project. They have different trainers, or different datasets, or different structures, or different back-ends. I think, and I hope, there is something special about each of them.

Now, in recent months, a similar series of developments is happening. NNUEs, or rather, I want to be specific, the King-Piece structures used in Stockfish, are flooding the scene. With a few hours' work, an author can copy paste some code from CFish or Stockfish, add the incremental update code, and with some debugging they can begin playing games with Stockfish's Networks. In the last month a half dozen authors have posted their results from using the Stockfish structure/networks -- almost all +200 elo or more.
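
To make "incremental update" concrete, here is a minimal Python sketch. It is not code from Stockfish or CFish. It assumes the first layer is a plain sum of one weight column per active input feature, which is the usual description of this kind of network. The sizes, the random weights, and the names `refresh` and `update` are all made up for illustration.

```python
import random

NUM_FEATURES = 64   # toy size (assumption); a real King-Piece input space is far larger
HIDDEN = 8          # toy accumulator width (assumption)

random.seed(0)
# One weight column per input feature plus a bias vector.
# Random integers stand in for a trained network file.
WEIGHTS = [[random.randint(-5, 5) for _ in range(HIDDEN)] for _ in range(NUM_FEATURES)]
BIAS = [random.randint(-5, 5) for _ in range(HIDDEN)]


def refresh(active):
    """Recompute the accumulator from scratch: bias plus the column of every active feature."""
    acc = list(BIAS)
    for f in active:
        for i in range(HIDDEN):
            acc[i] += WEIGHTS[f][i]
    return acc


def update(acc, removed, added):
    """Return a new accumulator after a move, touching only the features that changed."""
    new = list(acc)
    for f in removed:
        for i in range(HIDDEN):
            new[i] -= WEIGHTS[f][i]
    for f in added:
        for i in range(HIDDEN):
            new[i] += WEIGHTS[f][i]
    return new


before = {3, 17, 42}
after = {3, 17, 50}   # a "move": feature 42 switches off, feature 50 switches on

acc = refresh(before)
acc_inc = update(acc, removed=[42], added=[50])
```

The point of the sketch is only that `acc_inc` matches `refresh(after)` while doing work proportional to the handful of features a move changes, rather than to every piece on the board.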

For the most part in Computer Chess, specific ideas and implementations are not transferable. Sure, we all have the same NMP, LMR, FMP, Probcut, Singular Extensions techniques, but in search if you try a patch pushed to Stockfish this week, it will likely fail. An even better example is evaluation. Evaluation functions are so specific to engines, so fine tuned to the already existing ideas, that virtually all attempts to take an idea from another engine, without significant changes or reworking, are futile.

So it's odd, or to me it is, that someone can plug Stockfish's Network files into their own engine, and share in the same success. In fact, it's not odd, it's concerning. One can treat the Network file as (I could be wrong on the math, I don't know the structure) nearly, if not over, a million tiny evaluation terms. So I ask myself this: I cannot copy idea X from Stockfish's eval into Ethereal. But if I copy a million weights from Stockfish, then my evaluation is so similar to Stockfish, that I then become able to take X.

Well so, obviously Stockfish does not have global claim to the idea of NNs using King-Piece inputs. It's essentially a giant PSQT input, with King-Piece crossing. No one here has the rights to PSQTs -- everyone has one, everyone uses one, no one bats an eye.
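
A rough Python sketch of what "a giant PSQT input, with King-Piece crossing" could mean. The index layout here is an illustrative assumption, not Stockfish's actual feature numbering, and the names `plain_psqt_index` and `king_piece_index` are hypothetical. A plain PSQT is indexed by (piece, square); the crossed version adds the king's square as a third coordinate.

```python
PIECE_TYPES = ["P", "N", "B", "R", "Q"]  # non-king pieces
NUM_SQUARES = 64
SLOTS = len(PIECE_TYPES) * 2             # each piece type, for own side and for opponent


def plain_psqt_index(piece, is_own, square):
    """Index into an ordinary piece-square table: one entry per (piece, colour, square)."""
    slot = PIECE_TYPES.index(piece) * 2 + (0 if is_own else 1)
    return slot * NUM_SQUARES + square


def king_piece_index(king_square, piece, is_own, square):
    """Same table, but with a separate copy for every square the king can stand on."""
    return king_square * (SLOTS * NUM_SQUARES) + plain_psqt_index(piece, is_own, square)


PLAIN_INPUTS = SLOTS * NUM_SQUARES               # 10 * 64 = 640
KING_PIECE_INPUTS = NUM_SQUARES * PLAIN_INPUTS   # 64 * 640 = 40960
```

Under this toy layout there are 640 plain PSQT inputs and 40960 king-crossed inputs per side. How many weights that turns into depends on the width of the layer the inputs feed, which this sketch does not model.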

----

I worry about the future of Computer Chess. I see a timeline where a dozen engines use something very similar to Stockfish's methods. They all shoot up in elo. New developers, people working on their own innovations, are disheartened. They ask themselves why they toil away on their new ideas and tweaks, when one can just embrace the NNUE and be on an equal playing field with the top tiers of engines. So they leave. I leave. Alayan leaves. Many others leave. The result is that there are two engines left; Stockfish, and Leela. I don't find that interesting. Maybe others do I suppose.

At the same time, this is me jumping up and down, waving my hands, saying "Hey, I've been working on Ethereal for 60hrs a week for the last 6 years. I've done all this work, spent all this time and energy. But now if you want to play at Stockfish's level, you just need to download the training code, feed in your evals, wait a few billion clock cycles, and presto, it's done. Why should I bother?"

I released a tuning paper a few months ago. It was the culmination of a year's effort on various implementations, as well as likely over a hundred different methodologies for building datasets. I shared that paper, and I shared pieces of the Ethereal data, about 10 million positions at a time, to all those who asked. I think, two years ago, this would have made a splash. In fact, I still believe that someone could perform the same exercise as I outline, and gain +15 elo or more to Stockfish's static evaluation (pre-NNUE). But now my work is futile?

I built an open source framework that mimics Fishtest, but works for many engines at a time. We support engines of all types. C, C++, Rust, Java, and virtually anything you can compile on two platforms. With the help of Noobpwnftw, we hooked up machines and built a framework for authors to work on their own projects -- but share with others at the same time. I run my tests, and others can see them and tinker. Others run their tests, and I can see them and tinker. It was a venture in facilitating a greater exchange of ideas in Computer Chess. A venture in promoting stronger, but nuanced and diverse engines. But now my work is futile?

I feel that soon it will become clear that I've spent six years to do nothing. Ethereal, unless I too copy paste the NNUEs, will be tossed out and placed on the dustbin of history. I never expected to be at the top -- but I got close. At a time, the 5 strongest engines in the world were {Stockfish, Komodo, Houdini, Fire, Ethereal}. We learned that Houdini and Fire were stolen goods. So at a time Ethereal was #3 -- but still far far far from #2. I could make progress towards tackling Komodo, while gaining ground on the rest of the field. Prior to two months ago, if you asked me, I would tell you that Ethereal would surpass Komodo in two years. Now I can't say that. I won't use Stockfish's methods. I'm adding my own NN ideas to Ethereal, but it bores me. I'm praying that one can beat NNUE King-Piece networks with some brilliant architecture. But my hopes fall each day.

I could be all wrong. I could be out of touch. But if I'm not, then the future of computer chess, the future of unique and diverse engines, depends upon all of us, as individuals, to encourage and promote new ideas while discouraging those who take from Stockfish without trying their hand at the problem. I'm already concerned when I see engines with Stockfish nets being placed onto rating lists.
Last edited by AndrewGrant on Tue Sep 29, 2020 7:59 am, edited 2 times in total.

Raphexon
Posts: 341
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by Raphexon » Tue Sep 29, 2020 7:56 am

Now, in recent months, a similar series of developments is happening. NNUEs, or rather, I want to be specific, the King-Piece structures used in Stockfish, are flooding the scene. With a few hours' work, an author can copy paste some code from CFish or Stockfish, add the incremental update code, and with some debugging they can begin playing games with Stockfish's Networks. In the last month a half dozen authors have posted their results from using the Stockfish structure/networks -- almost all +200 elo or more.
The less disheartening part is that none of the authors have acted like hooking up Sergio's nets to their engine is their own work.

Minic author was very clear that Minic + Sergio NNUE is not Minic. And is currently working on his own implementation.
Rubi's author seemed disheartened and was clear that Rubi+SVnet is not Rubi.
Orion's author was annoyed by CCRL testing his engine with Sergio's net: viewtopic.php?f=6&t=75224

Igel's author is also working on his own implementation, and is currently using a net from Dkappe that mimics his Bad Gyal nets.
Pedone's author also seemed to be working on his own implementation.

Slowchess's author is working on a different NN implementation.

So while I agree that the situation is worrying, I'm not seeing any abuse yet.
With Fire and Houdini being outed I think people have also become a lot more skeptical and will more quickly spot clones in the future.

IQ
Posts: 162
Joined: Thu Dec 17, 2009 9:46 am

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by IQ » Tue Sep 29, 2020 8:30 am

AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
So it's odd, or to me it is, that someone can plug Stockfish's Network files into their own engine, and share in the same success. In fact, it's not odd, it's concerning. One can treat the Network file as (I could be wrong on the math, I don't know the structure) nearly, if not over, a million tiny evaluation terms. So I ask myself this: I cannot copy idea X from Stockfish's eval into Ethereal. But if I copy a million weights from Stockfish, then my evaluation is so similar to Stockfish, that I then become able to take X.
You can "copy" ideas, by reimplementing them in Ethereal. I think everybody agrees that this is perfectly fine and does not violate any "anti-clone" rule. Most here only object to copying code verbatim without adhering to the open source license models. Ideas are fair game, and if SF introduces a new eval term you are free to implement that in Ethereal.
AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
I worry about the future of Computer Chess. I see a timeline where a dozen engines use something very similar to Stockfish's methods. They all shoot up in elo. New developers, people working on their own innovations, are disheartened. They ask themselves why they toil away on their new ideas and tweaks, when one can just embrace the NNUE and be on an equal playing field with the top tiers of engines. So they leave. I leave. Alayan leaves. Many others leave. The result is that there are two engines left; Stockfish, and Leela. I don't find that interesting. Maybe others do I suppose.
That's just the normal cycle of innovation. We have been there with Null-Moves, LMR, Tablebases etc. In fact I would say the opposite is true: Computer Chess was stagnant, with basically everybody using more or less the same search, same general eval terms etc. It's just that SF, through its testing framework and support by many developers, had a more rigorous and sound methodology to gain elo by incremental improvements. When the null-move heuristic first made an impact on the scene it also gained 70-150 Elo, and soon after everybody was using it. There is so much more to be understood about the use of NNs in chess engines in regard to structure, learning, size vs. speed tradeoffs, a/b vs. UCT, that I see no reason to quit in one of the most exciting phases of computer chess.
AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
At the same time, this is me jumping up and down, waving my hands, saying "Hey, I've been working on Ethereal for 60hrs a week for the last 6 years. I've done all this work, spent all this time and energy. But now if you want to play at Stockfish's level, you just need to download the training code, feed in your evals, wait a few billion clock cycles, and presto, it's done. Why should I bother?"
Of course this is all a deeply personal decision. While I feel that the commercial viability of chess-engine development is such that a full-time commitment is not really the smartest thing to do, I fully respect the effort put into a labor of love. But think about this for a moment: You entered the competitive engine dev arena at a time when the established methods were already stretched close to their limit. It is a testament to your skills and effort that you managed to catch up to the leading engines quickly, competing with devs (and teams of devs) who had vastly more experience in these established methods. Now the playing field is much more level: these NN based methods are new for everybody. Putting your effort and skill to use here and now on these new methods might allow you to find a lot of new approaches which might allow you to leap-frog much of what the others are doing. Maybe you come up with a change to the NNUE structure to be better suited for chess, or you find a better learning regime, or you develop a UCT-A/B Hybrid suited for larger networks, or you come up with a multi-net approach for the middle and/or endgame.

And now you want to quit? Seems irrational to me.
AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
I feel that soon it will become clear that I've spent six years to do nothing. Ethereal, unless I too copy paste the NNUEs, will be tossed out and placed on the dustbin of history. I never expected to be at the top -- but I got close.
You produced a chess-playing engine. That's not nothing. You probably learned a lot, competed with the best, had your successes and disappointments. Be proud of that.
AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
I could be all wrong. I could be out of touch. But if I'm not, then the future of computer chess, the future of unique and diverse engines, depends upon all of us, as individuals, to encourage and promote new ideas while discouraging those who take from Stockfish without trying their hand at the problem.
Just embrace the new methodologies. Just like in science there will be paradigm shifts from time to time. First there is reluctance to even acknowledge the new ways, then when the progress is undeniable the self-doubt sets in, but ultimately, as it's new for everybody, it also provides new opportunities. Just join the party and contribute. Ethereal will get better with NNUE type networks, you will learn a lot of new stuff and most likely find some new ideas to push it even further.

Graham Banks
Posts: 34652
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by Graham Banks » Tue Sep 29, 2020 8:44 am

I would only be interested in testing engines with NNUE if they used their own net.

At present, I think that only Igel, Nemorino and Minic do this?
gbanksnz at gmail.com

mvanthoor
Posts: 546
Joined: Wed Jul 03, 2019 2:42 pm
Full name: Marcel Vanthoor

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by mvanthoor » Tue Sep 29, 2020 8:48 am

Hi Andy,

Your post couldn't be more true, and the one thing you are actually asking yourself in it is this:

"Is writing a chess engine a science, an art, or a bit of both?"

With the old alpha-beta engines, it was a bit of both: the computer science was in the search function and the move generator. First, min-max. Then alpha-beta. Then add a transposition table. Then null move pruning, and so on. For the move generator one can go with a simple array implementation, or magic bit boards. If you were around in the 70's to mid-2000's, one could have lived through all the development of this science. Even though it looks like 'everything' has been done, now and again a new technique pops up.
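
The first two steps of that progression, min-max and then alpha-beta, can be sketched on a toy game tree. This is a generic textbook illustration in Python, not code from any engine mentioned here. The tree is a nested list whose leaves are static scores, and the `visited` argument is a hypothetical helper added only to show which leaves get looked at.

```python
def minimax(node, maximizing):
    """Plain min-max: visit every leaf of the tree."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf"), visited=None):
    """Same result as minimax, but stops searching a branch once it cannot matter."""
    if isinstance(node, int):
        if visited is not None:
            visited.append(node)   # record the leaf so the pruning is visible
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta, visited))
            alpha = max(alpha, best)
            if alpha >= beta:      # opponent already has a better option elsewhere
                break
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta, visited))
        beta = min(beta, best)
        if alpha >= beta:          # we already have a better option elsewhere
            break
    return best


tree = [[3, 5], [2, 9], [0, 1]]    # root to move, three replies, two leaves each
seen = []
score_full = minimax(tree, True)
score_pruned = alphabeta(tree, True, visited=seen)
```

Both searches return the same score for the root, but alpha-beta never looks at the leaves 9 and 1: once the second and third replies are shown to be worse than the first, the rest of those branches is skipped.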

I don't mind if engines use each other's search techniques. Basically, board representation and search algorithms have NOTHING to do with chess; they only have an impact on speed and how deep the engine can see into a position.

Given exactly the same search capabilities, the differentiating part between two engines is the evaluation function. This makes up the personality of an engine. It encodes literal chess knowledge: put your rooks on open files. Bash a knight into a hole. Attack the backward pawn.

This is the 'art' part of writing a chess engine.

If you change something in the search function, your engine might get a bit faster (or slower if your idea doesn't work), but if you change anything in the evaluation function, even the tiniest thing, the engine can play VERY differently. Look at a simple engine such as TSCP. It tries to keep its own pawn structure healthy, while creating doubled and tripled pawns for the opponent, it creates weak squares, and puts all its money on creating a defended passed pawn. Many engines that are stronger than TSCP lose games to TSCP because of lack of knowledge.

For good or ill, it has a distinct personality. If you show me 20 games, 5 of which are played by TSCP, I can probably pick them out.

One can *UNDERSTAND* TSCP. I am sure that, after I finish my own engine, I'm able to understand what it plays and why, because I know what evaluation functions it has. Because I'm a somewhat decent chess player (in my teens my Elo-rating was +/- 1850, sometimes with peaks up to 2000), I have enough knowledge to program a decent evaluation function.

The chances are that the engine will try to play like me. I have a penchant for playing tactical chess, and sometimes making moves that intentionally imbalance and complicate games. (I like Tal's games, for example... and dislike Karpov's.) I'm sure that some of that personality will be in my engine. That's the reason I write this engine: to be able to go through the entire process of crafting the board representation and search myself to really understand how the algorithms work, and THEN to be able to craft the evaluation function to my liking, and also, to try and get it to play as strong as possible. That is done by analysing games, seeing the engine's mistakes, then trying to "teach" it something new, and having it play the same opponents again.

For me, that is what makes writing a chess engine fun: the promise that I can start "teaching" it to play better and better chess and seeing it improve against its opponents because of it.

If my understanding is correct, NNUE replaces the evaluation function.

The author of Rubichess has already said something in different words: If you build a NNUE in your engine, the engine becomes just a player for the NNUE file. If it is indeed the case that NNUE replaces the evaluation function, then if you have two engines with the same search capabilities, then putting the same network into both will make them play (almost) the same.

That is not interesting.

In the 80's and 90's, if you were into computer chess, you could actually *recognize* players in a computer chess game, just as you could recognize Kasparov or Lasker in a human OTB game. You could actually see The King playing against Chess Genius, or Fritz against Shredder, and recognize the styles of the different engines.

That is fun.

It is the reason why the author of ICE says on his site: "ICE is still a CLOP-free zone." What does this have to do with neural networks? Everything: Clop, Texel tuning, and certainly neural networks remove the hand-crafted evaluation function and replace it by a huge batch of numbers and weights. The author of Rubichess has already said this: if two engines have equal search capabilities, and you add NNUE, they just become a player for the NNUE file, just like a CD-player plays a CD. The evaluation function, in the past painstakingly developed by the chess engine author, is now reduced to being created by another computer program/training process. Nobody knows what the engine is doing; it plays clinically and mathematically, without any personality.

The entire problem is:

Tuning and neural networks are taking the 'art' part out of computer chess, and if you take that away, the only thing that is left are algorithms and numbers... and that doesn't have anything to do with chess anymore. That is the reason why you feel like: "Why did I do ALL of this work, if I can just plunk in a NNUE and be done with it?" That is logical: the board representation, the search, the bit board move generator... they are all interesting with regard to computer science, but in the end they are all there as a vehicle for the evaluation function to play chess. Replace the evaluation function with an automatically tuned part, and you as a human have no reason to write a chess engine any longer.

As I always say to people: Star Trek has an episode about this. And that is true in this case as well:

Voyager: Virtuoso

I don't know if you are into Star Trek, but Voyager has a holographic doctor, who tries to become more human. (Just like Data, in The Next Generation.) In this episode, the doctor performs as a singer. He teaches himself to sing, by *crafting* a singing subroutine on top of his algorithms (just like we make a chess engine play actual chess by crafting an evaluation on top of search algorithms.) Then Voyager meets a devoutly mathematical species, and they are fascinated by the doctor.

They create their own hologram (search function...) and then replace the doctor's singing routine (evaluation...) with a mathematically optimized matrix (neural network...), taking the 'art' out of what the doctor tried to do. The result is not pretty: mathematically, the new hologram sings 'better' than the doctor, but as the Voyager crew doesn't live their lives through pure mathematics like that other species, nobody understands what the new hologram is doing, and it sounds terrible to them.

With chess engines, we're at the same point. We're taking the last bit of art out of the programming of an engine, and what's left (algorithms and numbers) is interesting only to computer scientists and high-level mathematicians, but not to chess players who are also programmers.

Therefore I'm strongly considering having TWO evaluation functions in my chess program, at some point: one that is written by me, for personality and playing against myself (after the program gets a level setting obviously), and a tuned evaluation for computer opponents. I might never implement a neural network/NNUE, or only as an option. (I seem to remember that DanaSah does something like this already; if I remember correctly, it has an 'opponent' setting with 'human' and 'computer' options.)
Last edited by mvanthoor on Tue Sep 29, 2020 8:58 am, edited 1 time in total.

xr_a_y
Posts: 1337
Joined: Sat Nov 25, 2017 1:28 pm
Location: France

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by xr_a_y » Tue Sep 29, 2020 8:48 am

Let me copy paste here what I wrote on discord a month ago.

Well, most of Minic is already coming from very classic ideas of the chess engine community, but I'm trying to understand, try, and test everything, and hopefully learn things in that long process. This is what my journey in chess programming has been since fall 2016. On rare occasions I propose something somewhat new, but this is the exception. With NNUE, something else happened. I know very little about NNs, I was on vacation and stuck at home due to very hot weather, so I made an experiment. I copy/pasted SF code and in maybe 8 or 10 hours of work it was working with a crazy Elo boost, putting MinicNNUE (a.k.a. Minnuec), with almost no effort from my side, at the level of engines whose authors dedicated years of their lives and made the whole community progress. I am thinking of Jon, Andrew, Larry, Ronald, Milos, and many others of course. From my point of view, this would be unfair to those authors' hard work. The chess engine community has always amazed me by its capacity for sharing ideas, being helpful, teaching, ..., with so many open source engines, forums, discord channels, ... Copying each other, trying others' ideas in other contexts, is a part of our collective strength. This is not a competition to me. I guess in fact every engine author is contributing for the whole group, and our representatives today are the top engines, which aggregate all the knowledge. But it won't be fun enough if we all just contribute directly to SF or Leela.
At some point, it is very possible that many engines switch to NNUE, build their own nets, maybe with their own architecture. This will probably be a good thing, and will again benefit the whole community. I hope this will be done in some sort of common NNUE framework. At that point, I'll reconsider this question and decide if I go deeper into the NNUE thing.

jp
Posts: 1411
Joined: Mon Apr 23, 2018 5:54 am

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by jp » Tue Sep 29, 2020 8:51 am

AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
Of course, time passed and the Leela team managed to reach similar strength, but without the use of GPUs -- a total repudiation of my original stance.
Can you clarify this? What is the strength of Leela-CPU?

On CCRL, I see what I guess is Lc0 without GPU:
Lc0 0.26.2 w703810 64-bit 3151.

Do you mean similar in strength to
Ethereal 12.50 64-bit 3300
(from the same list)?

I'm not sure whether people will think within 150 elo is similar in strength.
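
For a sense of scale, the gap between the two ratings quoted above can be turned into an expected score. The sketch below assumes the standard logistic Elo formula; the rating list may compute its numbers with a different model, so treat the result as a rough guide only. The function name `expected_score` is just for illustration.

```python
def expected_score(elo_diff):
    """Expected score (win = 1, draw = 0.5) of the higher-rated side under the logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


gap = 3300 - 3151                 # the two ratings quoted from the list above
score = expected_score(gap)       # a bit over 0.70 for a gap of 149
```

Under that assumption, a gap of about 150 Elo means the stronger engine is expected to take roughly 70% of the points in a match.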

Florentino
Posts: 38
Joined: Tue Mar 25, 2014 9:34 pm

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by Florentino » Tue Sep 29, 2020 9:03 am

Hello Andrew, hello IQ,
I share Andrew's concerns. The exciting part of chess programming is gone. Analyzing your engine's weakness and using your brain to improve your engine's evaluation is over. This is automated now. But is this so much different than all the tuning we did before?
For me the fun was thinking about tuning methods and developing a tuner (or experimenting with tuners from others). But tuning itself is not interesting at all (and so, although I spent quite some time creating tuning code, Nemorino is more or less untuned).
And with the neural networks it's more or less the same. The exciting part is trying to understand how NNUE works - creating network weight files is boring. But there are still a lot of interesting challenges:
  • Creating an own learner implementation
  • Trying to improve the network architecture
  • Getting a better support for Chess960
  • ...
So no need to stop chess programming :D, however I'm afraid, the ratio between the exciting and the boring parts will get smaller

Raphexon
Posts: 341
Joined: Sun Mar 17, 2019 11:00 am
Full name: Henk Drost

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by Raphexon » Tue Sep 29, 2020 9:10 am

jp wrote:
Tue Sep 29, 2020 8:51 am
AndrewGrant wrote:
Tue Sep 29, 2020 7:33 am
Of course, time passed and the Leela team managed to reach similar strength, but without the use of GPUs -- a total repudiation of my original stance.
Can you clarify this? What is the strength of Leela-CPU?

On CCRL, I see what I guess is Lc0 without GPU:
Lc0 0.26.2 w703810 64-bit 3151.

Do you mean similar in strength to
Ethereal 12.50 64-bit 3300
(from the same list)?

I'm not sure whether people will think within 150 elo is similar in strength.
Another recent test at long TC:
http://www.fastgm.de/60min.html

CPU-Leela also scales almost linearly with cores so that doubling of cores = doubling of time.
So in the TCEC machine she's a lot stronger than single (or quad) core rating lists make you assume.

chrisw
Posts: 3851
Joined: Tue Apr 03, 2012 2:28 pm

Re: A Crossroad in Computer Chess; Or Desperate Flailing for Relevance

Post by chrisw » Tue Sep 29, 2020 9:11 am

More to the point, imo, is why would anyone actually want to 'release' an engine?

If it's to get to the top of a rating list, well, one can discover that for oneself, at home, by testing it oneself.
If it's to make money, well, forget that nowadays.
If it's to get fame, well, stuff that. You'll have it until the next big thing comes along and then not.

But what can happen, and does happen, is that a giant storm can be brewed up on computer chess forums that your engine is copied/cloned/steals ideas/is immoral, all manner of attacks, highly stressful for the engine programmer at the other end of it, and for what? For giving your engine free to people to play with. Seems to me the downside risk massively outweighs the actually non-existent upside. For me, I came back serious bigtime in the COVID crisis, I can't travel, I get quarantined if I return to UK, so I switched from tinkering on and off and plunged into a full-time new engine, full of new ideas, old ideas, original ideas, non-original ideas, and I managed to get my nice Tal-ish style into it, up to a point, still working on that. I was never too sure what I'd do with it, but given that I'm a target for some malicious people since way back, and there's a new round of witch-hunting started up, I am pretty much coming to the view that I keep the thing completely private. 'Community' is all very well, but it also contains some who are all too ready to get nasty. Something to do with computer chess: it attracts some fine people, but always, I find anyway, it attracts some with a mean and nasty streak. If those predominate, and there have been times when they do, it's not a pleasant place at all.
