Andscacs 0.1 with NN file from Coiled ...

JohnWoe · Post by **JohnWoe** » Thu Mar 10, 2022 10:59 pm

connor_mcmonigle wrote: ↑Wed Mar 09, 2022 5:13 pm
JohnWoe wrote: ↑Wed Mar 09, 2022 10:52 am When it comes to Mayhem. Only Polyglot stuff is copy-pasted from Stockfish. Because.
Drum roll start ...
No point reinventing the wheel!
Drum roll end ...

When you implement a beautiful unique NNUE. Then support for all the boring stuff: threads, ponder, syzygy EGTB... Then I have reinvented Stockfish. That's boring.
You link against the probing code from CFish and use weights from Stockfish effectively copy pasting the entire evaluation function as has been explained previously. If you can't see any value in experimenting with new techniques, then perhaps best is to just not write a chess engine. Stockfish will almost certainly always be stronger. Why are you reinventing the wheel?

NIH is only valid for highly specified problems with clearly optimal solutions. Several authors use external move generation libraries, EGTB probing code, polyglot probing code, etc. These are all problems to which NIH is applicable as a principle. The question of how to best map a chess position to a scalar is clearly not such a problem.

I couldn't care less.
I don't link against anything. No compilation units.

Btw, I invented the pawn promotion extension and added it to my engine. It was worth something. Pushes to the 6th/7th weren't worth much. The next day the pawn promotion extension was added to SF as well.
That's how it goes.

I invented LazySort() and it improved many programs. And that's good.

Calling syzygy perfect, with no room for improvement? It's the best as of now. But perfect? You have to copy-paste 2000 lines of some C code. Full of memcpy(). (Hopefully matched with free()). With no RAII as in C++. And fish probing code out of Glaurung. That's the perfect, set-in-stone solution to EGTB???

Could you easily improve polyglot/syzygy? Yes. Is that worth your limited time on Earth? No!

Take my Eucalyptus KPK C++ header. That's like 5 lines of simple code. Drop into any project. Perfect KPK play forever.

connor_mcmonigle · Post by **connor_mcmonigle** » Fri Mar 11, 2022 2:17 am

Ahh, I see you include the CFish NNUE code header file rather than linking against it, though I don't see why that makes any difference with respect to our discussion.

In any case, you seem to have misinterpreted my point. Syzygy is perfect insofar as probing it gives you the objective game theoretic WDL value of endgame positions. Any correct implementation will do the same so there's not much to be gained in reimplementing it (not to mention it's worth less than 10 Elo). Likewise, move generation is also a well defined problem and any correct implementation will work in a top engine so there's again not much to be achieved in reimplementing it (though doing things for one's self can have some intrinsic value).

In each of the above cases, there's a well defined computable predicate over candidate solutions which gives whether an implementation is correct or incorrect. That's not the case for the evaluation function, search or chess engines in general. These aspects of chess engines remain subjective (for the time being) and permit creativity and differentiation.
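To make the "computable predicate" idea concrete, here is a minimal Python sketch. Everything in it is hypothetical: the position keys, the toy table and the function names are stand-ins and not taken from any real probing library. It only shows that the correctness of a WDL probe can be checked mechanically against a reference.

```python
from typing import Callable, Iterable

# WDL result from the side to move's point of view: -1 loss, 0 draw, +1 win.
Probe = Callable[[str], int]


def is_correct(candidate: Probe, reference: Probe, positions: Iterable[str]) -> bool:
    """Return True only if the candidate agrees with the reference on every position."""
    return all(candidate(p) == reference(p) for p in positions)


# Toy stand-in for a tablebase: hypothetical position keys mapped to WDL values.
REFERENCE_TABLE = {"pos_a": 1, "pos_b": 0, "pos_c": -1}


def reference_probe(position: str) -> int:
    return REFERENCE_TABLE[position]


def buggy_probe(position: str) -> int:
    # Deliberately wrong: reports every lost position as a draw.
    return max(REFERENCE_TABLE[position], 0)


if __name__ == "__main__":
    print(is_correct(reference_probe, reference_probe, REFERENCE_TABLE))
    print(is_correct(buggy_probe, reference_probe, REFERENCE_TABLE))
```

There is no analogous check for an evaluation function, since there is no reference answer to compare a candidate against.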

Obviously there's no harm done in copying Stockfish's evaluation function so long as you're honest about it, but it definitely reduces the extent to which you can claim your engine is interesting/unique.

Ovyron · Post by **Ovyron** » Fri May 13, 2022 3:33 pm

connor_mcmonigle wrote: ↑Fri Mar 11, 2022 2:17 am Any correct implementation will do the same so there's not much to be gained in reimplementing it

The same is true of NNUE: nothing is gained from reinventing the wheel, nothing is gained from creating your own implementation and your own net. There's the path of doing those things, or half of those things, and the path of copying SF and its net, or copying half of them, or avoiding NNUE altogether and staying with classical evals.

And there are the testers who choose to test only original evals, others who test only original implementations, others who test only original nets, others who test only strong engines, and others who test everything.

That's all fine and dandy. What is not is the Owners Of Decency dictating that there are correct paths and wrong paths, and that the authors who choose the wrong paths don't deserve to have their engines tested.

I think the reason people do all this has been forgotten. I've known people who avoided all this by keeping their engines private and sharing them only with their friends, and I was among the very few lucky people who had some of them. But in the end, most people don't have these valuable things just because others would have criticized them if they had been released publicly.

Apparently the best thing to do is whatever pleases you, without caring about critics' opinions, and to release everything you want; there will be people who enjoy it and don't care about the path taken to get there. Now that the gates [of having engines use Stockfish's evals] are open, I wish to see some engines' searches with other engines' evaluations, as this will create new chess entities, and who knows if a great one is hiding in plain sight. We could enjoy it if people stopped caring so much about their pocket universes.

connor_mcmonigle · Post by **connor_mcmonigle** » Sat May 14, 2022 8:53 pm

There are definitely things to be gained in experimenting with new network architectures and implementations. This is how progress is made and differentiation is achieved. Your comment is rather nonsensical.

Most testers have stated they're uninterested in testing engines which copy Stockfish's evaluation function which shouldn't exactly come as a surprise. Who wants to test a bunch of engines which evaluate positions identically? Testers/users are free to do as they please and, likewise, authors are not entitled to have their engines tested.

Frank Quisinsky · Post by **Frank Quisinsky** » Sat May 14, 2022 11:40 pm

Hi Connor,

so the "most testers" are wrong here if ...

1. The playing style of engines is in the foreground! It is a different situation if the playing strength of engines is in the foreground. I can't see that the playing style of an engine is changed by the eval from Stockfish. In my view, Elo has not ranked first for the last 10-15 years. The style of an engine ranks first, its features second, and its strength third. Most programs today are stronger than chess grandmasters. So why should the playing strength of an engine rank first for a human? I can't see any reason for it.

2. For the rating list itself it makes no difference. In around 2 months I have enough games and programs/versions of my own. You can delete the NNSf (network by Stockfish) or NNRe (network by Rebel) engines and you will not see any big changes in Elo for the non-NNSf engines in the list. That is very easy to do; my complete database is online.

3. NNSf is important for some other reasons!
Programmers can test a bit before they create their own network. How strong the engine can be is the question for programmers and testers (people are interested in it). That is motivation for programmers! A good example is the programmer of Caissa, who gained 200 Elo from version 0.4 to 0.5 with the same SF network and is working on a network of their own. As far as I understand, the programmer of Rodent is also working on a network of their own. We can see the playing strength with the network from Stockfish and the network from Rebel.

4. If a tester ignores NNSf ... the work of 10 or more programmers is lost, with the final result that "group formation" splits the scene. Group formation in "GitHub" times is, for several reasons, counterproductive.

I can give many more reasons ... but not this evening.

The other side of the coin ...

1. A lot of programmers are working very hard on their own networks. Each Elo point comes with a lot of testing, and this is very hard work. Having the NNSf / NNRe engines in a rating list is a slap in the face for this group of programmers. Here this one point alone is really enough!

The situation is difficult!

I think I have a good solution if visitors to my work can see directly in my list ...

1. No rank of their own for NNSf / NNRe engines ...
2. NNSf and NNRe engines can be easily identified by their names in the list!

After all ...
I can understand if others have no interest in testing "NNSf" or "NNRe" engines, but I am 55% - 75% sure that is wrong.

Best
Frank

Ovyron · Post by **Ovyron** » Mon May 16, 2022 1:18 am

connor_mcmonigle wrote: ↑Sat May 14, 2022 8:53 pm There are definitely things to be gained in experimenting with new network architectures and implementations. This is how progress is made and differentiation is achieved. Your comment is rather nonsensical.

No, forget everything you know and try to see this with fresh eyes: What is nonsensical is trying to improve your engine and have progress with it when Stockfish exists and it already has better eval and search than your engine, and no matter what you do, you won't be able to catch it!

Chess programmers devote their lives to designing a wheel when a perfectly great wheel is available, and they're even invited to join the Stockfish wheel and try to improve that one.

Since improving an engine is nonsensical, what's left? Well, duh, THE JOY OF PROGRAMMING!

Just, programming a chess engine because you enjoy it. There's no final goal but the enjoyment of programming itself.

And in that case, if someone doesn't enjoy programming the evaluation part, why can't they just copy and paste that part of the wheel and enjoy working on the rest without being treated like criminals?

connor_mcmonigle wrote: ↑Sat May 14, 2022 8:53 pm Who wants to test a bunch of engines which evaluate positions identically?

Huh, because they don't? Yesterday a new SF net was released which became the default, and it evaluated positions completely different from the previous one. In some positions the old one said -0.20 but the new one said 0.20, and if you pit the Stockfishes against each other you can get some funny games with 2 Knights v rook and pawn where both nets think their side has the advantage...

If this is your argument then just let the engines have a different NN and then they'll all evaluate differently, perhaps what is best for SF isn't best for your search, look for the best SF NN for the engine, and the argument is killed as all of them evaluate with a different net.

connor_mcmonigle wrote: ↑Sat May 14, 2022 8:53 pm Testers/users are free to do as they please and, likewise, authors are not entitled to have their engines tested.

As I said, that's fine and dandy, what is not is claiming those that remain in classical eval, code their own nn probing code, or create their own net are "doing it right" while the copy pasters are "doing it wrong." Because testers forget why all of this is done, not about the progress, but about the enjoyment (which maybe they don't understand because they haven't programmed an engine and just test them, like people that never had children giving advice to parents.)

connor_mcmonigle · Post by **connor_mcmonigle** » Mon May 16, 2022 2:08 am

Anecdotally, what motivates most engine programmers is the potential of stumbling across some novel idea. Chess engines aren't wheels insofar as they are extremely multifaceted. An author's engine being far weaker than SF does not preclude the possibility of said author discovering a novel improvement/enhancement of an existing idea.

Just because you see -0.20 in one position whereas some previous network said +0.20 does not mean the network is evaluating positions "completely different". If you actually do the statistics, you'll see a very high correlation in evaluation between any two SF networks relative to the correlation you'd obtain were you to compare against some other engine which uses an evaluation function other than one copied from SF (Koivisto, Berserk, Ethereal, Seer, etc.). There isn't much room for argument here.
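For anyone who wants to run this kind of comparison themselves, here is a minimal Python sketch of the statistic involved. It is not the script used for the experiment mentioned in this thread: the function names are hypothetical, `eval_a` and `eval_b` stand for whatever callables return a static evaluation for a position, and the numbers in the demo are made up purely to show that one sign flip does not prevent a high overall correlation.

```python
import math
from typing import Callable, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / math.sqrt(var_x * var_y)


def eval_correlation(eval_a: Callable[[str], float],
                     eval_b: Callable[[str], float],
                     positions: Sequence[str]) -> float:
    """Correlate two static evaluation functions over the same positions."""
    return pearson([eval_a(p) for p in positions],
                   [eval_b(p) for p in positions])


if __name__ == "__main__":
    # Made-up evaluations (in pawns) of five positions by two hypothetical nets.
    # The first position flips sign (-0.20 vs +0.20); the rest roughly agree.
    net_old = [-0.20, 1.50, -3.00, 0.80, 2.20]
    net_new = [0.20, 1.40, -2.90, 0.90, 2.00]
    print(f"correlation: {pearson(net_old, net_new):.3f}")
```

In practice the sample would need to be a large and varied set of positions, and the same computation would be repeated against an engine with an unrelated evaluation function to get the baseline for comparison.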

I don't see anyone being treated as criminals. That's pretty hyperbolic. Less original engines are inherently less technically interesting and consequently tested with lower frequency. Nowhere did anyone claim that authors of engines copying Stockfish's evaluation function are "doing it wrong", though it's also not entirely clear what you mean by that. There is the question of how best to include such engines in rating lists. I'm of the opinion that Frank's solution, in which such engines aren't ranked and are clearly delineated, is reasonable, though I'm also not opposed to testers who refuse to test such engines entirely.

Ovyron · Post by **Ovyron** » Mon May 16, 2022 6:45 am

connor_mcmonigle wrote: ↑Mon May 16, 2022 2:08 am Anecdotally, what motivates most engine programmers is the potential of stumbling across some novel idea.

So let's talk about motivation. If anybody needs motivation it means they weren't enjoying doing something in the first place. Like a kid that doesn't want to do their homework, but if they do they get some candy. That's their motivation. If they got the candy anyway, they wouldn't do their homework.

These engine programmers are actually addicted and follow their dopamine rushes trying to chase that novel idea, and when they retire they realize they wasted their time. It'd be the greatest shame of the hobby if it was "most of them", because most of them didn't come up with any novel idea, and the ones with the most important novel ideas of recent times appear in Stockfish's credits, full of people that didn't even need to program their own engine.

connor_mcmonigle wrote: ↑Mon May 16, 2022 2:08 am If you actually do the statistics, you'll see a very high correlation in evaluation between any two SF networks relative to the correlation you'd obtained were you to compare to some other engine which uses an evaluation function other than one copied from SF

You're the one needing to provide proof for that claim, people are just assuming the nets produce similar outputs, they don't. Go and compare Stockfish 13052022 + with nn-d0b74ce1e5eb.nnue net against Stockfish 15052022 + nn-3c0aa92af1da.nnue net. It's called NNUE 2 for a reason. Radically different moves in 2 days.

connor_mcmonigle wrote: ↑Mon May 16, 2022 2:08 am I'm of the opinion that Frank's solution, in which such engines aren't ranked and are clearly delineated, is reasonable, though I'm also not opposed to testers who refuse to test such engines entirely.

But why this stance instead of the one where you say "though I'm also not opposed to testers who refuse to test engines with original probing code for networks or original networks"? You're clearly biased in favor of these and against the others, the Wrong/Right side I'm talking about. You can't stay neutral, you discriminate the ones that copy the code or the network, and that's beyond discussion.

Frank Quisinsky · Post by **Frank Quisinsky** » Mon May 16, 2022 7:33 am

Ovyron,

many things would be simpler in the world if everything were simple. All the self-appointed policemen would have nothing to discuss. We don't want that either.

The first and, in my opinion, the most important question would be:
Really good results, the achievements of the past, can be found today in countless GPL3 projects. This breaks the causal chain. Nobody likes to read that, because people who use GPL3 think that everything is all right.

Often it is better to accept a situation when complicated things go their own way.
We know that many thousands / millions of "Ovyrons" are on the way, but again ...

Accepting is much more important than searching for the right way.
The "right way" very often does not exist ... many people have to learn that in life.

Best
Frank

connor_mcmonigle · Post by **connor_mcmonigle** » Mon May 16, 2022 7:39 am

Ovyron wrote: ↑Mon May 16, 2022 6:45 am ...
So let's talk about motivation. If anybody needs motivation it means they weren't enjoying doing something in the first place. Like a kid that doesn't want to do their homework, but if they do they get some candy. That's their motivation. If they got the candy anyway, they wouldn't do their homework.

These engine programmers are actually addicted and follow their dopamine rushes trying to chase that novel idea, and when they retire they realize they wasted their time. It'd be the greatest shame of the hobby if it was "most of them", because most of them didn't come up with any novel idea, and the ones with the most important novel ideas of recent times appear in Stockfish's credits, full of people that didn't even need to program their own engine.
...

I fail to see your point. Motivation is the reason behind action - independent of enjoyment. In my case, I derive enjoyment from the process of searching for new ideas/improvements. This enjoyment of the process itself serves as auxiliary motivation for engine development, though is largely secondary. Effectively all top engines have at least one or two "novelties" associated with them. Many of these novelties, which have passed SPRT for other top engines, fail in Stockfish, though this is more a testament to how different top engines' search functions behave than indicative of some inherent failing of the ideas themselves. Nevertheless, I can point to numerous heuristics/tweaks in Stockfish which originate from other engines.

Ovyron wrote: ↑Mon May 16, 2022 6:45 am You're the one needing to provide proof for that claim, people are just assuming the nets produce similar outputs, they don't. Go and compare Stockfish 13052022 + with nn-d0b74ce1e5eb.nnue net against Stockfish 15052022 + nn-3c0aa92af1da.nnue net. It's called NNUE 2 for a reason. Radically different moves in 2 days.

I've performed the experiment described (comparing static evaluation correlations across a large sample of positions) for previous SF networks and can do so again if you'd like. Somehow, I doubt you'd find it convincing as you seem to already be pretty certain of the soundness of your own opinions...

Ovyron wrote: ↑Mon May 16, 2022 6:45 am But why this stance instead of the one where you say "though I'm also not opposed to testers who refuse to test engines with original probing code for networks or original networks"? You're clearly biased in favor of these and against the others, the Wrong/Right side I'm talking about. You can't stay neutral, you discriminate the ones that copy the code or the network, and that's beyond discussion.

Pairing Stockfish's evaluation function with a similar, yet inferior, search function is not remotely technically interesting in my opinion nor many others'. Consequently, I'd prefer to not see such engines consume too much of testers' resources. That's my bias. So long as more original/technically interesting efforts aren't passed up in favor of lower effort (partial) clones, I'm not particularly inclined to care. Regardless, testers are ultimately free to do as they please so this discussion is quite pointless.
