The thing that I'm finding interesting at the moment is Nemorino 6.00. With the addition of NNUE, Nemorino 6 has gained a massive amount of strength over versions 5.00 and 5.40. On Stefan Pohl's site, it basically achieved parity with Houdini 6.03, with a score of +219=563-218 over a 1,000-game sample. If we then use Houdini 6.03 as a comparison, this means that Nemorino 6.00, as strong as it has become in short order, is still a long way behind SF versions 9, 10 and 11, let alone 12 and 13dev.
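To put a number on "parity", here is a rough sketch that converts the +219=563-218 result into a score fraction and an Elo difference. It assumes the usual logistic Elo model and a normal approximation for the error margin; the function names are my own and are not taken from any rating-list tool.

```python
import math

def score_fraction(wins, draws, losses):
    """Fraction of points scored, counting a draw as half a point."""
    games = wins + draws + losses
    return (wins + 0.5 * draws) / games

def elo_diff(score):
    """Elo difference implied by a score fraction (logistic model, assumed)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_interval_95(wins, draws, losses):
    """Approximate 95% interval for the Elo difference (normal approximation)."""
    games = wins + draws + losses
    s = score_fraction(wins, draws, losses)
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    variance = (wins * (1.0 - s) ** 2
                + draws * (0.5 - s) ** 2
                + losses * (0.0 - s) ** 2) / games
    std_error = math.sqrt(variance / games)
    return elo_diff(s - 1.96 * std_error), elo_diff(s + 1.96 * std_error)

# Nemorino 6.00 vs Houdini 6.03, figures quoted above.
s = score_fraction(219, 563, 218)
diff = elo_diff(s)
low, high = elo_interval_95(219, 563, 218)
print(f"score {s:.4f}, Elo diff {diff:+.2f}, 95% interval {low:+.1f} to {high:+.1f}")
```

Under those assumptions the difference comes out at well under one Elo point, with a margin on the order of 14 Elo either way, so "parity" is a fair description of a 1,000-game result like this.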
Why didn't Nemorino become as strong as SF12? Why did its strength gain stop at the "Houdini level"? - I get that it could be SF12 level if it simply "plugged in" the relevant net.
Using its own nets, however, will it reach the SF10 or SF11 level in the next 3 months? Next 6 months? Ever?
Amidst the hand-wringing that's going on about the uniqueness of evaluation and the death of computer chess as we know it, it would seem that unless people are outright using the exact same net as SF12 or SFdev, all that will happen is that the strength of engines increases across the board while the rankings stay much the same. They're still going to end up being similar to how they were before NNUE came along.
======
Andrew - There is nothing stopping you from continuing to work on Ethereal using handcrafted evaluation other than the fact that if you do, you won't be ranked 4th in the world any more (behind SF, Lc0 and Komodo). That seems to be the main issue for you.
If everyone else jumps on the NNUE bandwagon, Ethereal might end up being ranked 20th or 30th a year from now, but does that actually matter?
If your only motivation was that you might one day overtake Komodo, that's pretty weird, because you can still use that waypoint, that benchmark. You can always play Komodo 14 against Ethereal 13.00 or whatever. Sure, in the outside world, on the ratings lists like Stefan Pohl and FastGM, you won't be near the top, but there will always be the opportunity for you to privately assess the progress of handcrafted Ethereal against SF7, SF8, SF9, SF10, SF11, and any version of Komodo you may own.
Think about it. You never made money from Ethereal. You have no idea how many people actively use Ethereal, exactly what they use it for, or how frequently and intensively they use it. Sure, some people write to you occasionally on forums, email, whatever, and say "Hey, thanks for making Ethereal, it's really great", but what does that matter? You may be able to look and see how many times Ethereal has been downloaded or compiled, but you have no idea if any of those people use it regularly.
The point is, you're caught up with external validation and external metrics. By that I mean external to you.
- How does Ethereal measure up against other engines on all the rating list websites?
- How does Ethereal perform at TCEC compared to other engines?
- How much praise and recognition do I get from other people because I am the author of Ethereal?
I get that these kinds of metrics might have been the main motivation behind your work (is it really a hobby when you need external validation?) but just because it's going to become harder for you to cling on to those markers of "success", does that mean you should give up?
The monetary investment you made in OpenBench, and the time investment you've made in Ethereal, they are sunk costs, so you shouldn't worry about either of them or feel that they were a waste.
I think the only thing that matters is how well Ethereal progresses against previous versions of Ethereal. Screw what the rest of the world is doing. Keep Ethereal open source, and if no one else on the planet wants to download or compile it, then who cares?
I remember when Daniel Jose Queralto (I hope I've spelled that correctly) said he was ceasing development of Andscacs. He said it was because there was too little progress for a large time investment. I didn't think for a minute he meant progress against Stockfish or Ethereal. He meant progress of Andscacs dev against the official release version 0.95. That's all. His measure seemed, at least to me, internal, i.e. Andscacs dev vs Andscacs 0.95, not a case of Andscacs never winning TCEC or topping a ratings list.