"The analysis of the first net I downloaded is getting obsolete now, I wonder how long current one will last."
Ovyron,
Does that mean that you usually use whatever is the latest (non-experimental) net from Sergio?
SF-NNUE going forward...
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: SF-NNUE going forward...
The new nets don't seem to be that much better than some of the older ones. I haven't tested the last few yet, but unless Sergio has changed something in his FEN generation for training, I'm not expecting any great advances. We need either bigger nets, split nets, or deeper searches. I'm not sure which would yield the most gains.

Ovyron wrote: ↑Wed Aug 05, 2020 2:04 pm
UPDATE - The leveling of the field has come to nothing now; even people who do daily updates to their books have sunk in elo and are struggling to recover it. Realistically I'm playing some 100 elo higher than before NNUE, but my elo has sunk to 100 less than what it was before, because the bigger your hardware, the better NNUE gets. So the people at the top increased their already high performance, while the below-average performers can't hold up.

Ovyron wrote: ↑Fri Jul 31, 2020 4:35 am
Yeah, I had a record time of analysis obsolescence. Before NNUE I could count on about an 11-month window (analysis from around September 2019 was becoming obsolete). After NNUE, I can say that ALL my analysis with Stockfish dev became obsolete in one shot, because it's unreliable: the analysis can't be mixed, since the backsolved wrong score of a single Stockfish-dev-analyzed line could mess everything up.
I can finally say that all the positions I've analyzed have been nothing but wasted time now - but NNUE has leveled the field, and draw rates have gone down, there has never been a better point to start from scratch!
The skill gap has greatly widened.
While before I could be out-searched by 10 plies and still survive, now being out-searched by 3 plies can be fatal. The score can be 0.30 now and 1.30 three depths later. The analysis of the first net I downloaded is getting obsolete now; I wonder how long the current one will last.
It's as if what NNUE does is steer the game into the most difficult positions it can. If you think about it, that's like Contempt on steroids: once you're in those positions, every node searched counts. This might be the point in history where hardware became more important than the book used.
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
-
- Posts: 219
- Joined: Thu May 29, 2014 5:58 pm
SF-NNUE going forward...
Zen:
Thanks for the explanation. I was wondering about the size of Sergio's NN_SF nets at 20 MB. How does that compare to the size of the Lc0 nets, and why not create much larger NN_SF nets?
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: SF-NNUE going forward...
Lc0 nets are much larger and have a different structure. They run best on GPUs, which limits their usefulness IMHO. NNUE nets are designed to be fast when run on CPUs, which allows them to take advantage of CPUs with many cores. Large Lc0 nets are limited by the number of expensive GPUs you can cram into a machine, and by scaling factors.
My initial testing showed NNUE played the opening less well than SF. Further analysis shows the difference isn't as great as I thought, but it's still significant and measurable. NNUE also seems less well suited to endgame analysis than regular SF. Some of this can be attributed to the speed differences between SF and SF-NNUE, but that isn't the only factor as far as I can tell. FEN generation WITHOUT the use of tablebases may be why endgame play suffers. Deeper searches for FEN generation may help in general; bigger nets MAY help but will further reduce NPS, so it's unclear how much of an advantage, if any, they would provide. Split nets for different phases of the game (opening, middlegame, and endgame) would maintain speed and "should" provide an advantage in each phase.
I suspect we will see some changes on Sergio's end if there are no further gains in playing strength. I'm just not sure what they will be.
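The split-net idea above could be dispatched with something as simple as a material-based phase check. This is only a sketch of the concept, not anything Sergio has published: the net file names, phase thresholds, and material accounting below are all invented for illustration.

```python
# Hypothetical "split nets" dispatch: pick a different NNUE-style net for
# each game phase, keyed off the non-pawn material left on the board
# (conventional pawn units; the starting position has about 62).
# File names and thresholds are made up for illustration.

PHASE_NETS = {
    "opening":    "net_opening.nnue",
    "middlegame": "net_middlegame.nnue",
    "endgame":    "net_endgame.nnue",
}

def game_phase(non_pawn_material: int) -> str:
    """Classify the phase from total non-pawn material (both sides)."""
    if non_pawn_material >= 50:
        return "opening"
    if non_pawn_material >= 20:
        return "middlegame"
    return "endgame"

def net_for_position(non_pawn_material: int) -> str:
    """Return the (hypothetical) net file to evaluate this position with."""
    return PHASE_NETS[game_phase(non_pawn_material)]
```

The attraction of the scheme is that each net stays the same size as today's, so per-node eval cost (and thus NPS) is unchanged; only the weights loaded differ by phase.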
-
- Posts: 12540
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: SF-NNUE going forward...
It seems to me that the size of the net that is optimal is likely a function of the power of the computer.
A one core Pentium would need a tiny net, and a 128 core dual Epyc would be able to use a much larger net.
I don't have any proof, just another thought experiment.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 219
- Joined: Thu May 29, 2014 5:58 pm
Speed difference
"speed differences between SF and SF-NNUE".
That is a factor I have often seen mentioned in comparisons between SF and SF-NNUE. However, it seems like an exaggerated concern if one has even a halfway decent CPU setup.
On my 4-year-old dual Intel Xeon E5-2686 v3 (2 x 18 = 36 physical cores), I routinely find SF-NNUE getting to 50 to 55 plies within 15-20 minutes in a variety of positions, from early middlegame to endgame, in my correspondence games. My system is not particularly fast these days; many players have at least an AMD 3970X system (32 physical cores), which is much faster.
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: SF-NNUE going forward...
It seems to me that if you are running 128 threads and each thread is handling part of the search, then each thread also has to evaluate the leaf nodes it reaches, which means it has to run the evaluation function. You could have 128 threads each evaluating a different leaf node, so each thread must be able to run the evaluation function efficiently. I fail to see how having more cores will allow the use of larger nets WITHOUT degrading each core's NPS.

Dann Corbit wrote: ↑Thu Aug 06, 2020 5:18 am
It seems to me that the size of the net that is optimal is likely a function of the power of the computer.
A one core Pentium would need a tiny net, and a 128 core dual Epyc would be able to use a much larger net.
I don't have any proof, just another thought experiment
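Zenmastur's point can be put into numbers with a back-of-the-envelope model: if each thread pays the eval cost at its own leaf nodes, a bigger net slows every core by the same factor no matter how many cores you add. All the per-node costs below are invented for illustration.

```python
# Toy model: per-core NPS depends on the per-node cost (search work plus
# eval work), so total NPS scales with cores but a slower net penalizes
# a 128-core box by exactly the same factor as a 1-core box.
# All microsecond figures are invented.

def nps(cores: int, search_us: float, eval_us: float) -> float:
    """Nodes/second for `cores` threads, assuming perfect scaling."""
    per_node_us = search_us + eval_us   # cost to search and evaluate one node
    return cores * 1_000_000 / per_node_us

small_net = nps(cores=128, search_us=0.5, eval_us=0.5)  # 1.0 us/node
big_net   = nps(cores=128, search_us=0.5, eval_us=2.0)  # 2.5 us/node

# The relative slowdown from the bigger net is identical on 1 core:
assert big_net / small_net == nps(1, 0.5, 2.0) / nps(1, 0.5, 0.5)
```

So under this model, more cores buy more total NPS but do nothing to hide the per-node cost of a larger net; only something like Dann's eval cache (below in the thread) could do that.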
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: SF-NNUE going forward...
No.
There's a very important factor here: I'm playing against people who are using NNUE nets on faster hardware. Using the same net as them is just suicide, because they're seeing everything I can see, and then more by reaching deeper (it's obvious, as we'll show similar evals for the whole game without disagreement, and I'll get cracked by the out-search).
To beat them I've needed to use a different net than theirs. Despite the nets being around the same elo, they can have radical disagreements in their evaluation of positions (mainly, whether a given out-of-book position is favorable for Black or for White). If my net is better than their net in the opening position reached, they're doomed, and their hardware advantage doesn't matter.
Interestingly, for this both nets can be wrong; I just need mine to be less wrong than theirs. Say, in reality Black has the advantage, but both nets say White has it. As long as I'm Black I'll win, so I've needed to use extreme score fudging for my backsolved scores (say, an opening position where their net says 0.30 and my net says 0.40, but I have it as -0.11 because Black is better), causing really huge swings in scores. (The French is currently on top: e4 is winning the war against d4. The Scandinavian briefly appeared as best, and the starting position has jumped to a 0.18 score... tomorrow things can look extremely different. For comparison, from 2018 to mid-2019, e4, d4, Nf3 and c4 were tied at 0.00, and it took them a year and a half to budge; now they're jumping all over the place.) This has become opening exploration with whatever net they chose.
But the path to success is clear: use the rarest net (one nobody else is using) and make the book reach the positions where this net is right and the others are wrong. Easier said than done, but in Stockfish-dev times I didn't have a hope like this, nor was I able to clearly defeat these guys, and since they're some 100 elo stronger than they were before, beating them means something.
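The score-fudging-plus-backsolving workflow described above can be sketched as a plain minimax over a book tree, where hand-entered overrides beat the engine's leaf evals. The tree, move names, and all the numbers below are invented for illustration, not taken from any real book.

```python
# Backsolve a toy opening book: leaves hold raw engine evals (White's
# point of view), and `fudge` maps a move sequence to a hand-corrected
# score that overrides whatever would otherwise be computed there.

def backsolve(tree, fudge, line=()):
    """Return the backsolved score of `tree` reached via moves `line`."""
    if line in fudge:                    # a fudged score wins outright
        return fudge[line]
    if isinstance(tree, (int, float)):   # leaf: raw engine eval
        return tree
    scores = {m: backsolve(sub, fudge, line + (m,)) for m, sub in tree.items()}
    # White (even ply) picks the max, Black (odd ply) the min.
    best = max if len(line) % 2 == 0 else min
    return best(scores.values())

book = {"e4": {"d5": 0.30, "e5": 0.25},  # invented net evals per reply
        "d4": {"d5": 0.20}}

# Plain backsolve: e4 scores min(0.30, 0.25) = 0.25, beating d4's 0.20.
plain = backsolve(book, {})

# Fudge: we believe 1.e4 d5 actually favours Black (-0.11), so the
# backsolved root score flips to d4's 0.20 and the repertoire changes.
fudged = backsolve(book, {("e4", "d5"): -0.11})
```

This is exactly the mechanism behind the "huge swings" described above: one overridden leaf can change which first move the whole book prefers.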
-
- Posts: 12540
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: SF-NNUE going forward...
Even at that, the new-architecture cores run at higher speed, and with lots more cores you will also have a lot more hash hits. Considering that the eval is expensive, a zero-ply eval hash might be a good idea.

Zenmastur wrote: ↑Thu Aug 06, 2020 5:36 am
It seems to me that if you are running 128 threads and each thread is handling part of the search, then each thread also has to evaluate the leaf nodes it reaches, which means it has to run the evaluation function. You could have 128 threads each evaluating a different leaf node, so each thread must be able to run the evaluation function efficiently. I fail to see how having more cores will allow the use of larger nets WITHOUT degrading each core's NPS.

Dann Corbit wrote: ↑Thu Aug 06, 2020 5:18 am
It seems to me that the size of the net that is optimal is likely a function of the power of the computer.
A one core Pentium would need a tiny net, and a 128 core dual Epyc would be able to use a much larger net.
I don't have any proof, just another thought experiment
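A "zero-ply eval hash" of the kind suggested here would amount to a small always-replace cache keyed on the position hash, probed before running the net. A minimal sketch with an invented toy interface; real engines would do this in fixed-size native tables, not Python objects.

```python
# Sketch of an eval cache: probe by position key before paying for the
# expensive NN evaluation, store the result afterwards. Always-replace
# on index collision, like a simple transposition-table scheme.

class EvalHash:
    def __init__(self, size: int = 1 << 20):
        self.size = size
        self.table = [None] * size          # slots hold (key, score) pairs

    def probe(self, key: int):
        entry = self.table[key % self.size]
        if entry is not None and entry[0] == key:
            return entry[1]                  # cache hit: skip the net
        return None                          # miss or collision

    def store(self, key: int, score: float) -> None:
        self.table[key % self.size] = (key, score)

def cached_eval(key, ehash, slow_eval):
    """Evaluate position `key`, consulting the cache first."""
    score = ehash.probe(key)
    if score is None:
        score = slow_eval(key)               # the expensive NN evaluation
        ehash.store(key, score)
    return score
```

The payoff grows with thread count, as noted above: the more threads revisit the same positions, the more often the net is skipped entirely.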