SF-NNUE going forward...

Zenmastur · Post by **Zenmastur** » Mon Jul 27, 2020 2:34 am

I'm wondering what the next step in the evolution of SF-NNUE will be. It occurred to me that a single evaluation function isn't likely to be equally good at all phases of the game. So, is the next step a larger net to better encompass all phases of the game? Or maybe a bigger net AND using a GPU to help speed things up a bit?

Or would it be better to keep the net about the same size and have multiple nets for different phases of the game? An example would be one net trained on positions with 17 pieces or more and one trained on positions with 16 pieces or fewer. Or, perhaps, 3 nets: one for positions with 24 pieces or more, one for 23-13 pieces, and one for endgame play, all contained in a single wrapper/file.
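A minimal sketch of the three-net split described above, assuming the net is chosen purely by piece count read from a FEN string. The names `PhaseNet`, `NETS`, `count_pieces` and `select_net` are hypothetical and not part of SF-NNUE, and the endgame bound of 12 pieces or fewer is inferred from the 23-13 middle range.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhaseNet:
    name: str        # hypothetical label for a net stored in the wrapper file
    min_pieces: int  # inclusive lower bound on pieces on the board
    max_pieces: int  # inclusive upper bound on pieces on the board


# Three-net split: 24+ pieces, 23-13 pieces, and (assumed) 12 or fewer.
NETS = (
    PhaseNet("opening", 24, 32),
    PhaseNet("middlegame", 13, 23),
    PhaseNet("endgame", 2, 12),
)


def count_pieces(fen: str) -> int:
    """Count all pieces (kings and pawns included) in the board field of a FEN."""
    board = fen.split()[0]
    # Letters are pieces; digits are runs of empty squares; '/' separates ranks.
    return sum(ch.isalpha() for ch in board)


def select_net(fen: str) -> PhaseNet:
    """Return the net whose piece-count range contains this position."""
    n = count_pieces(fen)
    for net in NETS:
        if net.min_pieces <= n <= net.max_pieces:
            return net
    raise ValueError(f"unexpected piece count: {n}")
```

With hard thresholds like these, a single capture can move a position from one net to the other, so the two evals would need to agree reasonably well around the boundary.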

What do you think would be the best way forward? Or do you think the "Status Quo" is good enough?

Regards,

Zenmastur

Ovyron · Post by **Ovyron** » Mon Jul 27, 2020 3:13 am

I see nets being good all-around, opening, middle game, endgames... if they had a weak spot it'd be apparent where, so a specialized net could be built for that weakness, but mainly, I haven't seen a position where traditional Stockfish has a better eval than NNUE (for comparison, positions where Stockfish has better eval than Leela are very common).

I think the future will be on depth. So far the nets have been trained with very low depths, right now a net trained with Depth 50 is laughable due to the big number of positions required, but if we already had such a trained net perhaps it'd prove itself superior to all the low depth nets.

Fewer nets, better trained.

Zenmastur · Post by **Zenmastur** » Mon Jul 27, 2020 6:30 am

Ovyron wrote: ↑Mon Jul 27, 2020 3:13 am I see nets being good all-around, opening, middle game, endgames... if they had a weak spot it'd be apparent where, ...

I'm not so sure that's true. SF had known weaknesses in its evaluation function before Alpha0 and Lc0 came along. But the weaknesses that were exploited by NN programs weren't always the “apparent” ones. Many of the weaknesses were unrecognized before they were pointed out by example games. So, I think it's plausible that there are many weaknesses that aren't apparent to you or anyone else. It seems unlikely to me that a single small net would be just as adept at evaluating a blocked opening position and a 12-man endgame position. Maybe I'm wrong, but the two situations seem too dissimilar to be evaluated equally well with a single evaluation function. There is clearly a limit to the amount of information that can be stored in a NN of a particular size. What exactly the limit is I don't know, but considering the relatively small size of the nets used, it would seem wise to explore the use of larger and/or more specialized nets to determine if they are worth the effort.

Ovyron wrote: ↑Mon Jul 27, 2020 3:13 am ...so a specialized net could be built for that weakness, but mainly, I haven't seen a position where traditional Stockfish has a better eval than NNUE (for comparison, positions where Stockfish has better eval than Leela are very common).

I have noticed a marked difference in the evaluation of endgames when using TB's. I'm not sure exactly what the cause of the difference is, but it seems I always get a much better score if they analyze for the same period of time. I realize there is a speed difference, but this doesn't seem to account for the rather large differences I've seen.

Ovyron wrote: ↑Mon Jul 27, 2020 3:13 am I think the future will be on depth. So far the nets have been trained with very low depths, right now a net trained with Depth 50 is laughable due to the big number of positions required, but if we already had such a trained net perhaps it'd prove itself superior to all the low depth nets.

Fewer nets, better trained.

So you think the net size is good enough as is?

Depth 50? I thought about higher depths, but 50 plies is a pipe dream at best. The depths used thus far do seem way too shallow, but because of the number of positions needed I'm not sure how much deeper you can go and still be able to produce sufficient quantities. IIRC, for SF the low teens or so seems to be a sweet spot for speed vs depth. I'd have to go back and look at my data, as I don't recall exactly why I drew this conclusion. It had something to do with me mass analyzing and/or playing games to depth "x" for use in an opening book. I used a few tricks to speed things up a bit. But greater depth is definitely something to try. It will be interesting to see how much better a net gets just because the positions have been searched deeper. I'll be very surprised if there is a large improvement.
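The depth versus quantity trade-off can be put in rough numbers. This is only a sketch under a loud assumption: the nodes needed for one search grow like `ebf ** depth`, where `ebf` is a hypothetical effective branching factor. The default of 1.8 is an illustrative placeholder, not a measured Stockfish figure, and both function names are made up for this example.

```python
def relative_cost(depth_from: int, depth_to: int, ebf: float = 1.8) -> float:
    """How many times more nodes one search needs when going from one depth to another.

    Assumes nodes per search scale as ebf ** depth (a simplification).
    """
    return ebf ** (depth_to - depth_from)


def positions_per_budget(node_budget: float, depth: int, ebf: float = 1.8) -> float:
    """Rough count of positions that can be searched to `depth` within a fixed node budget."""
    nodes_per_position = ebf ** depth
    return node_budget / nodes_per_position


if __name__ == "__main__":
    # Compare how much the training set would shrink for the same compute.
    for target in (12, 16, 20):
        print(target, relative_cost(8, target))
```

Under this model every extra ply divides the number of labelled positions by `ebf`, which is why very deep training data gets expensive so quickly.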

Regards,

Zenmastur

Ovyron · Post by **Ovyron** » Mon Jul 27, 2020 7:35 am

Zenmastur wrote: ↑Mon Jul 27, 2020 6:30 am I have noticed a marked difference in the evaluation of endgames when using TB's. I'm not sure exactly what the cause of the difference is, but it seems I always get a much better score if they analyze for the same period of time. I realize there is a speed difference, but this doesn't seem to account for the rather large differences I've seen.

What I mean is difference at the "game result" level. It doesn't matter if 3.00 is inaccurate and 1.00 is accurate and you spend time training the net so it shows 1.00, if the result of the game is the same.

What you need to show is a position where NNUE loses, or misses a win (or something like that) because of the eval difference, not a big eval difference that still produces the same game result.

Zenmastur wrote: ↑Mon Jul 27, 2020 6:30 am So you think the net size is good enough as is?

Currently there are 20MB nets and 30MB nets, and that's a 50% increase in size for no improvement that I've seen.

Zenmastur wrote: ↑Mon Jul 27, 2020 6:30 am IIRC SF around low teens or so seems to be a sweet spot for speed vs depth.

You want speed to analyze more positions. Double the speed, double the positions. The problem I'm seeing is that after a certain number of positions it plateaus and there are diminishing returns, so more positions don't help. You could just as well analyze the same ones with more depth.

We're currently in a crisis in which 60% performance against Stockfish 11 hasn't been achieved by any net. If more positions don't help, more depth doesn't help, and a bigger net doesn't help, then, yeah, let's start trying other things like multiple nets for game stage.
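For scale, a match score can be converted to a rating difference with the standard logistic Elo formula. The sketch below is plain arithmetic, not anything taken from the Stockfish testing framework, and the function name `elo_diff` is made up for this example.

```python
import math


def elo_diff(score: float) -> float:
    """Elo difference implied by an expected score (wins plus half of draws, as a fraction)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    # Inverse of the logistic expectation: score = 1 / (1 + 10 ** (-diff / 400)).
    return -400.0 * math.log10(1.0 / score - 1.0)


if __name__ == "__main__":
    print(round(elo_diff(0.60), 1))
```

By this formula a 60% score corresponds to a lead of roughly 70 Elo.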

cdani · Post by **cdani** » Mon Jul 27, 2020 8:01 am

My first idea was specialized nets, for example a net for rook endgames. But the net in use before the rook endgame arises would not know about those endgames, and that would weaken it. So once you start using a net, I think it's best to use it till the end.
So maybe various nets specialized by opening: one for 1.d4, another for 1.e4 or the Sicilian, ...

peter · Post by **peter** » Mon Jul 27, 2020 8:10 am

Ovyron wrote: ↑Mon Jul 27, 2020 7:35 am We're currently in a crisis in which 60% performance against Stockfish 11 hasn't been achieved by any net. If more positions don't help, more depth doesn't help, and a bigger net doesn't help, then, yeah, let's start trying other things like multiple nets for game stage.

I guess for analysis and game play we don't need more depth; what we'd need is more reliable depth.
We can let any engine play out any position on its own and take the result for granted, or we can go forward-backward through the position to a depth at which we are sure about the outcome. What we can't do is keep a reliable eval and output line in hash all the way back to the root position for positions of really unclear outcome, e.g. opening positions.

So why not go on with what we already have, which is probably the greatest benefit of A-B search: hash learning?

Experience files have been common in A-B engines for decades. Shredder, Hiarcs, Houdini up to version 4, and SF PA (persisted analysis by Jeremy Bernstein) were all great milestones to me in the development of hash-based position learning.

As long as creating NNUE nets still takes hardware time, even if they are much easier, faster and more guided to build than LC0-like NNs, why not combine NNs with position-learning files based on hash?
Experience files like the ones Zerbinati, Manzo and Omar created based on Kiniama code work well from my point of view. In the short time I've seen them working together with NNUE, I'd give them a much better chance of developing into better analysis tools, in terms of reliable output lines and evals in forward-backward analysis and in game playing, than what I saw some weeks ago.
Just my two cents.

P.S. I liked Houdini's learn file and SF PA's learn file even more than the latest SF experience file, in terms of the number of hash entries stored per hardware time and position. For both, one could adapt the growth rate during pondering by a threshold boundary or by depth of storage. Fine instruments as these were even then, the limits lay in the search and eval of the engines. That's the same now, but A-B search together with NNUE nets has reached a much better point, so position learning based on hash entries works much better now too.
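A minimal sketch of a hash-keyed experience file with a depth-of-storage threshold, as discussed above. This does not reproduce the format of Houdini's, SF PA's or any other engine's learn file: the class names, the JSON layout, and the string position key (for example a hex Zobrist hash) are all assumptions made for illustration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class Entry:
    depth: int      # search depth the score came from
    score_cp: int   # evaluation in centipawns, from the side to move
    best_move: str  # move in UCI notation, e.g. "e2e4"


class ExperienceStore:
    """Keeps the deepest known result per position key (hypothetical design)."""

    def __init__(self, min_depth: int = 20) -> None:
        self.min_depth = min_depth  # depth-of-storage threshold
        self.entries: Dict[str, Entry] = {}

    def record(self, key: str, entry: Entry) -> bool:
        """Store the entry if it is deep enough and deeper than what is already known."""
        if entry.depth < self.min_depth:
            return False
        old = self.entries.get(key)
        if old is not None and old.depth >= entry.depth:
            return False
        self.entries[key] = entry
        return True

    def probe(self, key: str) -> Optional[Entry]:
        """Return the stored entry for a position key, or None if unknown."""
        return self.entries.get(key)

    def dumps(self) -> str:
        """Serialize all entries so they can persist between sessions."""
        return json.dumps({k: asdict(v) for k, v in self.entries.items()})

    @classmethod
    def loads(cls, text: str, min_depth: int = 20) -> "ExperienceStore":
        """Rebuild a store from the text produced by dumps()."""
        store = cls(min_depth)
        store.entries = {k: Entry(**v) for k, v in json.loads(text).items()}
        return store
```

Raising `min_depth` slows the growth of the file, which is one way to read the "depth of storage" knob mentioned above. A real engine would more likely use a compact binary format.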

Rowen · Post by **Rowen** » Mon Jul 27, 2020 9:09 am

Hi
Perhaps my presumptions are incorrect, but could specialised nets be created that train an engine to play like a human, or like humans with a particular strength, personality or characteristic, or to play like Tal, etc.?
Thanks

cdani · Post by **cdani** » Mon Jul 27, 2020 9:58 am

I also think that tuning search for the new NNUE eval will net nice returns. This will complicate the code of Stockfish, though.

peter · Post by **peter** » Mon Jul 27, 2020 10:05 am

cdani wrote: ↑Mon Jul 27, 2020 9:58 am I also think that tuning search for the new NNUE eval will net nice returns. This will complicate the code of Stockfish, though.

Tuning search is always fine. Tuning position-learning code based on hash learning wouldn't make the code much more complicated, I think, even if I'm not good enough at programming to prove it.
Andscacs had a nice hash storage before SugaR did. No interest in position learning based on selected hash entries, Daniel?

Ever looked at Jeremy Bernstein's SF PA?

I might still have the source somewhere, at least the one Zerbinati made out of it for a revival a few years ago.

cdani · Post by **cdani** » Mon Jul 27, 2020 11:55 am

peter wrote: ↑Mon Jul 27, 2020 10:05 am Tuning search is always fine. Tuning position-learning code based on hash learning wouldn't make the code much more complicated, I think, even if I'm not good enough at programming to prove it.
Andscacs had a nice hash storage before SugaR did. No interest in position learning based on selected hash entries, Daniel?

Ever looked at Jeremy Bernstein's SF PA?

I might still have the source somewhere, at least the one Zerbinati made out of it for a revival a few years ago.

There sure will be interesting things to do. But for the moment, if I take up Andscacs again, I think it will be to tune its static eval against the NNUE eval.
Those other things are more or less being done for Stockfish, I think. I have not reviewed them.