Candid question about nets (NNUE and others)

Pi4Chess · Post by **Pi4Chess** » Wed Dec 02, 2020 4:35 pm

I was thinking about an idea of dedicated nets.

Would a net trained only on games from a certain opening be better than a generally trained net like the ones engines use now? (For example, it could be broad, like an open vs. closed opening net, or narrower: a Ruy López, Sicilian, or Réti net, etc.)

If the answer is no, then never mind.

If the answer is yes, then would it be nice for an engine to be able to pick, once its book ends, the best net it has among several nets specialized for the patterns of a certain opening?
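To make the idea concrete, here is a minimal sketch of how an engine might pick a net once its book ends. Everything in it is an assumption for illustration: the net file names, the `OPENING_NETS` table, and the `pick_net` helper are hypothetical and do not come from any real engine. It matches on the literal move order only, so transpositions are ignored.

```python
from typing import Dict, List, Tuple

# Hypothetical table: opening move prefix (SAN) -> specialized net file.
# The file names are made up for this sketch.
OPENING_NETS: Dict[Tuple[str, ...], str] = {
    ("e4", "e5", "Nf3", "Nc6", "Bb5"): "ruy_lopez.nnue",
    ("e4", "c5"): "sicilian.nnue",
    ("Nf3",): "reti.nnue",
}
GENERAL_NET = "general.nnue"  # fallback when no specialized net applies


def pick_net(moves_played: List[str]) -> str:
    """Return the net whose opening prefix matches the most moves played,
    falling back to the general net when nothing matches."""
    best_len, best_net = 0, GENERAL_NET
    for prefix, net in OPENING_NETS.items():
        n = len(prefix)
        # Prefer the longest matching prefix (the most specific opening).
        if n > best_len and tuple(moves_played[:n]) == prefix:
            best_len, best_net = n, net
    return best_net
```

For example, `pick_net(["e4", "c5", "Nf3"])` would select the Sicilian net, while a game starting `1.d4 d5` would fall back to the general net.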

Cornfed · Post by **Cornfed** » Wed Dec 02, 2020 6:11 pm

Pi4Chess wrote: ↑Wed Dec 02, 2020 4:35 pm I was thinking about an idea of dedicated nets.

Would a net trained only on games from a certain opening be better than a generally trained net like the ones engines use now? (For example, it could be broad, like an open vs. closed opening net, or narrower: a Ruy López, Sicilian, or Réti net, etc.)

If the answer is no, then never mind.

If the answer is yes, then would it be nice for an engine to be able to pick, once its book ends, the best net it has among several nets specialized for the patterns of a certain opening?

Is there some downside to one net that incorporates differing positions/structures?

mmt · Post by **mmt** » Wed Dec 02, 2020 6:18 pm

Yes, they would be better, assuming you spent the same amount of time training each of them as the general net. Irreversible openings make the tree smaller after all and each opening should still have more than enough data to generalize. But would they be better than the general net if you trained the general net for as long as you trained the specialized opening nets overall? Probably not. So you'd need some smart way to do this instead of just training multiple specialized nets by themselves if the training time/game generation time is the main limitation (I know it is for LC0, not sure about SF NNUE). Maybe the nets could have different architectures depending on the opening as well.

Pi4Chess · Post by **Pi4Chess** » Wed Dec 02, 2020 8:07 pm

mmt wrote: ↑Wed Dec 02, 2020 6:18 pm Yes, they would be better, assuming you spent the same amount of time training each of them as the general net.

Has this been tested yet, even at a tiny scale?

Irreversible openings make the tree smaller after all and each opening should still have more than enough data to generalize. But would they be better than the general net if you trained the general net for as long as you trained the specialized opening nets overall? Probably not. So you'd need some smart way to do this instead of just training multiple specialized nets by themselves if the training time/game generation time is the main limitation (I know it is for LC0, not sure about SF NNUE). Maybe the nets could have different architectures depending on the opening as well.

Yeah, time seems to be the key there. So let's assume this hypothesis, for example: one net dedicated to 1.e4 and other open games, plus another dedicated to 1.d4 and other closed games (please, no debate about open vs. closed; it is just a rough example), each trained on the same number of games as a generalist net. Then the two nets would be better than a single one when used in the positions they have been trained for, but not if each "dedicated net" is trained on only half the number of games?

So in this case it would not be a matter of time, because the nets can be trained in parallel over the same period. It is just a matter of money and computational capacity.

But of course all of this is just "uninformed" wondering.
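The trade-off in this hypothesis can be put into numbers with a rough sketch. It assumes every training game costs the same, that each worker trains one net at a time, and that game count is the only cost; the `training_cost` helper and the figures are hypothetical, not measurements from LC0 or Stockfish.

```python
from typing import Tuple


def training_cost(games_per_net: int, num_nets: int, workers: int) -> Tuple[int, int]:
    """Return (total games generated, wall-clock time in game units).

    Assumes one net per worker at a time and equal cost per game.
    """
    total_games = games_per_net * num_nets
    rounds = -(-num_nets // workers)  # ceiling division: batches of nets
    wall_clock = rounds * games_per_net
    return total_games, wall_clock


# One general net vs. two dedicated nets trained in parallel.
general = training_cost(1_000_000, num_nets=1, workers=1)
two_full = training_cost(1_000_000, num_nets=2, workers=2)
two_half = training_cost(500_000, num_nets=2, workers=2)
```

Under these assumptions, two fully trained dedicated nets take the same wall-clock time as the general net but twice the total games, while two half-trained nets match the general net's total games and finish in half the time.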

Pi4Chess · Post by **Pi4Chess** » Wed Dec 02, 2020 8:13 pm

Cornfed wrote: ↑Wed Dec 02, 2020 6:11 pm Is there some downside to one net that incorporates differing positions/structures?

I am not thinking of a downside of one net. I am wondering about a possible upside in performance from 2 or more "dedicated nets" (split by opening type, for the moment).

dkappe · Post by **dkappe** » Wed Dec 02, 2020 8:30 pm

I can only speak to endgame nets. With Leela nets and positions of 18 pieces or fewer, the endgame nets performed dramatically better than generalist nets of comparable size. The same experiment with NNUE resulted in a net that was much weaker at endgames than a generalist net.
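A switch on piece count like the one described for the Leela experiment could be sketched as follows. This is only an illustration: the net names and both helpers are hypothetical, and counting kings and pawns toward the 18-piece threshold is an assumption, since the post does not say how pieces were counted.

```python
def piece_count(fen: str) -> int:
    """Count all pieces (kings and pawns included) in the board field of a FEN."""
    board = fen.split()[0]
    # Letters are pieces; digits are runs of empty squares; '/' separates ranks.
    return sum(ch.isalpha() for ch in board)


def pick_net_by_material(fen: str, threshold: int = 18) -> str:
    """Use the endgame net at or below the threshold, else the general net."""
    return "endgame.net" if piece_count(fen) <= threshold else "general.net"


START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
ROOK_ENDING = "8/5k2/8/8/8/8/4K3/4R3 w - - 0 1"
```

With these helpers, the starting position (32 pieces) maps to the general net and the rook ending (3 pieces) maps to the endgame net.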

Pi4Chess · Post by **Pi4Chess** » Wed Dec 02, 2020 10:09 pm

dkappe wrote: ↑Wed Dec 02, 2020 8:30 pm I can only speak to endgame nets. With Leela nets and positions of 18 pieces or fewer, the endgame nets performed dramatically better than generalist nets of comparable size. The same experiment with NNUE resulted in a net that was much weaker at endgames than a generalist net.

Thanks for the input. It seems there is a completely new world of experimentation in how to use and combine the power of neural nets.

dkappe · Post by **dkappe** » Thu Dec 03, 2020 1:21 am

These are very different types of neural nets. The shallow, fully connected NNUE nets are far less complex, but they are being asked to do a simple job.
