NNUE and game phase

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

Dann Corbit
Posts: 12040
Joined: Wed Mar 08, 2006 7:57 pm
Location: Redmond, WA USA
Contact:

NNUE and game phase

Post by Dann Corbit » Mon Jan 18, 2021 12:54 pm

Strong chess engines carry information that helps them handle changes in game state from opening through middlegame to endgame (some schemes more complicated than others).

Why not do this with NNUE engines?

Analyze 100 million opening positions to make an opening NNUE clump.
Analyze 100 million midgame positions to make a midgame NNUE clump.
Analyze 100 million endgame positions to make an ending NNUE clump.
Then smoothly interpolate as we go from one phase to the next.

IOW, an edge a- or h-pawn is worth less than a center pawn in the opening and more in the endgame. One shoe size does not fit all.
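A rough sketch of the proposal above, with illustrative names (the phase weights are the common tapered-eval values; the three nets are stand-ins for separately trained NNUE evaluators):

```python
# Sketch of blending three hypothetical phase-specific nets by a tapered
# game-phase weight. PHASE_WEIGHTS and MAX_PHASE follow the usual
# tapered-eval convention (minor = 1, rook = 2, queen = 4; 24 at full material).

PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}
MAX_PHASE = 24

def game_phase(piece_counts):
    """Return 0.0 for a bare endgame up to 1.0 for full opening material."""
    phase = sum(PHASE_WEIGHTS[p] * n
                for p, n in piece_counts.items() if p in PHASE_WEIGHTS)
    return min(phase, MAX_PHASE) / MAX_PHASE

def blended_eval(pos, opening_net, midgame_net, endgame_net):
    """Smoothly interpolate between adjacent phase nets."""
    t = game_phase(pos.piece_counts)
    if t > 0.5:
        w = (t - 0.5) * 2          # opening <-> midgame on the upper half
        return w * opening_net(pos) + (1 - w) * midgame_net(pos)
    w = t * 2                      # midgame <-> endgame on the lower half
    return w * midgame_net(pos) + (1 - w) * endgame_net(pos)
```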
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.

Dann Corbit
Posts: 12040
Joined: Wed Mar 08, 2006 7:57 pm
Location: Redmond, WA USA
Contact:

Re: NNUE and game phase

Post by Dann Corbit » Mon Jan 18, 2021 12:57 pm

The reason that this occurs to me is that LC0 is a tiger in the opening and a pussy-cat in the endgame.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.

hgm
Posts: 25948
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: NNUE and game phase

Post by hgm » Mon Jan 18, 2021 1:05 pm

A single NNUE already does this automatically. The KPST that determine the inputs of the NN (as SUM_over_sqr KPST[n][pieceType[sqr]][sqr][kingSqr]) can also be used to calculate game phase (by making the KPST independent of sqr and kingSqr, and using the same sign for both colors). And when it is useful, the net will certainly learn to do that.
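A toy illustration of the point above: a KPST slice whose entries ignore sqr and kingSqr reduces to a weighted material count, i.e. a game-phase signal the net can learn on its own. Dimensions and names here are illustrative, not real HalfKP tables.

```python
# Toy KPST feature sums of the form SUM_over_sqr KPST[pt][sqr][kingSqr].
# A slice that is constant over sqr and kingSqr yields a material-weighted
# piece count, usable as a game-phase input.

PIECE_TYPES = ["P", "N", "B", "R", "Q"]

def feature_sum(board, kpst_entry):
    """board: list of (pieceType, sqr, kingSqr) tuples; kpst_entry[pt][sq][ksq]."""
    return sum(kpst_entry[pt][sq][ksq] for pt, sq, ksq in board)

def phase_feature():
    """A KPST slice constant over square and king square: its feature sum
    is just a weighted material count (pawns weighted 0, as usual)."""
    weights = {"P": 0, "N": 1, "B": 1, "R": 2, "Q": 4}
    return {pt: [[weights[pt]] * 64 for _ in range(64)] for pt in PIECE_TYPES}
```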

Look
Posts: 262
Joined: Thu Jun 05, 2014 12:14 pm
Location: Iran
Full name: Mehdi Amini

Re: NNUE and game phase

Post by Look » Tue Jan 19, 2021 7:51 am

Hi,

As I mentioned in another thread, there could be different types of NNUEs. The one related to game phase could be called king shelter: that is, a bonus for pawns in front of the king. This should work well in the middlegame, securing the king from the opponent's attacks. In the endgame, however, this is not much of an issue, since the king should become active, like a knight or a bishop.
Farewell.

Sesse
Posts: 275
Joined: Mon Apr 30, 2018 9:51 pm
Contact:

Re: NNUE and game phase

Post by Sesse » Tue Jan 19, 2021 10:05 am

Look wrote:
Tue Jan 19, 2021 7:51 am
Hi,

As I mentioned in another thread, there could be different types of NNUEs. The one related to game phase could be called king shelter: that is, a bonus for pawns in front of the king. This should work well in the middlegame, securing the king from the opponent's attacks. In the endgame, however, this is not much of an issue, since the king should become active, like a knight or a bishop.
As was pointed out in both that thread and this thread, the network does this itself automatically if it finds it useful. It will find its own king shelter feature, its own consideration of game phase, and how to weight it depending on game phase. Except you can't really untangle it from everything else that's going on.

tomitank
Posts: 258
Joined: Sat Mar 04, 2017 11:24 am
Location: Hungary

Re: NNUE and game phase

Post by tomitank » Tue Jan 19, 2021 7:48 pm

Dann Corbit wrote:
Mon Jan 18, 2021 12:54 pm
Strong chess engines carry information that helps them handle changes in game state from opening through middlegame to endgame (some schemes more complicated than others).

Why not do this with NNUE engines?

Analyze 100 million opening positions to make an opening NNUE clump.
Analyze 100 million midgame positions to make a midgame NNUE clump.
Analyze 100 million endgame positions to make an ending NNUE clump.
Then smoothly interpolate as we go from one phase to the next.

IOW, an edge a- or h-pawn is worth less than a center pawn in the opening and more in the endgame. One shoe size does not fit all.
Because it's futile. It depends on the inputs; the NN learns this.
Please read more about neural networks.

mmt
Posts: 343
Joined: Sun Aug 25, 2019 6:33 am
Full name: .

Re: NNUE and game phase

Post by mmt » Thu Jan 21, 2021 1:03 am

tomitank wrote:
Tue Jan 19, 2021 7:48 pm
Because it's futile. It depends on the inputs; the NN learns this.
Please read more about neural networks.
Smug and wrong, not a good combo. It might easily turn out that a different board representation as input to the NN, a different NN architecture, or a different number of parameters is superior in the endgame as opposed to the opening: you might get a smaller or more parallelizable net (allowing deeper search), faster training, or lower resulting loss. E.g. if you're in an endgame without bishops and queens, all the knowledge the NN has about bishops and queens could slow down inference and lead to a shallower search.

tomitank
Posts: 258
Joined: Sat Mar 04, 2017 11:24 am
Location: Hungary

Re: NNUE and game phase

Post by tomitank » Thu Jan 21, 2021 6:24 am

mmt wrote:
Thu Jan 21, 2021 1:03 am
tomitank wrote:
Tue Jan 19, 2021 7:48 pm
Because it's futile. It depends on the inputs; the NN learns this.
Please read more about neural networks.
Smug and wrong, not a good combo. It might easily turn out that a different board representation as input to the NN, a different NN architecture, or a different number of parameters is superior in the endgame as opposed to the opening: you might get a smaller or more parallelizable net (allowing deeper search), faster training, or lower resulting loss. E.g. if you're in an endgame without bishops and queens, all the knowledge the NN has about bishops and queens could slow down inference and lead to a shallower search.
I add the NN evaluation to the HCE, so I had to train an extremely small net (768x16x1), and I used only 2.7M examples.
This is not the norm either, but I don't see the point in splitting it into multiple NNs.
If you have enough training examples, this is not necessary (IMO).
If there is one that works better, please let me know.

mmt
Posts: 343
Joined: Sun Aug 25, 2019 6:33 am
Full name: .

Re: NNUE and game phase

Post by mmt » Sat Jan 23, 2021 2:46 am

I don't know what would work well in your case. I did an experiment with a specific ending to check how well a custom-trained net can predict whether a position is a mate or not and the net could get 98% right vs 90% for SF NNUE. Based on this, I am sure that it's possible to improve SF NNUE by some Elo points by having multiple different nets, as this pretty much proves it. But is it worth the additional complexity and training costs for what could be a small gain? That's unclear. But the main idea does not deserve to be dismissed out of hand just because a general net can calculate the game state and "choose" different evals itself.

There is an important difference between NNs for chess and other board games and most other neural nets: we can generate huge amounts of training data very cheaply. This opens up some ways of doing things that don't generally apply to all NNs. Having multiple specialized nets with various architectures and inputs is one of the possibilities thanks to this, as you'll never run out of training data for your specialized net.
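One way the specialized-net idea above could be wired up, sketched with purely illustrative names: dispatch on the position's material signature, falling back to a general net when no specialist was trained for that material.

```python
# Hypothetical dispatch between specialized nets keyed by remaining piece
# types. A net trained only on, say, queenless rook endings can use a smaller
# input vector and thus evaluate faster.

def material_key(piece_counts):
    """Coarse bucket: which non-pawn piece types remain on the board."""
    return frozenset(p for p, n in piece_counts.items() if n > 0 and p in "QRBN")

def pick_net(piece_counts, specialist_nets, general_net):
    """Use a specialist when one matches the material; else the general net."""
    return specialist_nets.get(material_key(piece_counts), general_net)
```

Training data is indeed not the bottleneck here: positions for any chosen material bucket can be generated cheaply by self-play, as the post notes.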

hgm
Posts: 25948
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller
Contact:

Re: NNUE and game phase

Post by hgm » Sat Jan 23, 2021 9:25 am

It is bound to be worse, because what you are in fact doing is forging a large net from multiple smaller nets, while forcing a fixed way to combine their results (like linear tapering) without giving the net the opportunity to optimize that. If you had started with a single bigger net, trained on the combined training sets, it would have had this opportunity, and would have used it to do better.

Of course a larger net can do better than a small net; no surprise in that. It comes at a price, though, in terms of the nodes per second you can evaluate.
