Slow 2.4 vs Stockfish 11 scored -35 Elo (general endgame suite).
I didn't fully train all the nets. Based on the most-trained subsets, I expect the path to -10 Elo or better against SF11 to be straightforward with more training and maybe minor adjustments.
I have 8 endgame nets. These are the ones I have active:
Code:
EndgameNets.push_back(new OnePieceEndgameNet(ROOK));
EndgameNets.push_back(new OnePieceEndgameNet(BISHOP));
EndgameNets.push_back(new OnePieceEndgameNet(QUEEN));
EndgameNets.push_back(new OnePieceEndgameNet(KNIGHT));
EndgameNets.push_back(new OneDiffPieceEndgameNet(BISHOP,KNIGHT));
EndgameNets.push_back(new OneDiffPieceEndgameNet(ROOK,BISHOP));
EndgameNets.push_back(new OneDiffPieceEndgameNet(ROOK,KNIGHT));
EndgameNets.push_back(new GeneralEndgameNet());
In the last test I ran against SF11, with more opening variety, the Rook net was even after 8000 games, so 10000 opening positions did still lead to a minor amount of overfit. (I wrote a pgn->epd converter that exports the first position in each game for which a net is active, so I think I now have 50,000+ positions for each type.)
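A minimal sketch of the "first position a net is active for" selection, under stated assumptions. `PositionRecord`, `rookNetActive` and `firstActivePosition` are hypothetical names, not from Slow's source, and the activation rule shown (each side has exactly one rook and no other non-pawn piece) is only a guess at what a one-piece rook net covers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical record for one position of a game: its EPD text plus
// non-pawn piece counts for each side.
struct PositionRecord {
    std::string epd;
    int whiteRooks;
    int blackRooks;
    int whiteOtherPieces;  // knights + bishops + queens
    int blackOtherPieces;
};

// ASSUMED activation rule for a one-piece rook net: one rook each,
// no other non-pawn pieces. The real rule may differ.
bool rookNetActive(const PositionRecord& p) {
    return p.whiteRooks == 1 && p.blackRooks == 1 &&
           p.whiteOtherPieces == 0 && p.blackOtherPieces == 0;
}

// Returns the index of the first position for which isActive holds,
// or game.size() if the net is never active in this game.
std::size_t firstActivePosition(
    const std::vector<PositionRecord>& game,
    const std::function<bool(const PositionRecord&)>& isActive) {
    assert(isActive);  // the predicate must be callable
    for (std::size_t i = 0; i < game.size(); ++i) {
        if (isActive(game[i])) {
            return i;
        }
    }
    return game.size();
}
```

Exporting only the first matching position per game gives one EPD line per game and per net type, which keeps the opening set from being dominated by many near-identical positions out of the same game.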
I did also improve the speed, mostly by converting the weights to int16, which also made for smaller file sizes. I went back to STM (side to move) as just another input, and I export the training data twice, once with sides flipped. The net layers for onePiece are 192 x 32 x 32 x 1. I slightly increased the first layer so the weights would be a multiple of 16 for int16 SIMD, but it's not a big difference.
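A rough sketch of the int16 conversion and the multiple-of-16 padding, assuming a simple fixed-point scheme. The scale factor `kScale`, the saturation, and the function names are assumptions for illustration; the post does not say how the weights are actually scaled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Round n up to the next multiple of `lanes` (16 here, as in the post).
inline std::size_t roundUpToMultiple(std::size_t n, std::size_t lanes = 16) {
    assert(lanes > 0);
    return (n + lanes - 1) / lanes * lanes;
}

// HYPOTHETICAL fixed-point scale: a float weight w is stored as round(w * kScale).
constexpr float kScale = 64.0f;

// Quantize float weights to int16 with saturation, zero-padding the result
// so its length is a multiple of 16.
std::vector<std::int16_t> quantizeWeights(const std::vector<float>& w) {
    const float lo = std::numeric_limits<std::int16_t>::min();
    const float hi = std::numeric_limits<std::int16_t>::max();
    std::vector<std::int16_t> out(roundUpToMultiple(w.size()), 0);
    for (std::size_t i = 0; i < w.size(); ++i) {
        float scaled = std::round(w[i] * kScale);
        if (scaled < lo) scaled = lo;  // clamp instead of overflowing
        if (scaled > hi) scaled = hi;
        out[i] = static_cast<std::int16_t>(scaled);
    }
    return out;
}
```

With this layout a 192-wide first layer needs no padding, since `roundUpToMultiple(192)` is already 192, and each int16 weight takes 2 bytes instead of the 4 a float would need.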
It would take me about 24 hours to train a one-piece endgame net that performs on par with Slow 2.3. At that point the net would be overfit, with some weird values, but it would get enough important stuff right to be as good or better. After a couple more days it would be clearly better, but not yet even with SF11. The only nets I tried to train long enough to match SF11 seemed likely to have succeeded.
In the beginning, trying out neural nets was very interesting, but once I got the process down better, the time needed for training/testing was kinda long and my interest waned. (It would be interesting to experiment again if I had like 10+ times the computing power, but as is, it feels like most experiments don't cause major differences, so the results just feel unclear.)
Angrim: I would agree that this is not NNUE style, although I do try to be efficient. I'm still not doing incremental updates but may do that later. Of course, I only update active inputs and skip the 0's. In fact, since the inputs are all 1 or 0, I only have to add the weights for the first layer. Beyond that, the actual net sizes aren't that big.
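To illustrate the "only an add" point: with 0/1 inputs, the first layer reduces to summing the weight rows of the active inputs, with no multiplies. The sketch below assumes a row-per-input weight layout and a bias term. Both are assumptions, as are the names `FirstLayer` and `firstLayerSums`; only the 192 and 32 sizes come from the post.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kInputs = 192;  // first-layer inputs (from the post)
constexpr std::size_t kHidden = 32;   // first hidden layer width (from the post)

// HYPOTHETICAL layout: one row of kHidden int16 weights per input feature.
struct FirstLayer {
    std::array<std::array<std::int16_t, kHidden>, kInputs> weights{};
    std::array<std::int16_t, kHidden> bias{};  // bias term is an assumption
};

// activeInputs holds the indices of the inputs that are 1. Inputs that are 0
// contribute nothing, so they are never visited.
std::array<std::int32_t, kHidden> firstLayerSums(
    const FirstLayer& layer, const std::vector<std::size_t>& activeInputs) {
    std::array<std::int32_t, kHidden> sums{};
    for (std::size_t j = 0; j < kHidden; ++j) {
        sums[j] = layer.bias[j];
    }
    for (std::size_t idx : activeInputs) {
        assert(idx < kInputs);
        // input * weight with input == 1 is just the weight: add the row.
        for (std::size_t j = 0; j < kHidden; ++j) {
            sums[j] += layer.weights[idx][j];
        }
    }
    return sums;
}
```

The inner loop over 32 contiguous int16 values is the kind of loop a compiler or hand-written SIMD can process 16 lanes at a time. An incremental version would keep `sums` between moves and add or subtract only the rows of inputs that changed.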