Fat Fritz question
Re: Fat Fritz question
dkappe wrote: ↑Sun Jul 18, 2021 11:02 pm
As Sopel said, if you take the public data, etc., and just run the scripts, you basically get the master net, plus or minus some random variation. Zzzzzz.
If you use the data from other engines and build a somewhat different training framework (I've got a few lying about from non-chess pytorch projects), it's a bit more challenging, especially if you have an order of magnitude less data than what the SF project brings to bear.

Yes, generating 16+ million Lc0 games for a 256x20 net at 900 nodes per move is considerably more time-consuming and challenging, even with the best GPU on the market at the time I did this (the 2080 Ti). And that is before training and testing each and every possible NNUE structure, such as:
192x2x16x16
192x2x24x24
192x2x32x32
192x2x48x48
192x2x64x64
192x3x32x32
256x2x16x16
256x2x24x24
256x2x32x32
256x2x48x48
256x2x64x64
256x3x32x32
etc.
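To give a sense of why each of these structures is a separate, expensive experiment, here is a hypothetical back-of-the-envelope sketch of their sizes. It reads each entry as (feature-transformer width) x (perspectives) x (hidden1) x (hidden2) and assumes a HalfKP input of 41024 features per perspective; both the notation reading and the 41024 figure are my assumptions, not stated in the post.

```python
# Assumption: HalfKP input, 64 king squares * 641 piece-square features
# per perspective. All names here are illustrative, not from the post.
HALFKP_FEATURES = 41024

def param_count(ft, persp, h1, h2):
    """Approximate trainable parameters (weights + biases) for one
    candidate structure, read as ft x persp x h1 x h2."""
    transformer = HALFKP_FEATURES * ft + ft  # shared per-perspective layer
    layer1 = (ft * persp) * h1 + h1          # concatenated accumulators -> h1
    layer2 = h1 * h2 + h2
    output = h2 * 1 + 1
    return transformer + layer1 + layer2 + output

for spec in ["192x2x16x16", "192x2x32x32", "256x2x32x32", "256x2x64x64"]:
    ft, persp, h1, h2 = map(int, spec.split("x"))
    print(f"{spec}: ~{param_count(ft, persp, h1, h2) / 1e6:.1f} M parameters")
```

Under these assumptions the feature transformer dominates (roughly 10.5 M of the parameters in a 256-wide net), which is why the 192-vs-256 choice matters far more for training cost than the small hidden layers do.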
Each test took an average of 4-5 days of computer time on a 32-thread machine, and that is before the many hours spent testing each resulting net. Needless to say, this was all done using the Nodchip training code, not the newer pytorch pipeline.
Some of the results I shared with Dkappe for the benefit of the Komodo project.
Yes, this was true even then. I've never been interested in training with Stockfish data; there are already hundreds, if not thousands, of such nets.
"Tactics are the bricks and sticks that make up a game, but positional play is the architectural blueprint."