Fat Fritz destroyed Stockfish!

Ozymandias · Post by **Ozymandias** » Mon Nov 18, 2019 9:36 am

Laskos wrote: ↑Mon Nov 18, 2019 1:25 am Anyway, it is fairly clear that Fat Fritz is in the same pool as the main Lc0 zero runs and, to put it harshly, is just a crippled 42850 or T40B4_200. The supervised learning brought little, even style-wise.

This could mean that the zero approach is best for the project, or that the selection of games fed to the training wasn't optimal. In this case it's probably both. I know for a fact that people heavily underestimate what you need to create a good DB; the only other person I've come across who actually does what needs to be done is Nelson Hernandez.

corres · Post by **corres** » Mon Nov 18, 2019 9:36 am

I think that, especially with a relatively weak GPU (weaker than an RTX 2080 Ti), Leela is weakened considerably in the endgame if the time increment is too low (< 1 sec). Because of this, it is more correct for Leela to use a short time control of some minutes plus an increment of more than 1 sec per move.
Another note: using ponder is correct only if the match is PC against PC, and incorrect if the match is engine against engine, because in that case pondering disturbs the other engine's calculation.

George Tsavdaris · Post by **George Tsavdaris** » Mon Nov 18, 2019 10:30 am

Dann Corbit wrote: ↑Mon Nov 18, 2019 1:37 am Re: "The supervised learning brought little even style wise."
My impression is that Fat Fritz is clearly better than the average NN tactically and worse positionally.

I have no idea how this translates to games.

My impression from seeing some 120+ games of Fat Fritz is exactly the same!!

Lion · Post by **Lion** » Mon Nov 18, 2019 10:49 am

George Tsavdaris wrote: ↑Sun Nov 17, 2019 7:16 pm Surely a nonsense title, but still its performance in this match was spectacular.
Of course too few games to conclude anything about their strength, but still Fat Fritz seems extremely strong!

Conditions:
Time control 1 minute + 1 second/move.
TCEC Season 16 superfinal openings (50 openings, so 100 games).
Stockfish dev (14 November 2019) with 30 cores on a Threadripper 2990WX (32 cores/64 threads) at 3.1 GHz.
Fat Fritz v266 with RTX 2080 + 2060 and 2 threads of the above 2990WX.
Fat Fritz (FF) was getting some 45 kN/s and Stockfish (SF) some 38 MN/s.
Full 3-, 4-, 5- and 6-man and some 7-man Syzygy TBs on an NVMe M.2 SSD.
I chose 30 cores for SF in order to get the same SF/FF nodes-per-second ratio as at TCEC, so that the performances would be comparable to TCEC's.

Result:
FatFritz_v266 - Stockfish_141119, +19 -15 =66, 52.0-48.0, +14±40 Elo, LOS=75.4 %
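
For anyone who wants to check the result line, the Elo difference and LOS can be recomputed from the win/loss/draw counts alone. This is only a minimal sketch: it assumes the usual logistic Elo formula and the normal-approximation LOS formula, the function names are made up for illustration, and the error-margin method is an assumption, not necessarily what the match tool used.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a score fraction (logistic Elo model)."""
    return 400.0 * math.log10(score / (1.0 - score))

def los(wins, losses):
    """Likelihood of superiority, normal approximation; draws drop out."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

def elo_margin(wins, losses, draws, z=1.96):
    """Half-width of an approximate 95% interval in Elo (assumed method)."""
    n = wins + losses + draws
    s = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1 - s) ** 2 + losses * s ** 2 + draws * (0.5 - s) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(s + z * se) - elo_from_score(s - z * se)) / 2.0

wins, losses, draws = 19, 15, 66  # counts from the result line above
score = (wins + 0.5 * draws) / (wins + losses + draws)
print(f"score={score:.3f} elo={elo_from_score(score):+.1f} "
      f"margin=+/-{elo_margin(wins, losses, draws):.0f} "
      f"LOS={los(wins, losses):.1%}")
```

With these counts it comes out at roughly +14 Elo, a margin of about 40 and an LOS of about 75.4%, in line with the reported figures.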

In fact ChessBase, with its 2 articles, undermined its own promotion of Fat Fritz!
In the matches from the ChessBase articles, SF was getting 16,000,000 N/s while FF was getting 11,000 N/s.
This SF/FF N/s ratio is about 1.69 times bigger than what TCEC has for SF/T40_Leela, and so also for SF/FF (T40's and FF's speeds are similar, since they use the same binary and both nets are 20x256).
So they gave Stockfish a very big advantage over Fat Fritz relative to the TCEC conditions they probably wanted to compare with.
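
The speed-ratio argument is plain arithmetic, so here is a short sketch using only the numbers quoted in this thread. The TCEC ratio is not stated directly; the value below is merely what the "1.69 times" claim implies, and the variable names are made up for illustration.

```python
def nps_ratio(sf_nps, ff_nps):
    """How many Stockfish nodes are searched per Fat Fritz node."""
    return sf_nps / ff_nps

# Figures quoted in the thread (N/s).
chessbase = nps_ratio(16_000_000, 11_000)    # ChessBase article matches
this_match = nps_ratio(38_000_000, 45_000)   # the 100-game match above

# Assumption: TCEC ratio back-calculated from the "1.69 times bigger" claim.
implied_tcec = chessbase / 1.69

print(f"ChessBase SF/FF ratio:  {chessbase:.0f}")
print(f"This match SF/FF ratio: {this_match:.0f}")
print(f"Implied TCEC ratio:     {implied_tcec:.0f}")
```

On these numbers the 30-core setup lands within a few percent of the implied TCEC ratio, while the ChessBase setup gives Stockfish far more nodes per Fat Fritz node.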

That is bad advertising! They showed a performance of just +15 Elo for FF over SF-10, while with an N/s ratio closer to TCEC's the difference would be +50 Elo or more.
E.g., just by using fp16 for FF in Leela's Lc0 binary they would get 3x the N/s, so some 40-60 Elo more.

Hello

I have Fat Fritz, but what is "FatFritz_v266"? Is that a newer version of the weights?
If yes, where can I find it?

rgds

corres · Post by **corres** » Mon Nov 18, 2019 10:51 am

George Tsavdaris wrote: ↑Mon Nov 18, 2019 10:30 am
Dann Corbit wrote: ↑Mon Nov 18, 2019 1:37 am Re: "The supervised learning brought little even style wise."
My impression is that Fat Fritz is clearly better than the average NN tactically and worse positionally.
I have no idea how this translates to games.
My impression from seeing some 120+ games of Fat Fritz is exactly the same!!

In part, this is also the effect of using human games for learning.

George Tsavdaris · Post by **George Tsavdaris** » Mon Nov 18, 2019 10:53 am

Lion wrote: ↑Mon Nov 18, 2019 10:49 am Hello

I have Fat Fritz, but what is "FatFritz_v266"? Is that a newer version of the weights?
If yes, where can I find it?

Yes, I think it's an updated one compared with the ChessBase release.
Albert gave it to me. I guess it will be in the official update of Fritz 17 in some weeks.
In fact, I think Albert said the update will be an even stronger version than v266.

Ovyron · Post by **Ovyron** » Mon Nov 18, 2019 10:59 am

Ozymandias wrote: ↑Mon Nov 18, 2019 9:36 am I know for a fact that people heavily underestimate what you need to create a good DB; the only other person I've come across who actually does what needs to be done is Nelson Hernandez.

Really now? My Database is good enough to match yours and always come up with at least an equal position (once we're out of it), so either you need to count me as the third person, or make a claim about what makes a database "good" so I can see how mine doesn't apply.

Ozymandias · Post by **Ozymandias** » Mon Nov 18, 2019 11:56 am

Ovyron wrote: ↑Mon Nov 18, 2019 10:59 am
Ozymandias wrote: ↑Mon Nov 18, 2019 9:36 am I know for a fact that people heavily underestimate what you need to create a good DB; the only other person I've come across who actually does what needs to be done is Nelson Hernandez.
Really now? My Database is good enough to match yours and always come up with at least an equal position (once we're out of it), so either you need to count me as the third person, or make a claim about what makes a database "good" so I can see how mine doesn't apply.

I haven't said that a good DB will get you better practical results than an inferior one. How is it any better? For example, you can have 0.1% duplicated games instead of 1%. You can have a unique name for each player instead of several. You can erase incorrect Elo values for players who were very young 30 or 40 years ago. The list of improvements goes on and on, because we can agree those would be improvements, right?

BTW, I haven't used my DB in a game since 2014 and it hasn't even been updated since 2017.

Ovyron · Post by **Ovyron** » Mon Nov 18, 2019 12:16 pm

Ozymandias wrote: ↑Mon Nov 18, 2019 11:56 am BTW, I haven't used my DB in a game since 2014 and it hasn't even been updated since 2017.

So the question is why is it called a "good database" if there's no use for it? Has Nelson stopped updating his database as well, or what does he use it for?

I check my database every time I start a new game and update it on a regular basis, but few games make it in, and I have found the statistics become more accurate if I delete old, irrelevant games. So it has been steadily shrinking, to the point where training a NN with it would probably produce garbage results, but I don't expect a DB of games from 2017 and earlier would shine either. I wonder what kind of results NN training on Nelson's DB would get. I'm afraid any DB would need a lot of non-trivial work to be used as a training set, but unless we can take a look at Albert Silver's database, we can't know whether the zero approach is better or whether someone finding the right set of games could produce the best weights file in a much shorter time than the zero approach would allow.

In all areas I've seen, supervised learning has improved over unsupervised learning. I'd be greatly surprised if this were the exception, but how to supervise it well remains to be seen.

Nordlandia · Post by **Nordlandia** » Mon Nov 18, 2019 3:40 pm

corres wrote: ↑Mon Nov 18, 2019 9:36 am I think that, especially with a relatively weak GPU (weaker than an RTX 2080 Ti), Leela is weakened considerably in the endgame if the time increment is too low (< 1 sec). Because of this, it is more correct for Leela to use a short time control of some minutes plus an increment of more than 1 sec per move.
Another note: using ponder is correct only if the match is PC against PC, and incorrect if the match is engine against engine, because in that case pondering disturbs the other engine's calculation.

That is not necessarily true on a chip with many cores. I agree that pondering in alpha-beta vs. alpha-beta is a direct downgrade in performance. It is not so apparent for a neural network against alpha-beta, since the NN makes only a modest claim on CPU resources.
