Re: Is this SF NN almost like 20 MB book?
Posted: Tue Aug 04, 2020 11:48 pm
Well, thing B has an approximate eval of depth 12 or 16 or 18 or whatever stored in 20 MB of data. What would be the score if thing B had a 20 MB book instead?
Well, do the following test: 1 & 5 you already have; now do 6 & 10, then 11 & 15. See a pattern?

Milos wrote: ↑Wed Aug 05, 2020 12:01 am
Well, thing B has an approximate eval of depth 12 or 16 or 18 or whatever stored in 20 MB of data. What would be the score if thing B had a 20 MB book instead? All you actually managed to demonstrate with your dull essay about NNUE analysis is that NNUE is quite a crappy book. Not that it's not.
The difference would actually slightly shrink from 1 vs 5 to 6 vs 10, and then it would grow back to the original one. But again, this only tells us about the search; it tells us nothing about the evaluation.

dkappe wrote: ↑Wed Aug 05, 2020 12:06 am
Well, do the following test: 1 & 5 you already have; now do 6 & 10, then 11 & 15. See a pattern?
Have you actually run the test, or are you just speculating?

Milos wrote: ↑Wed Aug 05, 2020 1:32 am
The difference would actually slightly shrink from 1 vs 5 to 6 vs 10, and then it would grow back to the original one. But again, this only tells us about the search; it tells us nothing about the evaluation. Regarding the book, its impact is significantly reduced once you go to higher depths. But that is only the case with general books like Cerebellum; with a targeted book that is ofc not the case. My point is that using a general book generated by the engine itself is not much different (fairness-wise) from using an internal eval trained by the same engine.
Isn’t that true of literally every evaluation function? Let your evaluation function be “return rand();” and you can use it to generate very crappy opening evals, too. Yet nobody would consider this an opening book.

Milos wrote: ↑Wed Aug 05, 2020 12:01 am
Well, thing B has an approximate eval of depth 12 or 16 or 18 or whatever stored in 20 MB of data. What would be the score if thing B had a 20 MB book instead? All you actually managed to demonstrate with your dull essay about NNUE analysis is that NNUE is quite a crappy book. Not that it's not.
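The "return rand();" point above can be sketched in a few lines. Below is a minimal, purely hypothetical toy searcher (the move list, game rules, and negamax routine are illustrative stand-ins, not any real engine's code): even with a literally random evaluation, the search still produces a score for every opening move, yet nobody would call the resulting score table an opening book.

```python
import random

# Toy "game": a state is the tuple of moves played so far; the game
# simply ends after MAX_PLY moves. These names are made-up stand-ins.
MOVES = ["e4", "d4", "c4", "Nf3"]
MAX_PLY = 3

def evaluate(state):
    # The deliberately useless evaluation from the post: "return rand();"
    return random.random()

def negamax(state, depth):
    """Return (score, best_move) from the side-to-move's perspective."""
    if depth == 0 or len(state) >= MAX_PLY:
        return evaluate(state), None
    best_score, best_move = float("-inf"), None
    for move in MOVES:
        score, _ = negamax(state + (move,), depth - 1)
        score = -score  # negamax sign flip
        if score > best_score:
            best_score, best_move = score, move
    return best_score, best_move

# The search happily "rates" every opening move from a random eval --
# a table of opening scores, but clearly not an opening book.
opening_evals = {m: -negamax((m,), 2)[0] for m in MOVES}
```

The same argument applies unchanged if `evaluate` is replaced by any other evaluation function: producing scores for opening positions is a property of having *an* eval plus a search, not of storing a book.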
Superfinal results haven’t been statistically significant for a while, it’s on hardware that nobody would consider accessible, the engines constantly update and the results could theoretically be outdated not long after the sufi begins, etc. Getting “valid” superfinal results is incredibly challenging. Attributing that challenge solely to NN opening behavior is absurd. It’s just a drop in the ocean of other issues.

Twipply wrote: ↑Tue Aug 04, 2020 11:48 pm
I reacted strongly not because of feelings, but because I think this topic has basically invalidated some of the more recent TCEC Superfinal results, and the admins there should stop ignoring it. However, even if my feelings were hurt, that would not invalidate what I've said, nor would it validate your post.
Thanks. I'm glad it worked well for you.
The initial part of the game is not quiet. You are conflating two separate suspicions. (The first suspicion probably has a lot more evidence for it than the second.)

Dann Corbit wrote: ↑Tue Aug 04, 2020 7:59 pm
I suspect that NN approaches work very well for the initial, quiet part of the game.
It's quite the opposite! NNUE isn't learning what opening moves are good. It's learning what moves are good against the openings it plays.
My mistake, I didn't mean to suggest that the "NN=book" idea is my only issue with the validity of the TCEC Superfinals. When I said they're invalid, I meant it in the sense that if it's not a fair fight then I don't care about the result - not unless the underdog manages to win despite the handicap. Of course, any engine author should realise that the Superfinal results are not likely to be statistically significant, myself included.

cucumber wrote: ↑Wed Aug 05, 2020 9:17 am
Superfinal results haven't been statistically significant for a while, it's on hardware that nobody would consider accessible, the engines constantly update and the results could theoretically be outdated not long after the sufi begins, etc. Getting "valid" superfinal results is incredibly challenging. Attributing that challenge solely to NN opening behavior is absurd. It's just a drop in the ocean of other issues.