The new NNUE-net (nn-308..) seems being weaker

corres · Post by **corres** » Sat Sep 05, 2020 9:56 pm

mehmet123 wrote: ↑Sat Sep 05, 2020 7:10 pm
corres wrote: ↑Sat Sep 05, 2020 3:15 pm I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and I got a result of 3 : 1 for nn-82215..
The number of games is relatively small, but watching the games the tendency is obvious.
What is the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?

For you maybe nothing.
But I made this test for me.

Terje · Post by **Terje** » Sat Sep 05, 2020 10:35 pm

corres wrote: ↑Sat Sep 05, 2020 9:54 pm
Terje wrote: ↑Sat Sep 05, 2020 7:36 pm This thread is just spam, mods should remove it.
You think about your own note?
If you do not agree with me, prove the opposite by running a correct test.
I am curious to read your result!

The correct test has already been posted by syzygy earlier in this thread. If you want to ignore that and use your 100 game test go ahead, but it's of no use to anyone else so keep it to yourself instead of posting it on talkchess.

yurikvelo · Post by **yurikvelo** » Sat Sep 05, 2020 10:39 pm

A widely believed misconception is that an Elo-gainer (a new, stronger version of something) can only increase the number of won/drawn cases, while retaining all the win/draw cases which the previous version could handle.

In reality an Elo-gainer STATISTICALLY increases the number of won games DESPITE getting worse in some aspects of the game (not winning games which the previous, weaker version won, losing games which the previous version could hold as a draw, not solving positions which the previous version solved fast).

mehmet123 · Post by **mehmet123** » Sat Sep 05, 2020 10:44 pm

corres wrote: ↑Sat Sep 05, 2020 9:56 pm
mehmet123 wrote: ↑Sat Sep 05, 2020 7:10 pm
corres wrote: ↑Sat Sep 05, 2020 3:15 pm I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and I got a result of 3 : 1 for nn-82215..
The number of games is relatively small, but watching the games the tendency is obvious.
What is the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?
For you maybe nothing.
But I made this test for me.

Do you understand my question? If the score is 75%, there is only one explanation: the engine can't use the SV 1739 net.

corres · Post by **corres** » Sat Sep 05, 2020 11:39 pm

Terje wrote: ↑Sat Sep 05, 2020 10:35 pm
corres wrote: ↑Sat Sep 05, 2020 9:54 pm
Terje wrote: ↑Sat Sep 05, 2020 7:36 pm This thread is just spam, mods should remove it.
You think about your own note?
If you do not agree with me, prove the opposite by running a correct test.
I am curious to read your result!
The correct test has already been posted by syzygy earlier in this thread. If you want to ignore that and use your 100 game test go ahead, but it's of no use to anyone else so keep it to yourself instead of posting it on talkchess.

I decide what I post, not you.
Believe what you want. I do not want to interfere with what you are doing, so spare me your provocation too.

corres · Post by **corres** » Sat Sep 05, 2020 11:47 pm

mehmet123 wrote: ↑Sat Sep 05, 2020 10:44 pm
corres wrote: ↑Sat Sep 05, 2020 9:56 pm
mehmet123 wrote: ↑Sat Sep 05, 2020 7:10 pm
corres wrote: ↑Sat Sep 05, 2020 3:15 pm I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and I got a result of 3 : 1 for nn-82215..
The number of games is relatively small, but watching the games the tendency is obvious.
What is the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?
For you maybe nothing.
But I made this test for me.
Do you understand my question? If the score is 75%, there is only one explanation: the engine can't use the SV 1739 net.

Did you run the test, or did I?
As I wrote, the test consisted of 100 (one hundred) games and ended with a result of 3 : 1 (the others were draws, obviously) for the nn-82215... net.
Do YOU understand me??
Maybe you test for %, but I test for points.
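
For readers following the "%" versus "points" disagreement: a result of 3 : 1 with the other 96 games drawn can be converted to a score percentage and an approximate Elo difference. The sketch below is only an illustration, not something posted in the thread. It assumes the result was exactly 3 wins, 96 draws and 1 loss, uses the usual logistic Elo formula, and takes a normal approximation for the error margin. The function names `score_to_elo` and `match_summary` are made up for this example.

```python
import math

def score_to_elo(score: float) -> float:
    """Convert a score fraction (0 < score < 1) to an Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def match_summary(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (score, elo, elo_low, elo_high) for a match result.

    The interval is a normal approximation at roughly 95% confidence (z = 1.96).
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    variance = (
        wins * (1.0 - score) ** 2
        + draws * (0.5 - score) ** 2
        + losses * (0.0 - score) ** 2
    ) / games
    stderr = math.sqrt(variance / games)
    low, high = score - z * stderr, score + z * stderr
    return score, score_to_elo(score), score_to_elo(low), score_to_elo(high)

# Assumed result from the thread: 3 wins, 1 loss, the other 96 games drawn.
score, elo, elo_low, elo_high = match_summary(wins=3, draws=96, losses=1)
print(f"score = {score:.1%}")
print(f"Elo difference = {elo:+.1f} (approx. 95% interval {elo_low:+.1f} to {elo_high:+.1f})")
```

Under these assumptions the score is 51%, not 75%, which works out to roughly +7 Elo, and the approximate interval extends below zero, so 100 games with this many draws cannot separate the two nets.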

corres · Post by **corres** » Sun Sep 06, 2020 12:20 am

yurikvelo wrote: ↑Sat Sep 05, 2020 10:39 pm A widely believed misconception is that an Elo-gainer (a new, stronger version of something) can only increase the number of won/drawn cases, while retaining all the win/draw cases which the previous version could handle.
In reality an Elo-gainer STATISTICALLY increases the number of won games DESPITE getting worse in some aspects of the game (not winning games which the previous, weaker version won, losing games which the previous version could hold as a draw, not solving positions which the previous version solved fast).

Do you think that if you run a test on one machine, where the same engine plays against itself and the only difference is the NNUE net used (nn-822... on one side and nn-308... on the other), the result will not show the difference in strength between the two nets? I think it will.
I think this exchange of posts is entirely superfluous.
Only one remark is acceptable: a test consisting of only 100 games is not enough to decide the order of strength. I referred to this in my own post, and it is no accident that my post is titled:
THE NEW NNUE-NET (nn-308..) S E E M S BEING WEAKER. So this is my subjective opinion, and it is independent of the "official" result that syzygy referred to. And it is my decision alone which net I will use (nn-82215...).
It would have been more correct of syzygy to refer only to the "official" test result.
But he also likes to provoke others.

mehmet123 · Post by **mehmet123** » Sun Sep 06, 2020 12:31 am

Terje wrote: ↑Sat Sep 05, 2020 7:36 pm This thread is just spam, mods should remove it.

I think you are right. No score, no pgn games.

chrisw · Post by **chrisw** » Sun Sep 06, 2020 12:37 am

mehmet123 wrote: ↑Sun Sep 06, 2020 12:31 am
Terje wrote: ↑Sat Sep 05, 2020 7:36 pm This thread is just spam, mods should remove it.
I think you are right. No score, no pgn games.

Someone tell me this is just a dream

mehmet123 · Post by **mehmet123** » Sun Sep 06, 2020 12:46 am

I've seen more extreme examples of this. A few years ago, one person on a chess forum strongly claimed that Critter 1.6 was stronger than Houdini 2.0.
He tested with only two games, and Critter won both. No matter how much evidence I showed him, he accepted none of it. Neither my tests nor the rating lists mattered to him.
