The new NNUE-net (nn-308..) seems to be weaker

corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

The new NNUE-net (nn-308..) seems to be weaker

Post by corres »

I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and the result was 3 : 1 in favor of nn-82215..
The number of games is relatively small, but from watching the games the tendency is obvious.
syzygy
Posts: 5974
Joined: Tue Feb 28, 2012 11:56 pm

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by syzygy »

corres wrote: Sat Sep 05, 2020 3:15 pm I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and the result was 3 : 1 in favor of nn-82215..
The number of games is relatively small, but from watching the games the tendency is obvious.
Someone else made a test and got these results:

STC:
LLR: 2.98 (-2.94,2.94) {-0.25,1.25}
Total: 108328 W: 14048 L: 13719 D: 80561
Ptnml(0-2): 842, 10039, 32062, 10390, 831
https://tests.stockfishchess.org/tests/view/5f50e053ba100690c5cc5f00

LTC:
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 13872 W: 1059 L: 890 D: 11923
Ptnml(0-2): 30, 724, 5270, 871, 41
https://tests.stockfishchess.org/tests/view/5f51821fba100690c5cc5f36
Which would be more reliable?
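As a rough aid for comparing the two results, the W/L/D totals quoted above can be converted into an Elo estimate with an approximate 95% interval. This is only a sketch: it uses the standard logistic Elo formula and a plain normal approximation on win/draw/loss counts, it assumes that W counts wins for the net under test, and it is not the pentanomial calculation the Stockfish test site itself performs. The function names are made up for illustration.

```python
import math

def elo_from_score(score: float) -> float:
    """Convert a score fraction (0 < score < 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_with_margin(wins: int, losses: int, draws: int, z: float = 1.96):
    """Return (elo, elo_low, elo_high) from W/L/D counts via a normal approximation."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    half_width = z * math.sqrt(var / n)
    return (elo_from_score(score),
            elo_from_score(score - half_width),
            elo_from_score(score + half_width))

# Totals copied from the STC and LTC results quoted in this post.
stc = elo_with_margin(wins=14048, losses=13719, draws=80561)
ltc = elo_with_margin(wins=1059, losses=890, draws=11923)
print("STC: %.2f Elo [%.2f, %.2f]" % stc)
print("LTC: %.2f Elo [%.2f, %.2f]" % ltc)
```

Under these assumptions both estimates are only a few Elo, which gives a sense of the scale of difference the large tests are designed to detect.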
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by corres »

syzygy wrote: Sat Sep 05, 2020 3:45 pm
corres wrote: Sat Sep 05, 2020 3:15 pm I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and the result was 3 : 1 in favor of nn-82215..
The number of games is relatively small, but from watching the games the tendency is obvious.
Someone else made a test and got these results:

STC:
LLR: 2.98 (-2.94,2.94) {-0.25,1.25}
Total: 108328 W: 14048 L: 13719 D: 80561
Ptnml(0-2): 842, 10039, 32062, 10390, 831
https://tests.stockfishchess.org/tests/view/5f50e053ba100690c5cc5f00

LTC:
LLR: 2.96 (-2.94,2.94) {0.25,1.25}
Total: 13872 W: 1059 L: 890 D: 11923
Ptnml(0-2): 30, 724, 5270, 871, 41
https://tests.stockfishchess.org/tests/view/5f51821fba100690c5cc5f36
Which would be more reliable?
For me, my own test, because the conditions were very different, mainly in the opening positions used, the TC, and the number of cores. The Stockfish tests are rather inhomogeneous. I used the same machine with the same settings both for the test and for analyzing games.
TommyTC
Posts: 38
Joined: Thu Mar 30, 2017 8:52 am

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by TommyTC »

Corres,

You have incorrectly answered a rhetorical question. :shock:
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by corres »

TommyTC wrote: Sat Sep 05, 2020 6:40 pm Corres,

You have incorrectly answered a rhetorical question. :shock:
I am not a politically correct man...
I usually say what I sincerely think.
mehmet123
Posts: 699
Joined: Sun Jan 26, 2020 10:38 pm
Location: Turkey
Full name: Mehmet Karaman

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by mehmet123 »

corres wrote: Sat Sep 05, 2020 3:15 pm I ran a short 100-game test (TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and the result was 3 : 1 in favor of nn-82215..
The number of games is relatively small, but from watching the games the tendency is obvious.
What was the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?
Alayan
Posts: 550
Joined: Tue Nov 19, 2019 8:48 pm
Full name: Alayan Feh

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by Alayan »

Your test is insignificant and your conclusions are laughable.

If Stockfish devs used the same principle as you to judge whether a new patch is stronger or weaker, Stockfish would be barely stronger than Glaurung if at all.
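To put a number on the significance point, here is a minimal sketch of how wide the 95% error margin of a 100-game match is when the two nets are actually equal. It assumes a normal approximation, and it borrows the draw rate from the STC totals quoted earlier in the thread (80561 draws in 108328 games); the draw rate at 1 min + 2 sec/move may well differ. All function names are illustrative.

```python
import math

def score_to_elo(score: float) -> float:
    """Logistic Elo difference for a score fraction (0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_to_score(elo: float) -> float:
    """Expected score fraction for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def elo_margin(n_games: int, draw_rate: float, z: float = 1.96) -> float:
    """Half-width (in Elo) of the interval around 0 for two equal engines."""
    # With equal strength, wins and losses each occur with probability (1 - d) / 2.
    var = (1.0 - draw_rate) / 4.0
    half_width = z * math.sqrt(var / n_games)
    return score_to_elo(0.5 + half_width)

def games_needed(elo: float, draw_rate: float, z: float = 1.96) -> int:
    """Approximate games needed for the margin to shrink to the given Elo."""
    var = (1.0 - draw_rate) / 4.0
    delta = elo_to_score(elo) - 0.5
    return math.ceil(z * z * var / (delta * delta))

# Assumption: draw rate taken from the STC result quoted in this thread.
DRAW_RATE = 80561 / 108328

print("Margin after 100 games: +/- %.1f Elo" % elo_margin(100, DRAW_RATE))
print("Games for a +/- 2 Elo margin: %d" % games_needed(2.0, DRAW_RATE))
```

Under these assumptions a 100-game match has a margin of a few tens of Elo, while a margin of a couple of Elo needs tens of thousands of games.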
Terje
Posts: 347
Joined: Tue Nov 19, 2019 4:34 am
Location: https://github.com/TerjeKir/weiss
Full name: Terje Kirstihagen

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by Terje »

This thread is just spam, mods should remove it.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by corres »

Alayan wrote: Sat Sep 05, 2020 7:17 pm Your test is insignificant and your conclusions are laughable.

If Stockfish devs used the same principle as you to judge whether a new patch is stronger or weaker, Stockfish would be barely stronger than Glaurung if at all.

Go on and laugh!
Nothing stops you from running tests for the Stockfish developers.
I run tests to get information for my own play, and my evaluation of them may be rather subjective.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: The new NNUE-net (nn-308..) seems to be weaker

Post by corres »

Terje wrote: Sat Sep 05, 2020 7:36 pm This thread is just spam, mods should remove it.
Are you referring to your own post?
If you do not agree with me, prove the opposite by running a proper test.
I am curious to read your result!