corres wrote: ↑Sat Sep 05, 2020 3:15 pm
I made a short test (100 games, TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and I got a result of 3 : 1 for nn-82215..
The number of games is relatively small, but watching the games the tendency is obvious.
What's the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?
For you maybe nothing.
But I made this test for me.
Terje wrote: ↑Sat Sep 05, 2020 7:36 pm
This thread is just spam, mods should remove it.
Do you mean your own post?
If you do not agree with me, prove the opposite by running a proper test.
I am curious to read your result!
The correct test has already been posted by syzygy earlier in this thread. If you want to ignore that and use your 100 game test go ahead, but it's of no use to anyone else so keep it to yourself instead of posting it on talkchess.
A widely believed misconception is that an Elo-gainer (a new, stronger version of something) can only increase the number of won/drawn cases while retaining all the win/draw cases the previous version could handle.
In reality an Elo-gainer STATISTICALLY increases the number of won games DESPITE getting worse in some aspects of play (not winning games that the previous, weaker version won, losing games that the previous version could hold as a draw, not solving positions that the previous version solved fast).
corres wrote: ↑Sat Sep 05, 2020 3:15 pm
I made a short test (100 games, TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and I got a result of 3 : 1 for nn-82215..
The number of games is relatively small, but watching the games the tendency is obvious.
What's the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?
For you maybe nothing.
But I made this test for me.
Do you understand my question? If the score is 75% there is only one explanation: the engine can't use the SV 1739 net.
Terje wrote: ↑Sat Sep 05, 2020 7:36 pm
This thread is just spam, mods should remove it.
Do you mean your own post?
If you do not agree with me, prove the opposite by running a proper test.
I am curious to read your result!
The correct test has already been posted by syzygy earlier in this thread. If you want to ignore that and use your 100 game test go ahead, but it's of no use to anyone else so keep it to yourself instead of posting it on talkchess.
I decide what I post, not you.
Believe what you want. I do not want to interfere with what you are doing, so spare me your provocation as well.
corres wrote: ↑Sat Sep 05, 2020 3:15 pm
I made a short test (100 games, TC 1 min + 2 sec/move) between SF+NNUE with nn-82215.. and SF+NNUE with nn-308.., and I got a result of 3 : 1 for nn-82215..
The number of games is relatively small, but watching the games the tendency is obvious.
What's the match result? Does 3:1 mean a 75% score against SV 1739 (nn-308d71810dff.nnue)?
For you maybe nothing.
But I made this test for me.
Do you understand my question? If the score is 75% there is only one explanation: the engine can't use the SV 1739 net.
Did you run the test, or did I?
As I wrote, the test consisted of 100 (one hundred) games and it ended with a result of 3 : 1 (the others were draws, obviously) for the nn-82215... net.
Do YOU understand me??
Maybe you test for %, but I test for points.
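The "points" and "%" views of that result can be reconciled with a little arithmetic. The sketch below is only an illustration, assuming the 100 games split into 3 wins, 1 loss and 96 draws as described above, and using the standard logistic Elo formula; the function names are made up for this example.

```python
import math

def score_fraction(wins, losses, draws):
    """Match score as a fraction of available points (win = 1, draw = 0.5)."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def elo_difference(score):
    """Elo difference implied by a score fraction (logistic model), 0 < score < 1."""
    return 400.0 * math.log10(score / (1.0 - score))

# Assumed split of the 100-game test: 3 wins, 1 loss, 96 draws.
s = score_fraction(3, 1, 96)
print(f"score: {s:.1%}, implied Elo difference: {elo_difference(s):+.1f}")

# For comparison: what a genuine 75% score would imply.
print(f"75% score: {elo_difference(0.75):+.1f} Elo")
```

Under these assumptions the match score is 51 points out of 100, i.e. about 51% and roughly 7 Elo, whereas a true 75% score would correspond to about 190 Elo. So "3 : 1" here describes only the decisive games, not the overall score.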
yurikvelo wrote: ↑Sat Sep 05, 2020 10:39 pm
A widely believed misconception is that an Elo-gainer (a new, stronger version of something) can only increase the number of won/drawn cases while retaining all the win/draw cases the previous version could handle.
In reality an Elo-gainer STATISTICALLY increases the number of won games DESPITE getting worse in some aspects of play (not winning games that the previous, weaker version won, losing games that the previous version could hold as a draw, not solving positions that the previous version solved fast).
Do you think that if you run a test on one machine where the same engine plays against itself and the only difference is the NNUE net used (nn-822... on one side and nn-308... on the other), the result will not show the difference in strength between the two nets? I think it will.
I think this exchange of posts is completely superfluous.
Only one objection is acceptable: a test of only 100 games is not enough to decide which net is stronger. I referred to this in my own post, and it is no accident that I titled it:
THE NEW NNUE-NET (nn-308..) S E E M S BEING WEAKER. So this is my subjective opinion, and it is independent of the "official" result that syzygy referred to. And it is my decision alone which net I use (nn-82215...).
It would have been more correct of syzygy to refer only to the "official" test result.
But he also likes to provoke others.
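How little 100 games can decide is easy to estimate. The following is a rough sketch, assuming the result was 3 wins, 1 loss and 96 draws, and using a normal approximation for both the 95% interval on the score and the commonly used "likelihood of superiority" (LOS) formula; with only four decisive games the approximation is crude, and the function names are hypothetical.

```python
import math

def score_interval(wins, losses, draws, z=1.96):
    """Approximate confidence interval for the mean score per game."""
    games = wins + losses + draws
    mean = (wins + 0.5 * draws) / games
    # Per-game variance of the outcomes 1, 0.5 and 0 around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / games
    stderr = math.sqrt(var / games)
    return mean - z * stderr, mean + z * stderr

def likelihood_of_superiority(wins, losses):
    """Normal-approximation LOS; draws carry no information here.
    Requires at least one decisive game."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

low, high = score_interval(3, 1, 96)
los = likelihood_of_superiority(3, 1)
print(f"95% interval for the score: {low:.3f} .. {high:.3f}")
print(f"LOS: {los:.1%}")
```

Under these assumptions the interval runs from about 0.49 to about 0.53 and so includes 0.50, and the LOS is about 84%, below the 95% usually asked for. That is consistent with "seems weaker" as a subjective impression, but not with a settled ranking of the two nets.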
I've seen more extreme examples of this. A few years ago, in a chess forum, one person strongly claimed that Critter 1.6 was stronger than Houdini 2.0.
He ran a test of only two games, and Critter won both. No matter how much evidence I showed him, he would accept none of it. Neither my tests nor the rating lists mattered to him.