Date: 07/24/20 : 12:06:02
4000 game(s) loaded
Rank  Name    Rating    Δ  +  -     #       Σ    Σ%    W    L     D    W%    =%  OppR
-------------------------------------------------------------------------------------
1     sf1134    3501  0.0  7  7  4000  2015.5  50.4  939  908  2153  23.5  53.8  3499
2     sf1732    3499  2.6  7  7  4000  1984.5  49.6  908  939  2153  22.7  53.8  3501
-------------------------------------------------------------------------------------
Σ = total score, 1 point for win, 1/2 point for draw
LOS:
        sf  sf
sf1134      76
sf1732  23
sf1134 totals:
The games are mostly 60+0.6 (maybe all, I did not check).
Overall they are pretty tight, and any one of the first 5 listed below in the summary could be the true "topdog". Progress has been slow of late.
only an Elo spread from top to bottom
You posted a link for a 20 Mb file on a site that downloads at 200 Kb/sec. You can do better than that: Google Drive, Dropbox, whatever. I tried to download it, but it was too painful for me, sorry.
Sorry, it was the link given on the discord channel. I reuploaded it on another server: https://gofile.io/d/ry7AuA
Strange that both Kai and Ed got a regression with 1134.
Not strange at all, really. It happens all the time: engines can hit a lucky or an unlucky run. 1134 hasn't lost since it came out for ME, but YMMV. I don't worry about it, I'm sure they don't worry about it, and you shouldn't worry about it either. 1134 will lose at some point, so eventually world peace and harmony will be restored. ;>)
Furthermore, if a hundred people test a thousand nets against one opponent, then even if the two engines are “the same”, about five hundred of those nets are going to show up as stronger.
The matches are short and headline results are, well, whatever, you get the idea.
You have a point, but I was under the (perhaps wrong) impression that the matches weren't so short.
I ran 2344 vs 2141 overnight. TC: 3m+2s, 92 threads.
Score of StockfishNNUE 2344 vs StockfishNNUE 20200722-2141: 14 - 5 - 33 [0.587]
... StockfishNNUE 2344 playing White: 12 - 0 - 14 [0.731] 26
... StockfishNNUE 2344 playing Black: 2 - 5 - 19 [0.442] 26
... White vs Black: 17 - 2 - 33 [0.644] 52
Elo difference: 60.7 +/- 56.8, LOS: 98.1 %, DrawRatio: 63.5 %
52 of 100 games finished.
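For anyone who wants to check those headline numbers, here is a minimal sketch that recomputes them from the 14 - 5 - 33 score. It assumes the usual logistic Elo formula and the common normal-approximation LOS, which ignores draws. The function names are my own, and the error margin uses a plain 95% normal interval on the per-game score, so it may differ slightly from what the match tool prints.

```python
import math


def score(wins, losses, draws):
    """Fraction of points scored: 1 per win, 1/2 per draw."""
    n = wins + losses + draws
    return (wins + 0.5 * draws) / n


def elo_from_score(s):
    """Logistic Elo difference implied by a score fraction 0 < s < 1."""
    return -400.0 * math.log10(1.0 / s - 1.0)


def los(wins, losses):
    """Likelihood of superiority, normal approximation (draws ignored)."""
    if wins + losses == 0:
        return 0.5
    x = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    return 0.5 * (1.0 + math.erf(x))


def elo_margin(wins, losses, draws, z=1.96):
    """Half-width of an approximate 95% interval, expressed in Elo.

    Assumes the interval on the score stays strictly inside (0, 1).
    """
    n = wins + losses + draws
    s = score(wins, losses, draws)
    var = (wins * (1.0 - s) ** 2 + losses * s ** 2
           + draws * (0.5 - s) ** 2) / n
    se = math.sqrt(var / n)
    return (elo_from_score(s + z * se) - elo_from_score(s - z * se)) / 2.0


if __name__ == "__main__":
    w, l, d = 14, 5, 33
    print(f"score  {score(w, l, d):.3f}")
    print(f"Elo    {elo_from_score(score(w, l, d)):.1f} +/- {elo_margin(w, l, d):.1f}")
    print(f"LOS    {100 * los(w, l):.1f} %")
```

The same `los` function applied to the 939 wins and 908 losses in the 4000-game table above gives roughly 76%, in line with the LOS matrix there.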
Just did a quick test of 2344 against K14 using Nunn1 openings, G10s+0.2s - result +10 =9 -1! The loss was on the black side of a Winawer French in 57 moves. SFnnue won the reverse game in 29 moves.
Here is a great game by SFnnue - note the final position.
Too few games to say anything with high confidence. It is not even clear that 2344 is stronger than 2141: a cherry-picked LOS of 98% doesn't qualify as a stopping rule. An LOS of 99.9% or higher is needed when cherry-picking to have some confidence in superiority, and even higher for a small number of games.
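To see why a cherry-picked LOS is weaker than it looks, here is a rough simulation sketch. Everything in it is an assumption for illustration: two engines of exactly equal strength, a 20% / 20% / 60% win / loss / draw split, a 100-game match, and an observer who looks at the LOS after every game from game 10 onwards. It counts how often the first engine reaches LOS >= 98% at any point during the match, compared with how often it does so at the fixed end of the match. The function names and default parameters are hypothetical.

```python
import math
import random


def los(wins, losses):
    """Likelihood of superiority, normal approximation (draws ignored)."""
    if wins + losses == 0:
        return 0.5
    x = (wins - losses) / math.sqrt(2.0 * (wins + losses))
    return 0.5 * (1.0 + math.erf(x))


def simulate_peeking(trials=2000, max_games=100, min_games=10,
                     threshold=0.98, p_win=0.2, p_loss=0.2, seed=1):
    """Return (peek_rate, fixed_rate) for two equal engines.

    peek_rate:  fraction of matches where LOS >= threshold at ANY check.
    fixed_rate: fraction where LOS >= threshold after the last game only.
    """
    rng = random.Random(seed)
    peek_hits = 0
    fixed_hits = 0
    for _ in range(trials):
        wins = losses = 0
        crossed = False
        for game in range(1, max_games + 1):
            r = rng.random()
            if r < p_win:
                wins += 1
            elif r < p_win + p_loss:
                losses += 1
            # Otherwise the game is a draw and neither counter moves.
            if game >= min_games and los(wins, losses) >= threshold:
                crossed = True
        peek_hits += crossed
        fixed_hits += los(wins, losses) >= threshold
    return peek_hits / trials, fixed_hits / trials


if __name__ == "__main__":
    peek_rate, fixed_rate = simulate_peeking()
    print(f"LOS >= 98% at some point: {100 * peek_rate:.1f} % of matches")
    print(f"LOS >= 98% at game 100:   {100 * fixed_rate:.1f} % of matches")
```

Since both simulated engines are identical, every hit is a false positive. The gap between the two rates is the cost of stopping whenever the number looks good, which is why a much stricter threshold is needed when the stopping point is picked after the fact.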