SPCC: Testrun of SFnnue sv200728_1104 finished

Posted: Thu Jul 30, 2020 2:45 pm
by pohl4711
AB-testrun of SFnnue sv200728_1104 (nodchip avx2-compile) finished. Another huge Elo-gain !!!

https://www.sp-cc.de

Re: SPCC: Testrun of SFnnue sv200728_1104 finished

Posted: Thu Jul 30, 2020 2:56 pm
by pohl4711

Individual statistics:

1 SFnnue sv200728_1104   : 3648 7000 (+4475,=2425,-100), 81.3 %

Slow Chess 2.2 popc      : 1000 (+722,=268,- 10), 85.6 %
Ethereal 12.25 pext      : 1000 (+715,=280,-  5), 85.5 %
Komodo 14 bmi2           : 1000 (+548,=435,- 17), 76.5 %
Stockfish 11 200118      : 1000 (+357,=597,- 46), 65.5 %
Xiphos 0.6 bmi2          : 1000 (+759,=236,-  5), 87.7 %
Fire 7.1 popc            : 1000 (+769,=227,-  4), 88.3 %
Houdini 6 pext           : 1000 (+605,=382,- 13), 79.6 %
Only 100 losses out of 7000 games versus 7 strong AB-engines...
Traditionally, a new official Stockfish version is released once it has gained around +50 Elo over the previous official release. So SFnnue now plays nearly at the level of a Stockfish 13 (!)

Re: SPCC: Testrun of SFnnue sv200728_1104 finished

Posted: Thu Jul 30, 2020 4:06 pm
by marsell
Thanks for the test, Stockfish NNUE is great.

Re: SPCC: Testrun of SFnnue sv200728_1104 finished

Posted: Thu Jul 30, 2020 4:16 pm
by mehmet123
Stockfish NNUE sv200728_1104 vs Stockfish 11: 1000 games (+357,=597,-46), 65.5 %

+112 Elo difference (EloStat). Really great.
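
For anyone who wants to check the 112 figure: it can be reproduced from the head-to-head score alone. The following is a minimal sketch, assuming the usual logistic Elo model with draws counted as half a point; EloStat's exact method may differ, and the function name `elo_diff` is just illustrative.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score under the logistic model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games      # fraction of points scored
    return -400.0 * math.log10(1.0 / score - 1.0)

# SFnnue sv200728_1104 vs Stockfish 11: +357 =597 -46
diff = elo_diff(357, 597, 46)
print(round(diff))  # 112
```

Note that the score fed into the formula is 655.5/1000; the 65.5 % shown in the table is a rounded display of that value.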

Re: SPCC: Testrun of SFnnue sv200728_1104 finished

Posted: Thu Jul 30, 2020 6:20 pm
by Gary Internet
This has to be the end of an era that people have talked about. This is a real step change in strength gain for an engine that runs on CPU only. Stefan's words say more than mine could:
"AB-testrun of SFnnue sv200728_1104 (nodchip avx2-compile) finished. Another huge Elo-gain: +18 Elo to SFnnue sv200724_0123, +61 Elo to Stockfish 200717 (latest SF-dev) and +94 Elo to Stockfish 11 !!!"
Even if NNUE stopped gaining Elo forever by mid August 2020, it would probably take Stockfish Dev about 3 years to catch up. At the moment NNUE seems to be making months of progress in just days, and years of progress in a week or two.

EDIT: As Mehmet says, if you just look at the head-to-head results against SF 11 and calculate Elo from that, 112 Elo is light years ahead of where SF Dev would be in the same head-to-head match right now. Yes, it's "only" 1,000 games, but when the winning margin is so great, you can't seriously expect that if they played another 1,000 games, the scores would balance out to a draw. Doesn't look like SF11 would stand a chance.
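
To put a rough number on the "only 1,000 games" point, here is a sketch of a 95% interval for the head-to-head result. It assumes the games are independent (not strictly true when openings are replayed with reversed colours), uses a normal approximation for the mean score, and converts the interval ends to Elo with the same logistic formula; `elo_interval` is an illustrative name, not something from the SPCC tooling.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference for a score fraction strictly between 0 and 1."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_interval(wins, draws, losses, z=1.96):
    """Approximate interval for the Elo difference (z=1.96 gives about 95%)."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of a result that is 1, 0.5 or 0.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    se = math.sqrt(var / n)                   # standard error of the mean score
    return elo_from_score(mean - z * se), elo_from_score(mean + z * se)

# SFnnue sv200728_1104 vs Stockfish 11: +357 =597 -46
low, high = elo_interval(357, 597, 46)
print(f"{low:.0f} to {high:.0f} Elo")
```

Under these assumptions the interval comes out at roughly +99 to +125 Elo, so even the pessimistic end is far from a drawn match.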