Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Gary Internet
Posts: 60
Joined: Thu Jan 04, 2018 7:09 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Gary Internet »

You can't use TCEC as an indicator of anything, because the sample size is tiny. Plus when strong engines use 176 threads and 64 GB to 128 GB hash and play at 5400+10 time control, even the engines that are weaker (like Ethereal or Fire) but still very strong, have way more chance of seeing deeper, wider, better and making far fewer errors than they would if the tournament was played at bullet time controls. The draw rate soars for the entire Division.

Be patient, wait until SF NNUE has played all 42 games in the Premier Division. Remember that all the versions of all the engines playing are the strongest versions that have ever played in TCEC. It's true each season, but it's worth reiterating. It's now Ethereal 12.43 not 12.00, and Allie 0.8 dev, not 0.6 dev, Komodo 2576 not 2301 etc. All of this adds to the increasing draw rate because although everyone is getting stronger, the gap between many engines is closing, even if SFNNUE has recently pulled away again.
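
To put a rough number on the sample-size point, here is a minimal sketch. It assumes a normal approximation on the per-game score and the usual logistic Elo model. The 42-game record is made up for illustration and is not a TCEC result, and `elo_from_score` and `elo_interval` are hypothetical helper names.

```python
import math


def elo_from_score(score):
    """Convert a score fraction in (0, 1) to an Elo difference (logistic model)."""
    return -400.0 * math.log10(1.0 / score - 1.0)


def elo_interval(wins, draws, losses, z=1.96):
    """Return (low, mid, high) Elo estimates from a W/D/L record.

    Normal approximation on the mean score per game; z=1.96 is roughly
    a 95% interval. Assumes the interval stays strictly inside (0, 1).
    """
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    stderr = math.sqrt(var / n)
    return (elo_from_score(mean - z * stderr),
            elo_from_score(mean),
            elo_from_score(mean + z * stderr))


# Hypothetical 42-game record (NOT a real TCEC result): 6 wins, 34 draws, 2 losses.
low, mid, high = elo_interval(6, 34, 2)
print(f"{mid:+.0f} Elo, 95% interval roughly {low:+.0f} to {high:+.0f}")
```

With a record like that the interval still reaches below zero, so 42 games alone would not settle which engine is stronger.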
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by mwyoung »

Gary Internet wrote: Wed Sep 02, 2020 6:45 pm You can't use TCEC as an indicator of anything, because the sample size is tiny. Plus when strong engines use 176 threads and 64 GB to 128 GB hash and play at 5400+10 time control, even the engines that are weaker (like Ethereal or Fire) but still very strong, have way more chance of seeing deeper, wider, better and making far fewer errors than they would if the tournament was played at bullet time controls. The draw rate soars for the entire Division.

Be patient, wait until SF NNUE has played all 42 games in the Premier Division. Remember that all the versions of all the engines playing are the strongest versions that have ever played in TCEC. It's true each season, but it's worth reiterating. It's now Ethereal 12.43 not 12.00, and Allie 0.8 dev, not 0.6 dev, Komodo 2576 not 2301 etc. All of this adds to the increasing draw rate because although everyone is getting stronger, the gap between many engines is closing, even if SFNNUE has recently pulled away again.
It is interesting to note that it does not take 176 threads and a time control of 5400+10 for the draw rate to soar. And this is what makes tier-1 testing at ultra-fast time controls moot. This is just the latest example...

3m+2s Results 32 threads.

Result:
----------------------------------------------------------------------------------------
  #  name                        games    wins   draws  losses   score    los%  elo+/-
  1. SF+NNUE PO 290720 x64 avx2     60      18      42       0    39.0   100.0   107.5
  2. Stockfish 260820               60      10      50       0    35.0    99.9    58.5
  3. Stockfish 240820               60       8      42      10    29.0    31.9   -11.6
  4. Lc0 v0.26.1                    60       6      51       3    31.5    84.1    17.4
  5. Komodo 14 64-bit               60       4      42      14    25.0     0.9   -58.5
  6. Ethereal 12.25 (POPCNT)        60       0      41      19    20.5     0.0  -113.9

Cross table:
----------------------------------------------------------------------------------------
  #  name                           score   games            1            2            3            4            5            6
  1. SF+NNUE PO 290720 x64 avx2      39.0      60            x ============ =111=1=1==1= ========1=== ==111==1===1 1====11=1=11
  2. Stockfish 260820                35.0      60 ============            x ===11==1==== 1=========== =1=11==1==== 1===1=======
  3. Stockfish 240820                29.0      60 =000=0=0==0= ===00==0====            x =1====0===== ==1=====11== 1==11=====1=
  4. Lc0 v0.26.1                     31.5      60 ========0=== 0=========== =0====1=====            x =1========1= ==1==1====1=
  5. Komodo 14 64-bit                25.0      60 ==000==0===0 =0=00==0==== ==0=====00== =0========0=            x ==11===11===
  6. Ethereal 12.25 (POPCNT)         20.5      60 0====00=0=00 0===0======= 0==00=====0= ==0==0====0= ==00===00===            x

Tech:
----------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                          nodes/m         NPS  depth/m   time/m    moves     time
  1. SF+NNUE PO 290720 x64 avx2    117795K    24694883     39.9      4.8     54.9    261.8
  2. Stockfish 260820              108053K    23706152     40.6      4.6     59.3    270.4
  3. Stockfish 240820              183725K    40935039     41.8      4.5     62.1    278.9
  4. Lc0 v0.26.1                      100K       21776      9.9      4.6     61.8    285.0
  5. Komodo 14 64-bit              181390K    38325371     36.5      4.7     60.6    286.8
  6. Ethereal 12.25 (POPCNT)       194638K    41790221     29.9      4.7     61.1    284.7
     all ---                       128047K    28297781     32.9      4.6     60.0    277.9

Tournament finished! Elapsed: 1d 4:04:16

15m+15s Results 32 threads.

Result:
----------------------------------------------------------------------------------------
  #  name                        games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 260820               40       5      35       0    22.5    98.7    43.7
  2. SF+NNUE PO 290720 x64 avx2     40       4      36       0    22.0    97.7    34.9
  3. Lc0 v0.26.1                    40       2      38       0    21.0    92.1    17.4
  4. Stockfish 240820               40       1      36       3    19.0    15.9   -17.4
  5. Komodo 14 64-bit               40       1      35       4    18.5     9.0   -26.1
  6. Ethereal 12.25 (POPCNT)        40       0      34       6    17.0     0.7   -52.5

Cross table:
----------------------------------------------------------------------------------------
  #  name                           score   games         1         2         3         4         5         6
  1. Stockfish 260820                22.5      40         x  ========  ========  =======1  1=======  1=1===1=
  2. SF+NNUE PO 290720 x64 avx2      22.0      40  ========         x  ========  ======1=  ===11===  =1======
  3. Lc0 v0.26.1                     21.0      40  ========  ========         x  ==1=====  =====1==  ========
  4. Stockfish 240820                19.0      40  =======0  ======0=  ==0=====         x  ========  ======1=
  5. Komodo 14 64-bit                18.5      40  0=======  ===00===  =====0==  ========         x  =====1==
  6. Ethereal 12.25 (POPCNT)         17.0      40  0=0===0=  =0======  ========  ======0=  =====0==         x

Tech:
----------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                          nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 260820              636654K    22885705     48.4     27.8     61.1   1699.0
  2. SF+NNUE PO 290720 x64 avx2    611621K    22756666     51.8     26.9     64.1   1722.8
  3. Lc0 v0.26.1                      820K       29309     12.5     28.0     63.1   1766.4
  4. Stockfish 240820             1084516K    40535112     54.3     26.8     67.8   1815.3
  5. Komodo 14 64-bit             1062519K    38130495     45.0     27.9     62.1   1730.4
  6. Ethereal 12.25 (POPCNT)      1132625K    40887572     36.9     27.7     62.5   1730.6
     all ---                       739922K    27565962     41.6     27.5     63.5   1744.1
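
The draw rates behind those two tables can be checked with a quick sketch. The W/D/L numbers are copied from the result tables above; `draw_rate` and the dictionary names are just illustrative helpers.

```python
# (wins, draws, losses) per engine, copied from the two result tables above.
results_3m2s = {
    "SF+NNUE PO 290720": (18, 42, 0),
    "Stockfish 260820": (10, 50, 0),
    "Stockfish 240820": (8, 42, 10),
    "Lc0 v0.26.1": (6, 51, 3),
    "Komodo 14": (4, 42, 14),
    "Ethereal 12.25": (0, 41, 19),
}
results_15m15s = {
    "Stockfish 260820": (5, 35, 0),
    "SF+NNUE PO 290720": (4, 36, 0),
    "Lc0 v0.26.1": (2, 38, 0),
    "Stockfish 240820": (1, 36, 3),
    "Komodo 14": (1, 35, 4),
    "Ethereal 12.25": (0, 34, 6),
}


def draw_rate(table):
    """Fraction of drawn games in a round robin.

    Every game shows up in two rows, so both sums are doubled and the
    factor of two cancels in the ratio.
    """
    draws = sum(d for _, d, _ in table.values())
    games = sum(w + d + l for w, d, l in table.values())
    return draws / games


print(f"3m+2s:   {100 * draw_rate(results_3m2s):.1f}% draws")
print(f"15m+15s: {100 * draw_rate(results_15m15s):.1f}% draws")
```

That works out to 134 draws in 180 games at 3m+2s and 107 draws in 120 games at 15m+15s.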
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Dann Corbit »

We get awfully excited about one win or one loss.
I don't think we really know how strong any of these engines are for sure, though we have good guesses.
For instance, my guess is that SF will win the tournament, because it is the strongest engine.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Vinvin »

Chessqueen wrote: Wed Sep 02, 2020 5:16 am https://tcec-chess.com/live.html
After 10k games, no doubt NNUE is stronger.
But it seems about equal at long time controls (with a lot of cores).

When the analysis gets longer, the advantage seems smaller than at short time controls.
Chessqueen
Posts: 5582
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Chessqueen »

Dann Corbit wrote: Wed Sep 02, 2020 8:42 pm We get awfully excited about one win or one loss.
I don't think we really know how strong any of these engines are for sure, though we have good guesses.
For instance, my guess is that SF will win the tournament, because it is the strongest engine.
At the very end it will be LCZero vs. Stockfish NNUE, but I predict a very close encounter of the 3rd kind, LCZero from Planet X vs. Stockfish NNUE from Planet Earth :roll:
Do NOT worry and be happy, we all live a short life :roll:
Dann Corbit
Posts: 12540
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Dann Corbit »

Stockfish nnue has a secret weapon. The Kamehameha blast. Of course, he has to go to level 5 before he can use it. You don't just go Kamehameha blasting stuff willy-nilly.
Chessqueen
Posts: 5582
Joined: Wed Sep 05, 2018 2:16 am
Location: Moving
Full name: Jorge Picado

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Chessqueen »

Dann Corbit wrote: Thu Sep 03, 2020 1:55 am Stockfish nnue has a secret weapon. The Kamehameha blast. Of course, he has to go to level 5 before he can use it. You don't just go Kamehameha blasting stuff willy-nilly.
At the very end it will be LCZero vs. Stockfish NNUE, but I predict a very close encounter of the 3rd kind, LCZero from Planet 1140b vs. Stockfish NNUE from Planet Earth. Now I am more convinced than ever :roll:
https://tcec-chess.com/live.html
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by mwyoung »

Chessqueen wrote: Sat Sep 05, 2020 5:21 am
Dann Corbit wrote: Thu Sep 03, 2020 1:55 am Stockfish nnue has a secret weapon. The Kamehameha blast. Of course, he has to go to level 5 before he can use it. You don't just go Kamehameha blasting stuff willy-nilly.
At the very end it will be LCZero vs. Stockfish NNUE, but I predict a very close encounter of the 3rd kind, LCZero from Planet 1140b vs. Stockfish NNUE from Planet Earth. Now I am more convinced than ever :roll:
https://tcec-chess.com/live.html
I agree. I just played 200 games of Stockfish 12 vs. Lc0 v0.26.2 at 3m+2s, and Stockfish 12 won by only 24 Elo. And we can see from past testing how badly Stockfish NNUE has scaled at longer time controls.

Both are the best chess engines, and the winner may only be decided by hardware and time controls.

The sprinter Stockfish 12 vs. the marathon runner Lc0: who wins the race may depend on the distance of the race!

Lc0 is clearly improving faster than Stockfish at this point in time, even at 3m+2s, compared with past matches at the same time control.

Result:
--------------------------------------------------------------------------
  #  name          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12    200      16     182       2   107.0   100.0    24.4
  2. Lc0 v0.26.2     200       2     182      16    93.0     0.0   -24.4

Cross table:
--------------------------------------------------------------------------
  #  name             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12     107.0     200                                                                                                                                                                                                        x =====1==1===1====================1======1========11========================1==========================1============================================1====1===1=========1==0================1===1=1==0====
  2. Lc0 v0.26.2       93.0     200 =====0==0===0====================0======0========00========================0==========================0============================================0====0===0=========0==1================0===0=0==1====                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12    125173K    26565996     42.5      4.7     54.1    255.1
  2. Lc0 v0.26.2        101K       20342     10.0      4.9     54.1    267.2
     all ---          61216K    12984844     26.3      4.8     54.1    261.2
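
For anyone who wants to redo the arithmetic, here is a small sketch. It assumes the elo+/- column comes from the logistic Elo formula applied to the score fraction and that los% is the usual likelihood-of-superiority estimate from wins and losses only. The tool's own formulas are not stated here, but these two seem to reproduce the numbers in the tables. `elo_diff` and `los` are hypothetical helper names.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by a W/D/L record (logistic model)."""
    score = (wins + 0.5 * draws) / (wins + draws + losses)
    return -400.0 * math.log10(1.0 / score - 1.0)


def los(wins, losses):
    """Likelihood of superiority as a fraction; draws carry no information here."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))


# Stockfish 12 vs. Lc0 v0.26.2: 16 wins, 182 draws, 2 losses (from the table above).
print(f"Elo: {elo_diff(16, 182, 2):+.1f}")
print(f"LOS: {100 * los(16, 2):.1f}%")
```

The Elo figure comes out at about +24.4, in line with the table.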
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Werewolf »

Because the Neural Net in Lc0 is bigger than the one in SF NNUE, do we have any examples of where Lc0 evaluates a position better?
I'm thinking of positions where a deep AB search won't help much, such as blocked positions.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Laskos »

Werewolf wrote: Sat Sep 05, 2020 7:57 pm Because the Neural Net in Lc0 is bigger than the one in SF NNUE, do we have any examples of where Lc0 evaluates a position better?
I'm thinking of positions where a deep AB search won't help much, such as blocked positions.

Openings. Positionally in the openings, it seems Lc0 > SF NNUE > SF AB > other AB.