CCRL flawed testing : SF12 above SF12 8CPU


mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Cherry on Top.

Post by mwyoung »

mwyoung wrote: Sat Oct 10, 2020 11:22 pm
mwyoung wrote: Sat Oct 10, 2020 7:23 am Laskos' Law: What's not clear? Three doublings in cores nowadays mean at least 2.5 real effective doublings in TC. Each effective doubling in TC under these blitz conditions is worth at the very least 40 Elo points, therefore at the very least 80 Elo points for 1 core -> 8 cores, and more likely 120 - 140 Elo points. The result posted in the OP, and the discrepancy, break the Elo model beyond doubt.

It is clear to me that Stockfish NNUE does not obey Laskos' Law as stated above. CCRL's testing is most likely not flawed; as suspected, the issue is with Stockfish NNUE. It took me many hours of testing to show this, and the full results will be posted once the testing is completed. The bottom line is that the issue is with Stockfish NNUE, not with CCRL testing. As you know, testing can take days to resolve this kind of anomaly or false assumption.
All matches were run under the same conditions: TC = 2m+1s, the same settings, and the same book (Perfect Book 2019). The CPU was a 2950X with all cores locked to 4.1 GHz.

Stockfish 11, with its classical evaluation, obeys Laskos' Law. But it was a mistake to assume that Stockfish 12, a hybrid with the new NN evaluation, would follow the same pattern as classical Stockfish. Stockfish 12 does not obey Laskos' Law.

I tested two versions of Stockfish 12, version 12 and version 12 (051020), to make sure this behavior was not specific to the original Stockfish 12.

Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo
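For anyone checking the numbers: the Elo figures above follow from the match scores via the standard logistic Elo formula. Below is a minimal Python sketch; the function name `elo_from_score` is only illustrative and is not taken from the GUI that produced the tables.

```python
import math

def elo_from_score(points: float, games: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    p = points / games
    if not 0.0 < p < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return -400.0 * math.log10(1.0 / p - 1.0)

# (wins, draws, losses) for the 8-core side, copied from the result tables
matches = {
    "Stockfish 11": (81, 118, 1),
    "Stockfish 12": (44, 156, 0),
    "Stockfish 051020": (31, 169, 0),
}

for name, (wins, draws, losses) in matches.items():
    points = wins + 0.5 * draws          # a draw counts as half a point
    games = wins + draws + losses
    print(f"{name}: {elo_from_score(points, games):+.1f} Elo")
```

The formula only sees the score fraction, so a match with many decisive games and a match with many draws can map to the same Elo figure.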


Result:
------------------------------------------------------------------------------------------------
  #  name                                games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 11 64 POPCNT dup 8 cores    200      81     118       1   140.0   100.0   147.2
  2. Stockfish 11 64 POPCNT dup 1 core     200       1     118      81    60.0     0.0  -147.2

Cross table:
------------------------------------------------------------------------------------------------
  #  name                                   score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 11 64 POPCNT dup 8 cores     140.0     200                                                                                                                                                                                                        x 111===1===111=1==1=1==1===11=1==11=====1====11=11==1=111==111====1111==11===1==11=========1===1====1111=111=1======1=1=1=0===1==1==1====11=11==11=11=1=11=1==1===1===1=11====11=====1==11=1==11==11==1==
  2. Stockfish 11 64 POPCNT dup 1 core       60.0     200 000===0===000=0==0=0==0===00=0==00=====0====00=00==0=000==000====0000==00===0==00=========0===0====0000=000=0======0=0=0=1===0==0==0====00=00==00=00=0=00=0==0===0===0=00====00=====0==00=0==00==00==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                                  nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 11 64 POPCNT dup 8 cores     35492K    13595007     31.7      2.6     61.0    159.1
  2. Stockfish 11 64 POPCNT dup 1 core       4551K     1662287     27.1      2.7     61.1    167.4
     all ---                                19530K     7478160     29.4      2.7     61.0    163.3

Tournament finished! Elapsed: 18:23:36


Result:
------------------------------------------------------------------------------------------
  #  name                          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 8 cores    200      31     169       0   115.5   100.0    54.3
  2. Stockfish 051020 dup 1 core     200       0     169      31    84.5     0.0   -54.3

Cross table:
------------------------------------------------------------------------------------------
  #  name                             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 8 cores     115.5     200                                                                                                                                                                                                        x =1==1======1========111=====1=====1=================1====1======1===1==1========1=====================1=1======1==========1==1====1========1========1==1=====1============1===1==========1==11====1==1==
  2. Stockfish 051020 dup 1 core       84.5     200 =0==0======0========000=====0=====0=================0====0======0===0==0========0=====================0=0======0==========0==0====0========0========0==0=====0============0===0==========0==00====0==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 8 cores     30556K    10784132     37.2      2.8     49.1    139.1
  2. Stockfish 051020 dup 1 core       3659K     1282020     29.7      2.9     49.2    140.4
     all ---                          16695K     6012180     33.4      2.8     49.1    139.8

Tournament finished! Elapsed: 15:50:53


Result:
--------------------------------------------------------------------------------------
  #  name                      games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12 dup 8 cores    200      44     156       0   122.0   100.0    77.7
  2. Stockfish 12 dup 1 core     200       0     156      44    78.0     0.0   -77.7

Cross table:
--------------------------------------------------------------------------------------
  #  name                         score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12 dup 8 cores     122.0     200                                                                                                                                                                                                        x 1===1===============1======11======1==1===========1=1=====1===11===1===11==1===1========1=1=1==1=1========1===1=========1==1===========1===11======1====1====1==1====1=====1======1====111==1=11===1===1
  2. Stockfish 12 dup 1 core       78.0     200 0===0===============0======00======0==0===========0=0=====0===00===0===00==0===0========0=0=0==0=0========0===0=========0==0===========0===00======0====0====0==0====0=====0======0====000==0=00===0===0                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                        nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12 dup 8 cores     28475K     9905390     35.2      2.9     48.0    137.8
  2. Stockfish 12 dup 1 core       3474K     1180298     29.1      2.9     48.1    141.4
     all ---                      15585K     5486571     32.1      2.9     48.0    139.6

Tournament finished! Elapsed: 15:46:49
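The Tech tables also let one compare raw speed. The sketch below, in Python, computes the NPS ratio between the 8-core and 1-core runs and divides the reported Elo gain by the number of NPS doublings. This is an illustration only: an NPS doubling is not the same thing as an effective doubling in TC, because a parallel search does not use the extra nodes as efficiently, and the helper name `speed_doublings` is made up for this example.

```python
import math

# (NPS on 8 cores, NPS on 1 core, reported Elo gain), copied from the tables
runs = {
    "Stockfish 11": (13595007, 1662287, 147.2),
    "Stockfish 12": (9905390, 1180298, 77.7),
    "Stockfish 051020": (10784132, 1282020, 54.3),
}

def speed_doublings(nps_fast: float, nps_slow: float) -> float:
    """Number of doublings in raw nodes per second between two runs."""
    return math.log2(nps_fast / nps_slow)

for name, (nps8, nps1, elo_gain) in runs.items():
    doublings = speed_doublings(nps8, nps1)
    print(f"{name}: {nps8 / nps1:.2f}x NPS, {doublings:.2f} doublings, "
          f"{elo_gain / doublings:.1f} Elo per NPS doubling")
```

All three engines show a bit over three NPS doublings from 1 to 8 cores, so the difference in Elo gain does not come from a difference in raw node throughput.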
Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo
Stockfish 051020 8 vs 16 cores +1.7 Elo


Result:
-------------------------------------------------------------------------------------------
  #  name                           games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 16 cores    200       2     197       1   100.5    71.8     1.7
  2. Stockfish 051020 dup 8 cores     200       1     197       2    99.5    28.2    -1.7

Cross table:
-------------------------------------------------------------------------------------------
  #  name                              score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 16 cores     100.5     200                                                                                                                                                                                                        x ==============================================================0=1============================================================================1==========================================================
  2. Stockfish 051020 dup 8 cores       99.5     200 ==============================================================1=0============================================================================0==========================================================                                                                                                                                                                                                        x

Tech:
-------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                             nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 16 cores     57514K    20152492     41.9      2.9     49.5    141.2
  2. Stockfish 051020 dup 8 cores      29868K    10471507     39.0      2.9     49.5    141.2
     all ---                           42662K    15311939     40.4      2.9     49.5    141.2
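The los% column in the 8 vs 16 cores table is consistent with the usual likelihood-of-superiority approximation, which ignores draws and looks only at wins and losses. The exact formula used by the GUI is not stated in this thread, so the Python sketch below is an assumption that happens to match the reported 71.8 / 28.2.

```python
import math

def los(wins: int, losses: int) -> float:
    """Likelihood of superiority (0..1) from decisive games only."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))

# 16 cores vs 8 cores: 2 wins, 197 draws, 1 loss
print(f"16 cores: {100 * los(2, 1):.1f}%")
print(f" 8 cores: {100 * los(1, 2):.1f}%")
```

With only three decisive games out of 200, the LOS stays far from the conventional 95% threshold, so this match by itself does not separate 16 cores from 8 cores.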
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Cherry on Top.

Post by Laskos »

mwyoung wrote: Sun Oct 11, 2020 5:40 pm
All in all, at a quick glance:

Openings, draw rate, contempt in SF11, and a weaker SF11 are, all combined, not ruled out as the main culprits here. There is no Laskos rule for 41:0 =159 results, and even less for a 2:1 =197 result. W/L is nuts in all your 8 vs 1 core examples. Sure, worse multicore scaling of NNUE SF is quite possible here, but I guess one would need somewhat clearer matches. Anyway, thanks for this long test.
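To put rough numbers on how clear these 200-game matches are, one can attach an error margin to each Elo figure. The Python sketch below uses a normal approximation on the per-game score with z = 1.96 (about 95%); it treats games as independent, which may not hold exactly when openings are replayed with colors reversed, and the function names are illustrative.

```python
import math

def elo(p: float) -> float:
    """Elo difference for a score fraction p under the logistic model."""
    return -400.0 * math.log10(1.0 / p - 1.0)

def elo_interval(wins: int, draws: int, losses: int, z: float = 1.96):
    """Return (low, mid, high) Elo using a normal approximation on the score."""
    n = wins + draws + losses
    mean = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0)
    var = (wins * (1.0 - mean) ** 2
           + draws * (0.5 - mean) ** 2
           + losses * (0.0 - mean) ** 2) / n
    half = z * math.sqrt(var / n)
    return elo(mean - half), elo(mean), elo(mean + half)

results = {
    "SF11 8 vs 1": (81, 118, 1),
    "SF12 8 vs 1": (44, 156, 0),
    "SF 051020 8 vs 1": (31, 169, 0),
    "SF 051020 16 vs 8": (2, 197, 1),
}

for name, wdl in results.items():
    low, mid, high = elo_interval(*wdl)
    print(f"{name}: {mid:+.1f} Elo, approx. 95% interval [{low:+.1f}, {high:+.1f}]")
```

Under these assumptions the 16 vs 8 interval includes zero, while the SF11 interval lies above the 051020 interval. That speaks only to statistical noise, not to the confounders listed above (openings, contempt, draw rate).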
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Cherry on Top.

Post by mwyoung »

Laskos wrote: Sun Oct 11, 2020 5:56 pm
mwyoung wrote: Sun Oct 11, 2020 5:40 pm
mwyoung wrote: Sat Oct 10, 2020 11:22 pm
mwyoung wrote: Sat Oct 10, 2020 7:23 am Lasko's Law----What's not clear? 3 doublings in cores mean nowadays at least 2.5 real effective doublings in TC. Each effective doubling in TC in these blitz conditions means at very least 40 Elo points, therefore at very least 80 Elo points 1 core -> 8 cores. In fact more likely 120 - 140 Elo points. That result posted in OP and discrepancy beyond doubt break the Elo model.

It is clear to me that Stockfish NNUE does not obey Lasko's law as stated above. CCRL most likely does not have flawed testing.. And as suspected. The issues is with Stockfish NNUE. It took me many hours to testing to show this result, and the full results will be shown soon. When the testing is completed. The bottom line is the issue is with Stockfish NNUE, and not with CCRL testing. Full results coming soon. As you know testing can take days to answer this kind of anomaly, or false assumption.
All results were tested under the same conditions with a TC = 2m+1s. With the same book, and settings, with Perfect Book 2019. CPU was a 2950x with all cores locked to 4.1 Ghz.

Stockfish 11 with a classical evaluation obeys Lasko's Law. But assuming Stockfish 12 a hybrid with the new NN evaluation will also obey Stockfish's classical pattern was in error. Stockfish 12 does not obey Lasko's Law.

I tested two versions of Stockfish 12, version 12, and version 12 (051020). To make sure this behavior was not with just the original Stockfish 12.

Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo

Code: Select all

Result:
------------------------------------------------------------------------------------------------
  #  name                                games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 11 64 POPCNT dup 8 cores    200      81     118       1   140.0   100.0   147.2
  2. Stockfish 11 64 POPCNT dup 1 core     200       1     118      81    60.0     0.0  -147.2

Cross table:
------------------------------------------------------------------------------------------------
  #  name                                   score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 11 64 POPCNT dup 8 cores     140.0     200                                                                                                                                                                                                        x 111===1===111=1==1=1==1===11=1==11=====1====11=11==1=111==111====1111==11===1==11=========1===1====1111=111=1======1=1=1=0===1==1==1====11=11==11=11=1=11=1==1===1===1=11====11=====1==11=1==11==11==1==
  2. Stockfish 11 64 POPCNT dup 1 core       60.0     200 000===0===000=0==0=0==0===00=0==00=====0====00=00==0=000==000====0000==00===0==00=========0===0====0000=000=0======0=0=0=1===0==0==0====00=00==00=00=0=00=0==0===0===0=00====00=====0==00=0==00==00==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                                  nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 11 64 POPCNT dup 8 cores     35492K    13595007     31.7      2.6     61.0    159.1
  2. Stockfish 11 64 POPCNT dup 1 core       4551K     1662287     27.1      2.7     61.1    167.4
     all ---                                19530K     7478160     29.4      2.7     61.0    163.3

Tournament finished! Elapsed: 18:23:36

Code: Select all

Result:
------------------------------------------------------------------------------------------
  #  name                          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 8 cores    200      31     169       0   115.5   100.0    54.3
  2. Stockfish 051020 dup 1 core     200       0     169      31    84.5     0.0   -54.3

Cross table:
------------------------------------------------------------------------------------------
  #  name                             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 8 cores     115.5     200                                                                                                                                                                                                        x =1==1======1========111=====1=====1=================1====1======1===1==1========1=====================1=1======1==========1==1====1========1========1==1=====1============1===1==========1==11====1==1==
  2. Stockfish 051020 dup 1 core       84.5     200 =0==0======0========000=====0=====0=================0====0======0===0==0========0=====================0=0======0==========0==0====0========0========0==0=====0============0===0==========0==00====0==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 8 cores     30556K    10784132     37.2      2.8     49.1    139.1
  2. Stockfish 051020 dup 1 core       3659K     1282020     29.7      2.9     49.2    140.4
     all ---                          16695K     6012180     33.4      2.8     49.1    139.8

Tournament finished! Elapsed: 15:50:53

Code: Select all

Result:
--------------------------------------------------------------------------------------
  #  name                      games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12 dup 8 cores    200      44     156       0   122.0   100.0    77.7
  2. Stockfish 12 dup 1 core     200       0     156      44    78.0     0.0   -77.7

Cross table:
--------------------------------------------------------------------------------------
  #  name                         score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12 dup 8 cores     122.0     200                                                                                                                                                                                                        x 1===1===============1======11======1==1===========1=1=====1===11===1===11==1===1========1=1=1==1=1========1===1=========1==1===========1===11======1====1====1==1====1=====1======1====111==1=11===1===1
  2. Stockfish 12 dup 1 core       78.0     200 0===0===============0======00======0==0===========0=0=====0===00===0===00==0===0========0=0=0==0=0========0===0=========0==0===========0===00======0====0====0==0====0=====0======0====000==0=00===0===0                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                        nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12 dup 8 cores     28475K     9905390     35.2      2.9     48.0    137.8
  2. Stockfish 12 dup 1 core       3474K     1180298     29.1      2.9     48.1    141.4
     all ---                      15585K     5486571     32.1      2.9     48.0    139.6

Tournament finished! Elapsed: 15:46:49
Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo
Stockfish 051020 8 vs 16 cores +1.7 Elo
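
For anyone who wants to check the elo+/- column against the score column, here is a minimal sketch. It assumes the usual logistic Elo model; the function name `elo_from_score` is made up for illustration and is not part of the tournament tool. The scores and game counts are copied from the result tables in this thread.

```python
import math


def elo_from_score(points: float, games: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    s = points / games
    if not 0.0 < s < 1.0:
        raise ValueError("score fraction must be strictly between 0 and 1")
    return 400.0 * math.log10(s / (1.0 - s))


# (points scored by the side with more cores, games played), from the tables
matches = {
    "Stockfish 11 1 vs 8 cores": (140.0, 200),
    "Stockfish 12 1 vs 8 cores": (122.0, 200),
    "Stockfish 051020 1 vs 8 cores": (115.5, 200),
    "Stockfish 051020 8 vs 16 cores": (100.5, 200),
}

for name, (points, games) in matches.items():
    print(f"{name}: {elo_from_score(points, games):+.1f} Elo")
```

The four values agree with the elo+/- column to one decimal, so the summary list is just the match score pushed through the logistic formula.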

Code: Select all

Result:
-------------------------------------------------------------------------------------------
  #  name                           games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 16 cores    200       2     197       1   100.5    71.8     1.7
  2. Stockfish 051020 dup 8 cores     200       1     197       2    99.5    28.2    -1.7

Cross table:
-------------------------------------------------------------------------------------------
  #  name                              score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 16 cores     100.5     200                                                                                                                                                                                                        x ==============================================================0=1============================================================================1==========================================================
  2. Stockfish 051020 dup 8 cores       99.5     200 ==============================================================1=0============================================================================0==========================================================                                                                                                                                                                                                        x

Tech:
-------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                             nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 16 cores     57514K    20152492     41.9      2.9     49.5    141.2
  2. Stockfish 051020 dup 8 cores      29868K    10471507     39.0      2.9     49.5    141.2
     all ---                           42662K    15311939     40.4      2.9     49.5    141.2
All in all quickly glancing:

Openings, draw rate, contempt in SF11, a weaker SF11 --- none of these, alone or combined, is ruled out as the main culprit here. There is no Laskos rule for 41:0 =159 results, and even less for a 2:1 =197 result. W/L is nuts in all your 8 vs 1 core examples. Sure, worse multicore scaling of NNUE SF is quite possible here, but I guess one would need somewhat clearer matches. Anyway, thanks for this long test.
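
The point about the 2:1 =197 result can be made concrete with the los% column. A minimal sketch, assuming the common normal-approximation formula for likelihood of superiority (draws drop out, only wins and losses count); that the tournament tool uses exactly this formula is an assumption, and `los` is a made-up helper name.

```python
import math


def los(wins: int, losses: int) -> float:
    """Likelihood of superiority from decisive games only (normal approximation)."""
    decisive = wins + losses
    if decisive == 0:
        return 0.5  # no decisive games, no evidence either way
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


# Wins and losses of the side with more cores, taken from the tables
print(f"16 vs 8 cores (2 wins, 1 loss): {100 * los(2, 1):.1f}%")
print(f"SF12 8 vs 1 core (44 wins, 0 losses): {100 * los(44, 0):.1f}%")
```

With three decisive games out of 200, the 16 vs 8 core match sits far from the usual 95% threshold, while the 8 vs 1 core matches are decisive on this measure.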
I am all for more testing, because "Houston, we've had a problem."

And I used your exact words, "Lasko's Law", for why you said that CCRL are... "Yes, underperformance of 8CPU SF12 is statistically significant" -Laskos

and I asked WHY?

And then came Lasko's Law with no data, agreeing with the claim of flawed CCRL testing. And this was your PROOF!

Lasko's Law----"What's not clear? 3 doublings in cores mean nowadays at least 2.5 real effective doublings in TC. Each effective doubling in TC in these blitz conditions means at very least 40 Elo points, therefore at very least 80 Elo points 1 core -> 8 cores. In fact more likely 120 - 140 Elo points. That result posted in OP and discrepancy beyond doubt break the Elo model." :lol:

As I told a member of CCRL about this thread: "As always on CCC, too much speculation and not enough data."
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Cherry on Top.

Post by Laskos »

Yes, I stand by my statement "Yes, underperformance of 8CPU SF12 is statistically significant". As for the rest about doublings, I used that as an estimate for CCRL blitz conditions and usual testing. When they populate the list with (fairly tested) 8-core engines, you will see what I am talking about.
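
The doubling estimate can be set against the 1 vs 8 core numbers posted in this thread. A rough sketch only: the 80 Elo floor and the 120 - 140 Elo "more likely" range are taken from the quoted estimate, the measured values from the posted matches at TC = 2m+1s, and the names (`QUOTED_FLOOR`, `compare`, and so on) are invented for illustration.

```python
CORE_DOUBLINGS = 3            # 1 -> 2 -> 4 -> 8 cores
QUOTED_FLOOR = 80.0           # "at very least 80 Elo points 1 core -> 8 cores"
QUOTED_LIKELY = (120.0, 140.0)  # "more likely 120 - 140 Elo points"

# Measured 1 vs 8 core gains from the posted 200-game matches
measured = {
    "Stockfish 11": 147.2,
    "Stockfish 12": 77.7,
    "Stockfish 051020": 54.3,
}


def compare(elo_gain: float) -> dict:
    """Elo per core doubling, and how the gain sits relative to the quoted estimate."""
    return {
        "per_doubling": elo_gain / CORE_DOUBLINGS,
        "meets_floor": elo_gain >= QUOTED_FLOOR,
        "at_or_above_likely_range": elo_gain >= QUOTED_LIKELY[0],
    }


for name, gain in measured.items():
    r = compare(gain)
    print(f"{name}: {gain:+.1f} Elo, {r['per_doubling']:.1f} per core doubling, "
          f"meets 80 Elo floor: {r['meets_floor']}")
```

This only restates the posted point estimates per doubling; it says nothing about error margins or about other time controls.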
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Cherry on Top.

Post by mwyoung »

Laskos wrote: Sun Oct 11, 2020 6:56 pm
mwyoung wrote: Sun Oct 11, 2020 6:09 pm
Laskos wrote: Sun Oct 11, 2020 5:56 pm
mwyoung wrote: Sun Oct 11, 2020 5:40 pm
mwyoung wrote: Sat Oct 10, 2020 11:22 pm
mwyoung wrote: Sat Oct 10, 2020 7:23 am Lasko's Law----What's not clear? 3 doublings in cores mean nowadays at least 2.5 real effective doublings in TC. Each effective doubling in TC in these blitz conditions means at very least 40 Elo points, therefore at very least 80 Elo points 1 core -> 8 cores. In fact more likely 120 - 140 Elo points. That result posted in OP and discrepancy beyond doubt break the Elo model.

It is clear to me that Stockfish NNUE does not obey Lasko's law as stated above. CCRL most likely does not have flawed testing.. And as suspected. The issues is with Stockfish NNUE. It took me many hours to testing to show this result, and the full results will be shown soon. When the testing is completed. The bottom line is the issue is with Stockfish NNUE, and not with CCRL testing. Full results coming soon. As you know testing can take days to answer this kind of anomaly, or false assumption.
All results were tested under the same conditions with a TC = 2m+1s. With the same book, and settings, with Perfect Book 2019. CPU was a 2950x with all cores locked to 4.1 Ghz.

Stockfish 11 with a classical evaluation obeys Lasko's Law. But assuming Stockfish 12 a hybrid with the new NN evaluation will also obey Stockfish's classical pattern was in error. Stockfish 12 does not obey Lasko's Law.

I tested two versions of Stockfish 12, version 12, and version 12 (051020). To make sure this behavior was not with just the original Stockfish 12.

Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo

Code: Select all

Result:
------------------------------------------------------------------------------------------------
  #  name                                games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 11 64 POPCNT dup 8 cores    200      81     118       1   140.0   100.0   147.2
  2. Stockfish 11 64 POPCNT dup 1 core     200       1     118      81    60.0     0.0  -147.2

Cross table:
------------------------------------------------------------------------------------------------
  #  name                                   score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 11 64 POPCNT dup 8 cores     140.0     200                                                                                                                                                                                                        x 111===1===111=1==1=1==1===11=1==11=====1====11=11==1=111==111====1111==11===1==11=========1===1====1111=111=1======1=1=1=0===1==1==1====11=11==11=11=1=11=1==1===1===1=11====11=====1==11=1==11==11==1==
  2. Stockfish 11 64 POPCNT dup 1 core       60.0     200 000===0===000=0==0=0==0===00=0==00=====0====00=00==0=000==000====0000==00===0==00=========0===0====0000=000=0======0=0=0=1===0==0==0====00=00==00=00=0=00=0==0===0===0=00====00=====0==00=0==00==00==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                                  nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 11 64 POPCNT dup 8 cores     35492K    13595007     31.7      2.6     61.0    159.1
  2. Stockfish 11 64 POPCNT dup 1 core       4551K     1662287     27.1      2.7     61.1    167.4
     all ---                                19530K     7478160     29.4      2.7     61.0    163.3

Tournament finished! Elapsed: 18:23:36

Code: Select all

Result:
------------------------------------------------------------------------------------------
  #  name                          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 8 cores    200      31     169       0   115.5   100.0    54.3
  2. Stockfish 051020 dup 1 core     200       0     169      31    84.5     0.0   -54.3

Cross table:
------------------------------------------------------------------------------------------
  #  name                             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 8 cores     115.5     200                                                                                                                                                                                                        x =1==1======1========111=====1=====1=================1====1======1===1==1========1=====================1=1======1==========1==1====1========1========1==1=====1============1===1==========1==11====1==1==
  2. Stockfish 051020 dup 1 core       84.5     200 =0==0======0========000=====0=====0=================0====0======0===0==0========0=====================0=0======0==========0==0====0========0========0==0=====0============0===0==========0==00====0==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 8 cores     30556K    10784132     37.2      2.8     49.1    139.1
  2. Stockfish 051020 dup 1 core       3659K     1282020     29.7      2.9     49.2    140.4
     all ---                          16695K     6012180     33.4      2.8     49.1    139.8

Tournament finished! Elapsed: 15:50:53

Code: Select all

Result:
--------------------------------------------------------------------------------------
  #  name                      games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12 dup 8 cores    200      44     156       0   122.0   100.0    77.7
  2. Stockfish 12 dup 1 core     200       0     156      44    78.0     0.0   -77.7

Cross table:
--------------------------------------------------------------------------------------
  #  name                         score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12 dup 8 cores     122.0     200                                                                                                                                                                                                        x 1===1===============1======11======1==1===========1=1=====1===11===1===11==1===1========1=1=1==1=1========1===1=========1==1===========1===11======1====1====1==1====1=====1======1====111==1=11===1===1
  2. Stockfish 12 dup 1 core       78.0     200 0===0===============0======00======0==0===========0=0=====0===00===0===00==0===0========0=0=0==0=0========0===0=========0==0===========0===00======0====0====0==0====0=====0======0====000==0=00===0===0                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                        nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12 dup 8 cores     28475K     9905390     35.2      2.9     48.0    137.8
  2. Stockfish 12 dup 1 core       3474K     1180298     29.1      2.9     48.1    141.4
     all ---                      15585K     5486571     32.1      2.9     48.0    139.6

Tournament finished! Elapsed: 15:46:49
Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo
Stockfish 051020 8 vs 16 cores +1.7 Elo

Code: Select all

Result:
-------------------------------------------------------------------------------------------
  #  name                           games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 16 cores    200       2     197       1   100.5    71.8     1.7
  2. Stockfish 051020 dup 8 cores     200       1     197       2    99.5    28.2    -1.7

Cross table:
-------------------------------------------------------------------------------------------
  #  name                              score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 16 cores     100.5     200                                                                                                                                                                                                        x ==============================================================0=1============================================================================1==========================================================
  2. Stockfish 051020 dup 8 cores       99.5     200 ==============================================================1=0============================================================================0==========================================================                                                                                                                                                                                                        x

Tech:
-------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                             nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 16 cores     57514K    20152492     41.9      2.9     49.5    141.2
  2. Stockfish 051020 dup 8 cores      29868K    10471507     39.0      2.9     49.5    141.2
     all ---                           42662K    15311939     40.4      2.9     49.5    141.2
All in all quickly glancing:

Openings, draw rate, contempt in SF11, weaker SF11 --- all combined are not ruled out as the main culprits here. There is no Laskos rule for 41:0 =159 results and even less for 2:1 =197 result. W/L is nuts in all you examples 8 vs 1 core. Sure, a worse multicore scaling of NNUE SF is quite possible here, but I guess one would need a bit clearer matches. Anyway, thanks for this long test.
I am all for more testing. Because "Houston, we've had a problem"

And I used your words exactly "Lasko's Law" on why you said that CCRL are... ""Yes, underperformance of 8CPU SF12 is statistically significant"-Lasko

and I asked WHY?

And then came Lasko's Law with no Data! Agreeing with the Flawed testing of CCRL. And this was your PROOF!

Lasko's Law----"What's not clear? 3 doublings in cores mean nowadays at least 2.5 real effective doublings in TC. Each effective doubling in TC in these blitz conditions means at very least 40 Elo points, therefore at very least 80 Elo points 1 core -> 8 cores. In fact more likely 120 - 140 Elo points. That result posted in OP and discrepancy beyond doubt break the Elo model." :lol:

As I told a member of CCRL about this thread....."As always on CCC, too much speculation and not enough data."
Yes, I agree with my statement "Yes, underperformance of 8CPU SF12 is statistically significant". As for the rest, the doublings: I used that as an estimate for CCRL blitz conditions and usual testing. When they populate the list with 8-core engines (fairly tested) you will see what I am talking about.
We will see, but your PROOF has been busted. And whatever the cause of the SF 12 result in CCRL testing, I think we can now both agree that "CCRL flawed testing : SF12 above SF12 8CPU" is clearly unfair in the light of the data we have.

I think CCRL takes pride in their work, whether you agree or disagree with their methods of testing.

Otherwise you would not see this from CCRL.

Modern Times-"It is doing my head in for sure."....."The somewhat unstructured and ad-hoc nature of our testing doesn't help in this situation either, although with enough games that usually eventually resolves itself. To get to the bottom of it you need to do some structured testing with exactly the same opponents, same hardware and testing conditions - which you and others have done or are doing."
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Cherry on Top.

Post by Laskos »

It could be that SF NNUE scales badly to 8 cores, I haven't ruled that out. Maybe you are onto something, if the high draw rate is not explained by contempt and absolute strength.

1. stockfish_20100519 4 cores 62.0/100 +27 -3 =70
2. stockfish_20100519 1 core 38.0/100 +3 -27 =70

and (SF11 with 0 contempt)

1. stockfish_11 4 cores 70.0/100 +42 -2 =56
2. stockfish_11 1 core 30.0/100 +2 -42 =56

Too few (fast) games, but the draw rate is already significantly higher with NNUE with this small sample, compressing the Elo difference.
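
For anyone who wants to check the compression effect on these two mini-matches, here is a minimal sketch, assuming the usual logistic Elo formula (Elo = 400 * log10(score / (1 - score))). The function name `elo_diff` is only illustrative and not taken from any tool used in this thread. The same formula appears to reproduce the elo+/- column in the tables posted earlier (for example +81 =118 -1 gives about 147.2).

```python
import math


def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Elo difference implied by a match score under the logistic Elo model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    if score <= 0.0 or score >= 1.0:
        # The logistic model gives an unbounded difference for 0% or 100% scores.
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))


# (wins, draws, losses) from the point of view of the 4-core side,
# copied from the two short matches quoted above.
matches = {
    "stockfish_20100519 4 cores vs 1 core": (27, 70, 3),
    "stockfish_11 (contempt 0) 4 cores vs 1 core": (42, 56, 2),
}

for name, (w, d, l) in matches.items():
    draw_rate = d / (w + d + l)
    print(f"{name}: draw rate {draw_rate:.0%}, Elo {elo_diff(w, d, l):+.1f}")
```

With only 100 games per match the error margins are wide, so this only illustrates the direction of the effect: more draws at a similar win/loss ratio pull the score toward 50% and the Elo difference toward zero.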
Last edited by Laskos on Sun Oct 11, 2020 8:49 pm, edited 1 time in total.
MMarco
Posts: 212
Joined: Sun Apr 12, 2020 1:09 am
Full name: Marc-O Moisan-Plante

Re: Cherry on Top.

Post by MMarco »

mwyoung wrote:

Code: Select all

Result:
------------------------------------------------------------------------------------------
  #  name                          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 8 cores    200      31     169       0   115.5   100.0    54.3
  2. Stockfish 051020 dup 1 core     200       0     169      31    84.5     0.0   -54.3

Code: Select all

Result:
--------------------------------------------------------------------------------------
  #  name                      games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12 dup 8 cores    200      44     156       0   122.0   100.0    77.7
  2. Stockfish 12 dup 1 core     200       0     156      44    78.0     0.0   -77.7
Laskos wrote: Sun Oct 11, 2020 5:56 pm ... W/L is nuts in all you examples 8 vs 1 core.
I find it amazing too that SF NNUE 8 cores goes undefeated in 400 games! +75,=325,-0 (59.375%).

I got the same kind of results here, against an opponent that was almost on par, and with a higher score (66.5%), most likely due to the "low draws" openings:

100s + 1s, Ryzen 9 4900H, Hash=64 (1 core), Hash=256 (4 cores), Hash=512 (8 cores), Syzygy 5-men. TCEC 9 SuFi openings (50 pos.):

Code: Select all

   # PLAYER                        :  RATING  ERROR  PLAYED    (%)   CFS    W    D    L   D(%)
   1 Stockfish 12 8CPU             :    99.4   29.0     100  66.50   100   33   67    0  67.00
   2 Stockfish 12                  :     0.0   25.0     100  53.50    95   16   75    9  75.00
   3 Ethereal 12.75 SFNNUE 4CPU    :   -25.6   16.2     200  40.00   ---    9  142   49  71.00

White advantage = 71.76 +/- 12.99
Draw rate (equal opponents) = 88.08 % +/- 4.74

Code: Select all

Score of Ethereal 12.75 SFNNUE vs Stockfish 12: 9 - 16 - 75 [0.465]
...      Ethereal 12.75 SFNNUE playing White: 8 - 5 - 37  [0.530] 50
...      Ethereal 12.75 SFNNUE playing Black: 1 - 11 - 38  [0.400] 50
...      White vs Black: 19 - 6 - 75  [0.565] 100
Elo difference: -24.4 +/- 33.9, LOS: 8.1 %, DrawRatio: 75.0 %
100 of 100 games finished.

Code: Select all

Score of Stockfish 12 8CPU vs Ethereal 12.75 SFNNUE 4CPU: 33 - 0 - 67 [0.665]
...      Stockfish 12 8CPU playing White: 29 - 0 - 21  [0.790] 50
...      Stockfish 12 8CPU playing Black: 4 - 0 - 46  [0.540] 50
...      White vs Black: 29 - 4 - 67  [0.625] 100
Elo difference: 119.1 +/- 36.1, LOS: 100.0 %, DrawRatio: 67.0 %
100 of 100 games finished.
Games: https://gofile.io/d/aPuOyz
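
The LOS figures in these outputs can be checked by hand. Below is a minimal sketch, assuming the common normal approximation that uses only the decisive games: LOS = 0.5 * (1 + erf((W - L) / sqrt(2 * (W + L)))). The function name `los` is illustrative, and whether the match runner uses exactly this formula is an assumption, although it does appear to agree with the values printed above.

```python
import math


def los(wins: int, losses: int) -> float:
    """Likelihood of superiority from decisive games only (normal approximation)."""
    decisive = wins + losses
    if decisive == 0:
        # No decisive games: nothing separates the two engines.
        return 0.5
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * decisive)))


# Wins and losses from the first engine's point of view, copied from the outputs above.
print(f"Ethereal 12.75 SFNNUE vs Stockfish 12: LOS {los(9, 16):.1%}")
print(f"Stockfish 12 8CPU vs Ethereal 12.75 SFNNUE 4CPU: LOS {los(33, 0):.1%}")
```

Note that draws do not enter this formula at all, which is why a 2:1 =197 match still gives an LOS well away from 50% while the Elo difference stays close to zero.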
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Cherry on Top.

Post by mwyoung »

Laskos wrote: Sun Oct 11, 2020 8:46 pm
mwyoung wrote: Sun Oct 11, 2020 7:20 pm
Laskos wrote: Sun Oct 11, 2020 6:56 pm
mwyoung wrote: Sun Oct 11, 2020 6:09 pm
Laskos wrote: Sun Oct 11, 2020 5:56 pm
mwyoung wrote: Sun Oct 11, 2020 5:40 pm
mwyoung wrote: Sat Oct 10, 2020 11:22 pm
mwyoung wrote: Sat Oct 10, 2020 7:23 am Lasko's Law----What's not clear? 3 doublings in cores mean nowadays at least 2.5 real effective doublings in TC. Each effective doubling in TC in these blitz conditions means at very least 40 Elo points, therefore at very least 80 Elo points 1 core -> 8 cores. In fact more likely 120 - 140 Elo points. That result posted in OP and discrepancy beyond doubt break the Elo model.

It is clear to me that Stockfish NNUE does not obey Lasko's law as stated above. CCRL most likely does not have flawed testing.. And as suspected. The issues is with Stockfish NNUE. It took me many hours to testing to show this result, and the full results will be shown soon. When the testing is completed. The bottom line is the issue is with Stockfish NNUE, and not with CCRL testing. Full results coming soon. As you know testing can take days to answer this kind of anomaly, or false assumption.
All results were tested under the same conditions with a TC = 2m+1s. With the same book, and settings, with Perfect Book 2019. CPU was a 2950x with all cores locked to 4.1 Ghz.

Stockfish 11 with a classical evaluation obeys Lasko's Law. But assuming Stockfish 12 a hybrid with the new NN evaluation will also obey Stockfish's classical pattern was in error. Stockfish 12 does not obey Lasko's Law.

I tested two versions of Stockfish 12, version 12, and version 12 (051020). To make sure this behavior was not with just the original Stockfish 12.

Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo

Code: Select all

Result:
------------------------------------------------------------------------------------------------
  #  name                                games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 11 64 POPCNT dup 8 cores    200      81     118       1   140.0   100.0   147.2
  2. Stockfish 11 64 POPCNT dup 1 core     200       1     118      81    60.0     0.0  -147.2

Cross table:
------------------------------------------------------------------------------------------------
  #  name                                   score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 11 64 POPCNT dup 8 cores     140.0     200                                                                                                                                                                                                        x 111===1===111=1==1=1==1===11=1==11=====1====11=11==1=111==111====1111==11===1==11=========1===1====1111=111=1======1=1=1=0===1==1==1====11=11==11=11=1=11=1==1===1===1=11====11=====1==11=1==11==11==1==
  2. Stockfish 11 64 POPCNT dup 1 core       60.0     200 000===0===000=0==0=0==0===00=0==00=====0====00=00==0=000==000====0000==00===0==00=========0===0====0000=000=0======0=0=0=1===0==0==0====00=00==00=00=0=00=0==0===0===0=00====00=====0==00=0==00==00==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                                  nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 11 64 POPCNT dup 8 cores     35492K    13595007     31.7      2.6     61.0    159.1
  2. Stockfish 11 64 POPCNT dup 1 core       4551K     1662287     27.1      2.7     61.1    167.4
     all ---                                19530K     7478160     29.4      2.7     61.0    163.3

Tournament finished! Elapsed: 18:23:36

Code: Select all

Result:
------------------------------------------------------------------------------------------
  #  name                          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 8 cores    200      31     169       0   115.5   100.0    54.3
  2. Stockfish 051020 dup 1 core     200       0     169      31    84.5     0.0   -54.3

Cross table:
------------------------------------------------------------------------------------------
  #  name                             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 8 cores     115.5     200                                                                                                                                                                                                        x =1==1======1========111=====1=====1=================1====1======1===1==1========1=====================1=1======1==========1==1====1========1========1==1=====1============1===1==========1==11====1==1==
  2. Stockfish 051020 dup 1 core       84.5     200 =0==0======0========000=====0=====0=================0====0======0===0==0========0=====================0=0======0==========0==0====0========0========0==0=====0============0===0==========0==00====0==0==                                                                                                                                                                                                        x

Tech:
------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 8 cores     30556K    10784132     37.2      2.8     49.1    139.1
  2. Stockfish 051020 dup 1 core       3659K     1282020     29.7      2.9     49.2    140.4
     all ---                          16695K     6012180     33.4      2.8     49.1    139.8

Tournament finished! Elapsed: 15:50:53

Code: Select all

Result:
--------------------------------------------------------------------------------------
  #  name                      games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12 dup 8 cores    200      44     156       0   122.0   100.0    77.7
  2. Stockfish 12 dup 1 core     200       0     156      44    78.0     0.0   -77.7

Cross table:
--------------------------------------------------------------------------------------
  #  name                         score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12 dup 8 cores     122.0     200                                                                                                                                                                                                        x 1===1===============1======11======1==1===========1=1=====1===11===1===11==1===1========1=1=1==1=1========1===1=========1==1===========1===11======1====1====1==1====1=====1======1====111==1=11===1===1
  2. Stockfish 12 dup 1 core       78.0     200 0===0===============0======00======0==0===========0=0=====0===00===0===00==0===0========0=0=0==0=0========0===0=========0==0===========0===00======0====0====0==0====0=====0======0====000==0=00===0===0                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                        nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12 dup 8 cores     28475K     9905390     35.2      2.9     48.0    137.8
  2. Stockfish 12 dup 1 core       3474K     1180298     29.1      2.9     48.1    141.4
     all ---                      15585K     5486571     32.1      2.9     48.0    139.6

Tournament finished! Elapsed: 15:46:49
Stockfish 11 1 vs 8 cores +147.2 Elo
Stockfish 12 1 vs 8 cores +77.7 Elo
Stockfish 051020 1 vs 8 cores +54.3 Elo
Stockfish 051020 8 vs 16 cores +1.7 Elo

Code: Select all

Result:
-------------------------------------------------------------------------------------------
  #  name                           games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 051020 dup 16 cores    200       2     197       1   100.5    71.8     1.7
  2. Stockfish 051020 dup 8 cores     200       1     197       2    99.5    28.2    -1.7

Cross table:
-------------------------------------------------------------------------------------------
  #  name                              score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 051020 dup 16 cores     100.5     200                                                                                                                                                                                                        x ==============================================================0=1============================================================================1==========================================================
  2. Stockfish 051020 dup 8 cores       99.5     200 ==============================================================1=0============================================================================0==========================================================                                                                                                                                                                                                        x

Tech:
-------------------------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name                             nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 051020 dup 16 cores     57514K    20152492     41.9      2.9     49.5    141.2
  2. Stockfish 051020 dup 8 cores      29868K    10471507     39.0      2.9     49.5    141.2
     all ---                           42662K    15311939     40.4      2.9     49.5    141.2
All in all quickly glancing:

Openings, draw rate, contempt in SF11, weaker SF11 --- all combined are not ruled out as the main culprits here. There is no Laskos rule for 41:0 =159 results and even less for 2:1 =197 result. W/L is nuts in all you examples 8 vs 1 core. Sure, a worse multicore scaling of NNUE SF is quite possible here, but I guess one would need a bit clearer matches. Anyway, thanks for this long test.
I am all for more testing. Because "Houston, we've had a problem"

And I used your words exactly "Lasko's Law" on why you said that CCRL are... ""Yes, underperformance of 8CPU SF12 is statistically significant"-Lasko

and I asked WHY?

And then came Lasko's Law with no Data! Agreeing with the Flawed testing of CCRL. And this was your PROOF!

Lasko's Law----"What's not clear? 3 doublings in cores mean nowadays at least 2.5 real effective doublings in TC. Each effective doubling in TC in these blitz conditions means at very least 40 Elo points, therefore at very least 80 Elo points 1 core -> 8 cores. In fact more likely 120 - 140 Elo points. That result posted in OP and discrepancy beyond doubt break the Elo model." :lol:

As I told a member of CCRL about this thread....."As always on CCC, too much speculation and not enough data."
Yes, I agree with my statement "Yes, underperformance of 8CPU SF12 is statistically significant". As for the rest about doublings, I used that as an estimate for CCRL blitz conditions and usual testing. When they populate the list with 8-core engines (fairly tested), you will see what I am talking about.
We will see, but your PROOF has been busted. And whatever the cause of the SF 12 result in CCRL testing, I think we can now both agree that "CCRL flawed testing : SF12 above SF12 8CPU" is clearly unfair in light of the data we have.

I think CCRL takes pride in their work. Agree or disagree with their methods of testing.

Or you would not see this by CCRL.

Modern Times-"It is doing my head in for sure."....."The somewhat unstructured and ad-hoc nature of our testing doesn't help in this situation either, although with enough games that usually eventually resolves itself. To get to the bottom of it you need to do some structured testing with exactly the same opponents, same hardware and testing conditions - which you and others have done or are doing."

It could be that SF NNUE scales badly to 8 cores, I haven't ruled that out. Maybe you are onto something, if the high draw rate is not explained by contempt and absolute strength.

1. stockfish_20100519 4 cores 62.0/100 +27 -3 =70
2. stockfish_20100519 1 core 38.0/100 +3 -27 =70

and (SF11 with 0 contempt)

1. stockfish_11 4 cores 70.0/100 +42 -2 =56
2. stockfish_11 1 core 30.0/100 +2 -42 =56

Too few (fast) games, but the draw rate is already significantly higher with NNUE with this small sample, compressing the Elo difference.
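For anyone who wants to check the two mini-matches above, here is a small Python sketch that turns a +W -L =D result into an Elo difference and a draw rate using the standard logistic Elo formula. The function names are made up for this example, and no error margin is computed, so with 100 games each the numbers are only indicative.

```python
import math


def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score (logistic model).

    Not defined for a 0% or 100% score.
    """
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))


def draw_rate(wins, draws, losses):
    """Fraction of games that ended in a draw."""
    return draws / (wins + draws + losses)


# (wins, draws, losses) from the 4 cores vs 1 core results posted above
results = {
    "stockfish_20100519": (27, 70, 3),
    "stockfish_11": (42, 56, 2),
}

for name, wdl in results.items():
    print(f"{name}: {elo_diff(*wdl):+.1f} Elo, draw rate {draw_rate(*wdl):.0%}")
```

The NNUE build has the higher draw rate and the smaller Elo gap from 1 to 4 cores, which is the compression effect described above.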
"It could be that SF NNUE scales badly to 8 cores, I haven't ruled that out. Maybe you are into something, if the high draw rate is not explained by contempt and absolute strength."

I agree, and Stockfish 12 and above scales even worse from 8 cores to 16 cores! +1.7 Elo!!!
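As a rough check on how much the +1.7 Elo figure can carry, here is a Python sketch that puts an approximate 95% margin on the 2:1 =197 result mentioned earlier in the thread, taking the 16-core side as the one with 2 wins. It assumes a normal approximation on the per-game score, which is crude with only 3 decisive games, and the function names are made up for this example.

```python
import math


def elo_from_score(score):
    """Elo difference implied by an expected score (logistic model)."""
    return 400.0 * math.log10(score / (1.0 - score))


def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (elo, low, high) using a normal approximation on the mean score."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Variance of a single game result, where a game scores 1, 0.5 or 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    se = math.sqrt(var / n)  # standard error of the mean score
    return (elo_from_score(score),
            elo_from_score(score - z * se),
            elo_from_score(score + z * se))


elo, low, high = elo_with_margin(2, 197, 1)
print(f"16 vs 8 cores: {elo:+.1f} Elo, approx. 95% range {low:+.1f} to {high:+.1f}")
```

The range straddles zero, so this match on its own does not separate 16 cores from 8 cores; it only bounds the gap to a few Elo either way.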

And that is why I assume nothing, and test what I think to be true.

I have been testing chess engines for 40 years.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: CCRL flawed testing : SF12 above SF12 8CPU

Post by Laskos »

mwyoung wrote: Sun Oct 11, 2020 9:00 pm

"It could be that SF NNUE scales badly to 8 cores, I haven't ruled that out. Maybe you are into something, if the high draw rate is not explained by contempt and absolute strength."

I agree, and Stockfish 12 and above scales even worse from 8 cores to 16 cores!

And that is why I assume nothing, and test what I think to be true.
I am not sure what I have assumed. I said that the result is statistically anomalous (not a fluke) and that the choice of opponents can break the Elo model. I didn't rule out anything, like bad scaling to multicore of SF12.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: CCRL flawed testing : SF12 above SF12 8CPU

Post by mwyoung »

Laskos wrote: Sun Oct 11, 2020 9:12 pm
mwyoung wrote: Sun Oct 11, 2020 9:00 pm

"It could be that SF NNUE scales badly to 8 cores, I haven't ruled that out. Maybe you are into something, if the high draw rate is not explained by contempt and absolute strength."

I agree, and Stockfish 12 and above scales even worse from 8 cores to 16 cores!

And that is why I assume nothing, and test what I think to be true.
I am not sure what I have assumed. I said that the result is statistically anomalous (not a fluke) and that the choice of opponents can break the Elo model. I didn't rule out anything, like bad scaling to multicore of SF12.
This is easy to answer. You have a great mind, and you are clearly very smart. But your bias and your laziness clearly cloud your judgement. And I always read your comments, as it makes me a better critical thinker!

And I did notice that you chopped my response in this thread!
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.