Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Madeleine Birchfield
Posts: 76
Joined: Tue Sep 29, 2020 2:29 pm
Location: Dublin, Ireland
Full name: Madeleine Birchfield

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Madeleine Birchfield » Wed Sep 30, 2020 1:40 pm

Vinvin wrote:
Wed Sep 30, 2020 9:51 am
"As chess is drawish by nature."
Note 1: this is a belief, not a proven fact. We have seen this kind of claim at different times:
- In the 1970s, when Russian GMs made more and more draws.
- When Rybka (around 2010) was dominating the chess scene, the number of draws was rising so much that many people said we had reached a limit where only draws were possible.
- and so on ...
But now top engines are 500 Elo above Rybka and 1000 Elo above the level of play of the '70s.
Chess is drawish at the highest level with very long time controls. See correspondence chess.

Chess in fast time controls is not drawish at all, filled with many wins and losses. So is chess with weak players, or chess with a huge strength differential between the players.

Chessqueen
Posts: 1078
Joined: Wed Sep 05, 2018 12:16 am
Full name: Nancy M Pichardo

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Chessqueen » Mon Oct 05, 2020 3:34 pm

Laskos wrote:
Wed Sep 02, 2020 8:42 am
DrCliche wrote:
Wed Sep 02, 2020 6:46 am
Sure enough that three draws (lol) at TCEC shouldn't measurably change our extreme confidence, yes.
Just a different worldview and perceptions. I guess the priors of both of you have a similar mean but very different widths. Both of you expected SF NNUE to be stronger than SF Classic by a similar margin, say 124 Elo points, but with different emotional and mental attitudes about your confidence in that.

I pictured the priors here:

[Image: plot of the two priors]

After these 3 TCEC draws, the posterior estimate of the difference between SF NNUE and SF Classic in his case dropped to 32 Elo points from a priori 124 Elo points, in your case dropped to 113 Elo points from the same 124 a priori Elo points. So he is legit in asking "Are we sure that Stockfish NNUE is better than the Normal Stockfish ?" and you are right in replying "Sure enough that three draws (lol) at TCEC shouldn't measurably change our extreme confidence, yes."
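
For anyone who wants to play with the idea, here is a minimal sketch of that kind of update. It is not the calculation behind the 32 and 113 Elo figures, since the post does not give the model used. The BayesElo-style draw model, the `draw_elo=200` value and the two prior widths (30 and 200 Elo) are all assumptions chosen for illustration, so the numbers it prints will differ from the ones quoted.

```python
import math


def draw_probability(elo_diff, draw_elo=200.0):
    """P(draw) under a BayesElo-style model. draw_elo is an assumed value."""
    p_win = 1.0 / (1.0 + 10.0 ** ((draw_elo - elo_diff) / 400.0))
    p_loss = 1.0 / (1.0 + 10.0 ** ((draw_elo + elo_diff) / 400.0))
    return 1.0 - p_win - p_loss


def posterior_mean_after_draws(prior_mean, prior_sd, n_draws,
                               step=1.0, span=1500.0):
    """Grid approximation of the posterior mean Elo difference.

    Prior: Gaussian(prior_mean, prior_sd) on the Elo difference.
    Data: n_draws drawn games, nothing else.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    n_points = int(2 * span / step) + 1
    for i in range(n_points):
        d = -span + i * step
        prior = math.exp(-0.5 * ((d - prior_mean) / prior_sd) ** 2)
        likelihood = draw_probability(d) ** n_draws
        weight = prior * likelihood
        total_weight += weight
        weighted_sum += weight * d
    return weighted_sum / total_weight


# Same prior mean (124 Elo), different widths (both widths are assumptions).
narrow = posterior_mean_after_draws(124.0, 30.0, n_draws=3)
wide = posterior_mean_after_draws(124.0, 200.0, n_draws=3)
print(f"narrow prior -> {narrow:.0f} Elo, wide prior -> {wide:.0f} Elo")
```

The qualitative point survives any reasonable choice of parameters: three draws pull both estimates toward zero, and the wider the prior, the further the estimate moves.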

:twisted:
Can you post a better photo of yourself? You can probably find a better one from when you were a child :lol:

Chessqueen
Posts: 1078
Joined: Wed Sep 05, 2018 12:16 am
Full name: Nancy M Pichardo

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Chessqueen » Mon Oct 05, 2020 3:37 pm

Madeleine Birchfield wrote:
Wed Sep 30, 2020 1:40 pm
Vinvin wrote:
Wed Sep 30, 2020 9:51 am
"As chess is drawish by nature."
Note 1: this is a belief, not a proven fact. We have seen this kind of claim at different times:
- In the 1970s, when Russian GMs made more and more draws.
- When Rybka (around 2010) was dominating the chess scene, the number of draws was rising so much that many people said we had reached a limit where only draws were possible.
- and so on ...
But now top engines are 500 Elo above Rybka and 1000 Elo above the level of play of the '70s.
Chess is drawish at the highest level with very long time controls. See correspondence chess.

Chess in fast time controls is not drawish at all, filled with many wins and losses. So is chess with weak players, or chess with a huge strength differential between the players.
Stockfish NNUE is NOT really dominating LCZero :shock: ==> https://tcec-chess.com/live.html

mwyoung
Posts: 2433
Joined: Wed May 12, 2010 8:00 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by mwyoung » Mon Oct 05, 2020 3:49 pm

mwyoung wrote:
Tue Sep 08, 2020 3:29 pm
Milos wrote:
Tue Sep 08, 2020 1:10 pm
mwyoung wrote:
Sat Sep 05, 2020 3:49 am
Chessqueen wrote:
Sat Sep 05, 2020 3:21 am
Dann Corbit wrote:
Wed Sep 02, 2020 11:55 pm
Stockfish nnue has a secret weapon. The Kamehameha blast. Of course, he has to go to level 5 before he can use it. You don't just go Kamehameha blasting stuff willy-nilly.
At the very end it will be LCZero Vs Stockfish NNUE, but I predict a very close encounter of the 3rd kind, LCZero from Planet 1140b Vs StockFish NNUE from Planet Earth, Now I am more convinced than ever :roll:
https://tcec-chess.com/live.html
I agree. I just played 200 games of Stockfish 12 vs. Lc0 0.26.2 at 3m+2s, and Stockfish 12 won by only 24 Elo. In past testing we can also see how badly Stockfish NNUE has scaled at longer time controls.

Both are the best chess engines, and the winner may only be decided by hardware and time controls.

The sprinter Stockfish 12 vs. the marathon runner Lc0: who wins may depend on the distance of the race!


Lc0 is clearly improving faster than Stockfish at this point in time, even at 3m+2s, compared with past matches at the same time control.

Result:
--------------------------------------------------------------------------
  #  name          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12    200      16     182       2   107.0   100.0    24.4
  2. Lc0 v0.26.2     200       2     182      16    93.0     0.0   -24.4

Cross table:
--------------------------------------------------------------------------
  #  name             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12     107.0     200                                                                                                                                                                                                        x =====1==1===1====================1======1========11========================1==========================1============================================1====1===1=========1==0================1===1=1==0====
  2. Lc0 v0.26.2       93.0     200 =====0==0===0====================0======0========00========================0==========================0============================================0====0===0=========0==1================0===0=0==1====                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12    125173K    26565996     42.5      4.7     54.1    255.1
  2. Lc0 v0.26.2        101K       20342     10.0      4.9     54.1    267.2
     all ---          61216K    12984844     26.3      4.8     54.1    261.2
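
As a cross-check on the result table above, here is a minimal sketch that recomputes the Elo difference from the 16 wins, 182 draws and 2 losses, and adds a rough 95% interval so the sample-size question can be judged with numbers. The interval uses a plain normal approximation and the LOS line uses the commonly quoted erf approximation. Both are assumptions about method, not necessarily what the match tool itself does.

```python
import math


def elo_from_score(score_fraction):
    """Elo difference implied by a score fraction under the logistic model."""
    return 400.0 * math.log10(score_fraction / (1.0 - score_fraction))


def match_summary(wins, draws, losses, z=1.96):
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    # Per-game variance of the result, where a game scores 1, 0.5 or 0.
    mean_square = (wins + 0.25 * draws) / games
    std_err = math.sqrt((mean_square - score ** 2) / games)
    # Likelihood of superiority, using the usual erf approximation.
    los = 0.5 * (1.0 + math.erf((wins - losses)
                                / math.sqrt(2.0 * (wins + losses))))
    return {
        "elo": elo_from_score(score),
        "elo_low": elo_from_score(score - z * std_err),
        "elo_high": elo_from_score(score + z * std_err),
        "los": los,
    }


summary = match_summary(wins=16, draws=182, losses=2)
print(f"Elo {summary['elo']:.1f} "
      f"(95% interval {summary['elo_low']:.1f} to {summary['elo_high']:.1f}), "
      f"LOS {100 * summary['los']:.2f}%")
```

The point estimate agrees with the 24.4 in the table. Under these assumptions the interval stays above zero, which only says that Stockfish 12 was ahead at this time control and says nothing about how the gap scales at longer ones.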
For years people have come up with the BS theory that A/B engines tuned in micro-bullet would be weak in LTC, and for years they have been bluntly proven wrong. The impact of eval on horizon effects is minimal, and it doesn't change whether you search to depth 20 or depth 100. SF-NN search is SF and SF is proven to scale better than Lc0 (and as a matter of fact any MCTS engine) in LTC. Ergo SF-NN scales better than Lc0 in LTC.
Your claims are simply BS reflecting your cluelessness in the matter. You effectively draw conclusions from STC (just because it's not micro-bullet but blitz instead) with a sample size that is a joke.
The result in the superfinal will be a much worse sweep than last year. And then people like you will be astonished and will come up with all kinds of ridiculous excuses to justify what is basically their cluelessness.
The only one who is clueless here is you. I test at longer time controls as well as short ones, with anything from 1 core up to 32 threads.

And I am not talking about A/B engine only testing at micro-bullet. And I never have. I am talking about NNUE! And my sample size is huge. This is not my only test. I test non-stop.

My conclusion is what the data is showing us, and if it changes, everyone will see that too. I test openly, and on video.


"SF-NN search is SF and SF is proven to scale better than Lc0" :lol:
Milos, Lc0 looks good so far, as expected.
Professing themselves to be wise, they became fools,
take on me. Foes 0.

mwyoung
Posts: 2433
Joined: Wed May 12, 2010 8:00 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by mwyoung » Mon Oct 05, 2020 3:56 pm

mwyoung wrote:
Mon Oct 05, 2020 3:49 pm
mwyoung wrote:
Tue Sep 08, 2020 3:29 pm
Milos wrote:
Tue Sep 08, 2020 1:10 pm
mwyoung wrote:
Sat Sep 05, 2020 3:49 am
Chessqueen wrote:
Sat Sep 05, 2020 3:21 am
Dann Corbit wrote:
Wed Sep 02, 2020 11:55 pm
Stockfish nnue has a secret weapon. The Kamehameha blast. Of course, he has to go to level 5 before he can use it. You don't just go Kamehameha blasting stuff willy-nilly.
At the very end it will be LCZero Vs Stockfish NNUE, but I predict a very close encounter of the 3rd kind, LCZero from Planet 1140b Vs StockFish NNUE from Planet Earth, Now I am more convinced than ever :roll:
https://tcec-chess.com/live.html
I agree. I just played 200 games of Stockfish 12 vs. Lc0 0.26.2 at 3m+2s, and Stockfish 12 won by only 24 Elo. In past testing we can also see how badly Stockfish NNUE has scaled at longer time controls.

Both are the best chess engines, and the winner may only be decided by hardware and time controls.

The sprinter Stockfish 12 vs. the marathon runner Lc0: who wins may depend on the distance of the race!


Lc0 is clearly improving faster than Stockfish at this point in time, even at 3m+2s, compared with past matches at the same time control.

Result:
--------------------------------------------------------------------------
  #  name          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12    200      16     182       2   107.0   100.0    24.4
  2. Lc0 v0.26.2     200       2     182      16    93.0     0.0   -24.4

Cross table:
--------------------------------------------------------------------------
  #  name             score   games                                                                                                                                                                                                        1                                                                                                                                                                                                        2
  1. Stockfish 12     107.0     200                                                                                                                                                                                                        x =====1==1===1====================1======1========11========================1==========================1============================================1====1===1=========1==0================1===1=1==0====
  2. Lc0 v0.26.2       93.0     200 =====0==0===0====================0======0========00========================0==========================0============================================0====0===0=========0==1================0===0=0==1====                                                                                                                                                                                                        x

Tech:
--------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name            nodes/m         NPS  depth/m   time/m    moves     time
  1. Stockfish 12    125173K    26565996     42.5      4.7     54.1    255.1
  2. Lc0 v0.26.2        101K       20342     10.0      4.9     54.1    267.2
     all ---          61216K    12984844     26.3      4.8     54.1    261.2
For years people have come up with the BS theory that A/B engines tuned in micro-bullet would be weak in LTC, and for years they have been bluntly proven wrong. The impact of eval on horizon effects is minimal, and it doesn't change whether you search to depth 20 or depth 100. SF-NN search is SF and SF is proven to scale better than Lc0 (and as a matter of fact any MCTS engine) in LTC. Ergo SF-NN scales better than Lc0 in LTC.
Your claims are simply BS reflecting your cluelessness in the matter. You effectively draw conclusions from STC (just because it's not micro-bullet but blitz instead) with a sample size that is a joke.
The result in the superfinal will be a much worse sweep than last year. And then people like you will be astonished and will come up with all kinds of ridiculous excuses to justify what is basically their cluelessness.
The only one who is clueless here is you. I test at longer time controls as well as short ones, with anything from 1 core up to 32 threads.

And I am not talking about A/B engine only testing at micro-bullet. And I never have. I am talking about NNUE! And my sample size is huge. This is not my only test. I test non-stop.

My conclusion is what the data is showing us, and if it changes, everyone will see that too. I test openly, and on video.


"SF-NN search is SF and SF is proven to scale better than Lc0" :lol:
Milos, Lc0 looks good so far, as expected.
Lc0 takes the lead after 33 games. :shock:

Jouni
Posts: 2227
Joined: Wed Mar 08, 2006 7:15 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Jouni » Mon Oct 05, 2020 4:26 pm

Funny ending, where SF sees that it is losing long before Lc0 sees that it is winning! In blitz, should SF switch to classic entirely?
Jouni

Cornfed
Posts: 99
Joined: Sun Apr 26, 2020 9:40 pm
Full name: Brian D. Smith

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Cornfed » Mon Oct 05, 2020 9:47 pm

mwyoung wrote:
Mon Oct 05, 2020 3:56 pm


Lc0 takes the lead after 33 games. :shock:
And as quick as the blink of an eye, order is restored... see game 34. 8-)

mwyoung
Posts: 2433
Joined: Wed May 12, 2010 8:00 pm

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by mwyoung » Mon Oct 05, 2020 11:36 pm

Cornfed wrote:
Mon Oct 05, 2020 9:47 pm
mwyoung wrote:
Mon Oct 05, 2020 3:56 pm


Lc0 takes the lead after 33 games. :shock:
And as quick as the blink of an eye, order is restored... see game 34. 8-)
I know. But that is not the point. Milos said SF would crush Lc0, worse than last year. My testing said no.

Stockfish and Lc0, despite the hype, are very close in strength at LTC.

And the reason for this is that TCEC uses biased openings. If a match shows a win for both engines in the same line, it is a busted opening. But we already know, in TCEC's own words, that wins are more important for views. So this is expected.
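
The "busted opening" test described above can be sketched in a few lines. This is only one reading of it: an opening is flagged when both engines won it from the same side in a reversed-colour pair. The game list, the opening labels and the function name are made up for illustration and are not TCEC data.

```python
from collections import defaultdict


def busted_openings(games):
    """Return openings that two different engines won from the same side.

    games: iterable of (opening, white_engine, black_engine, result),
    with result given as '1-0', '1/2-1/2' or '0-1'.
    """
    winners_by_side = defaultdict(set)
    for opening, white, black, result in games:
        if result == "1-0":
            winners_by_side[(opening, "white")].add(white)
        elif result == "0-1":
            winners_by_side[(opening, "black")].add(black)
    return sorted({opening
                   for (opening, _side), engines in winners_by_side.items()
                   if len(engines) >= 2})


# Hypothetical reversed-colour pairs, invented for this example.
sample_games = [
    ("Line A", "Stockfish", "Lc0", "1-0"),
    ("Line A", "Lc0", "Stockfish", "1-0"),
    ("Line B", "Stockfish", "Lc0", "1-0"),
    ("Line B", "Lc0", "Stockfish", "1/2-1/2"),
]
print(busted_openings(sample_games))
```

Here "Line A" is flagged because White won both games regardless of the engine, while "Line B" is not, since only one engine converted it.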

Nay Lin Tun
Posts: 633
Joined: Mon Jan 16, 2012 5:34 am

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by Nay Lin Tun » Tue Oct 06, 2020 1:47 am

mwyoung wrote:
Mon Oct 05, 2020 11:36 pm
Cornfed wrote:
Mon Oct 05, 2020 9:47 pm
mwyoung wrote:
Mon Oct 05, 2020 3:56 pm


Lc0 takes the lead after 33 games. :shock:
And as quick as the blink of an eye, order is restored... see game 34. 8-)
I know. But that is not the point. Milos said SF would crush Lc0, worse than last year. My testing said no.

Stockfish and Lc0, despite the hype, are very close in strength at LTC.

And the reason for this is that TCEC uses biased openings. If a match shows a win for both engines in the same line, it is a busted opening. But we already know, in TCEC's own words, that wins are more important for views. So this is expected.
It has become pretty normal these days.

The public can easily follow Stockfish's progress through fishtest and the regression tests.

Meanwhile, Leela's progress is available only on the Leela Discord, and the results from various individual testers are hard to interpret.

"If you believe SF gained +100 Elo recently, you have to believe Leela gained +100 Elo recently."

In fact, neither is true.

corres
Posts: 3482
Joined: Wed Nov 18, 2015 10:41 am
Location: hungary

Re: Are we sure that Stockfish NNUE is better than the Normal Stockfish ?

Post by corres » Tue Oct 06, 2020 9:06 am

Madeleine Birchfield wrote:
Wed Sep 30, 2020 1:40 pm
...
Chess is drawish at the highest level with very long time controls. See correspondence chess.
Chess in fast time controls is not drawish at all, filled with many wins and losses. So is chess with weak players, or chess with a huge strength differential between the players.
I totally agree with you.
