Stockfish 12 dominating CCRL

mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 12 dominating CCRL

Post by mwyoung »

Raphexon wrote: Mon Sep 07, 2020 10:45 pm
mwyoung wrote: Mon Sep 07, 2020 9:48 pm
abgursu wrote: Mon Sep 07, 2020 9:29 pm
mwyoung wrote: Mon Sep 07, 2020 8:51 pm
Laskos wrote: Mon Sep 07, 2020 8:48 pm
mwyoung wrote: Mon Sep 07, 2020 8:45 pm
Uri Blass wrote: Mon Sep 07, 2020 8:40 pm
mwyoung wrote: Mon Sep 07, 2020 8:19 pm
RogerC wrote: Mon Sep 07, 2020 8:12 pm
Modern Times wrote: Mon Sep 07, 2020 7:09 pm
Laskos wrote: Mon Sep 07, 2020 11:29 am Thanks, so it seems RTX2060 is a good match for 1 fast thread. Still, it's close, RTX2080 is not even 2 times faster. CCRL seems to show 50 Elo advantage of SF 12 over Lc0 1541 2080, which I guess are either $$$ or an artifact of the rating list with many much weaker opponents.
SF12 has not actually played any of the RTX2080 engines head-to-head (different tester has the RTX) so we need to take care with that initial rating. I think things could change a little once those games happen.
As I said, Stockfish 12 is dominating CCRL, and CEGT too!
http://www.cegt.net/40_40%20Rating%20Li ... liste.html
Here are the results of the matches between SF12 and opponents: SF12 (1 CPU) vs LCZero 0.25.1 Cuda (LS15.0): 100 games, 11 wins, 86 draws, 3 losses.
http://www.cegt.net/40_40%20Rating%20Li ... ons/1.html

SF12 on 8 cores will crush LC0... LC0 is the past and is finished, because it needs more and more graphics power (2000+ cores) to compute and evaluate half a dozen kn/s. SF, on the other hand, does an intelligent mix of classical and NNUE evaluation on just a single CPU core. And finally, NNUE nets are only at the very beginning of their training (2 months old). See you in 12 months!
Stockfish 12 is the best engine, but it still scales badly. It is the best engine at most TCs, but crushing? Against Lc0, still not so....

Be careful!

This is just the latest test.

Live Stockfish 12 vs Lc0 26.2 (NN-J92-115), (32 Threads), (TC=15m+15s), (40 Games)

Hardware 2950x, RTX 2080 Ti

Stockfish 12 Default NN
Lc0 26.2 (NN-J92-115)
Ponder off.
TC=15m+15s
32 threads.
4 GB hash.
6-man TB, and the top ten 7-man TB.
Opening book 6 moves.
Default settings.

Result:
--------------------------------------------------------------------------
  #  name          games    wins   draws  losses   score    los%  elo+/-
  1. Stockfish 12     40       2      38       0    21.0    92.1    17.4
  2. Lc0 v0.26.2      40       0      38       2    19.0     7.9   -17.4

Cross table:
--------------------------------------------------------------------------
  #  name             score   games                                        1                                        2
  1. Stockfish 12      21.0      40                                        x =========1==========1===================
  2. Lc0 v0.26.2       19.0      40 =========0==========0===================                                        x

Tech:
--------------------------------------------------------------------------

Tech (average nodes, depths, time/m per move, others per game), counted for computing moves only, ignored moves with zero nodes:
  #  name            nodes/m         NPS  depth/m   time/m    moves     time   #fails
  1. Stockfish 12    715650K    25949227     59.1     27.6     58.8   1622.3         
  2. Lc0 v0.26.2        847K       29221     11.6     29.0     58.8   1704.3        1
     all ---         349926K    12669925     35.3     28.3     58.8   1663.3        1
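For anyone who wants to check the result table, the Elo and LOS columns can be reproduced from the win/draw/loss counts alone. This is only a sketch: it assumes the usual logistic Elo formula and the common erf-based LOS approximation that ignores draws. The post does not say which formulas the testing tool uses, and the function names are made up for illustration.

```python
import math

def elo_diff(wins, draws, losses):
    """Elo difference implied by a match score (logistic model, assumed)."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games
    return 400.0 * math.log10(score / (1.0 - score))

def los(wins, losses):
    """Likelihood of superiority; draws carry no information in this approximation."""
    return 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))

# Stockfish 12 row of the table: 2 wins, 38 draws, 0 losses.
sf_elo = elo_diff(2, 38, 0)
sf_los = los(2, 0)
print(f"Elo: {sf_elo:+.1f}, LOS: {100 * sf_los:.1f}%")  # Elo: +17.4, LOS: 92.1%
```

Both numbers agree with the table, which also shows how little a 40-game sample with 38 draws can say: the whole margin rests on two decisive games.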
The question is whether that is Stockfish's fault.

Maybe it is simply impossible to win at long time control against top engines.
I do not see Stockfish losing in your results.

To me, "scales badly" means something like the following:
Stockfish gets 50% at 5 seconds per move against Lc0 at 10 seconds per move, but gets less than 50% at 50 seconds per move against Lc0 at 100 seconds per move.

Do you have some evidence for this (maybe with different time controls, but the idea is clear)?
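The definition of "scales badly" given here can be written down as a small check. This is a sketch only: the function names are invented, the Elo conversion is the usual logistic formula, and the scores in the example are hypothetical, not measurements from this thread.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a match score between 0 and 1 (exclusive)."""
    return 400.0 * math.log10(score / (1.0 - score))

def scales_badly(score_short_tc, score_long_tc):
    """True if the engine does worse at the longer time control,
    with the time odds between the two engines kept at the same ratio."""
    return elo_from_score(score_long_tc) < elo_from_score(score_short_tc)

# Hypothetical scores: 50% at 5s vs 10s per move, 47% at 50s vs 100s per move.
print(scales_badly(0.50, 0.47))  # True
```

A real test would also need enough games at both time controls for the difference to exceed the error margins.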
This is possible, and if true it means the death of chess engine development. Because if chess is dead at 15m+15s on a 32-thread system, logically you know what that means in a very short time...
Luckily TCEC and other LTC tournaments don't use your openings.
Yes, but most testing sites do. Moron!

And if chess is dead at 6 moves, with the chess engines playing most of the opening, we have a much bigger problem!

And it is not my opening book, but the best testing book I have found: Sedatchess Perfect book 2019. Yell at him for the openings.

https://sites.google.com/site/computers ... t2019books
What about Drawkillers?
Yes, but Drawkillers are a.k.a. inferior openings. Remember 1 win for white, and 1 win for white with the same opening with 2 different engines with equal strength is the same as 1/2 and 1/2. But I guess some, like TCEC, have a motivation for wins. I do not.... I will not force an engine to play an inferior opening for wins, for the sake of wins....

Remember the book is only to 6 moves....and the engine plays the rest of the opening.
Have you ever seen the drawkiller opening set?

They're not guaranteed wins; they're extremely complex positions that can lead to both black and white wins.
When I did my 5000-game test, white barely had an advantage over black.

The moves to get into that position are extremely dubious, like extremely, and barely resemble chess.
But the end position is balanced.

I actually did a 5000 game match between SF12 and SF11 at 60+0.6
Results were +2862 -356 =1782 for SF12
Elo 191.4 +/- 8.0

White vs black score was +1711 -1507 =1782 (52% score for white)
So the drawkiller opening set is actually more balanced than regular opening positions.
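The figures in this post can be checked from the raw counts. The sketch below assumes the logistic Elo formula and a normal-approximation 95% margin computed from the per-game score variance; the post does not name the tool behind the "+/- 8.0", so the margin method and the function names are assumptions.

```python
import math

def elo_from_score(score):
    """Elo difference implied by a match score (logistic model, assumed)."""
    return 400.0 * math.log10(score / (1.0 - score))

def elo_with_margin(wins, draws, losses, z=1.96):
    """Return (Elo, half-width of an approximate 95% interval)."""
    n = wins + draws + losses
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (1, 0.5 or 0) around the mean score.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    se = math.sqrt(var / n)
    low = elo_from_score(score - z * se)
    high = elo_from_score(score + z * se)
    return elo_from_score(score), (high - low) / 2.0

# SF12 vs SF11: +2862 -356 =1782
elo, margin = elo_with_margin(2862, 1782, 356)
# White vs black: +1711 -1507 =1782
white_score = (1711 + 0.5 * 1782) / (1711 + 1507 + 1782)
print(f"Elo {elo:.1f} +/- {margin:.1f}, white score {100 * white_score:.0f}%")
```

Under these assumptions the Elo comes out at 191.4 with a margin of roughly 8, and the white score at about 52%, in line with the numbers quoted.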


But your assumption about the Drawkiller set is wrong. They don't lead to forced wins.
You are assuming a lot here. I will just quote you!

"The moves to get into that position are extremely dubious"

I agree!
Last edited by mwyoung on Mon Sep 07, 2020 10:53 pm, edited 1 time in total.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Stockfish 12 dominating CCRL

Post by Raphexon »

mwyoung wrote: Mon Sep 07, 2020 10:49 pm
You are assuming a lot here. I will just quote you!

"The moves to get into that position are extremely dubious"
I'm not assuming:
"Remember 1 win for white, and 1 win for white with the same opening with 2 different engines with equal strength is the same as 1/2 and 1/2."

This is what you said.
That assumption is wrong because Drawkiller openings don't play out like that.
Raphexon
Posts: 476
Joined: Sun Mar 17, 2019 12:00 pm
Full name: Henk Drost

Re: Stockfish 12 dominating CCRL

Post by Raphexon »

Also you don't have to do a nuclear option like Stockfish, but the current book you're using is extremely drawish.

Far too drawish to be (imo) enjoyable to watch.
mwyoung
Posts: 2727
Joined: Wed May 12, 2010 10:00 pm

Re: Stockfish 12 dominating CCRL

Post by mwyoung »

Raphexon wrote: Mon Sep 07, 2020 10:53 pm
I'm not assuming:
"Remember 1 win for white, and 1 win for white with the same opening with 2 different engines with equal strength is the same as 1/2 and 1/2."

This is what you said.
That assumption is wrong because Drawkiller openings don't play out like that.
Not usually. It depends on the strength of the 2 chess engines. If I play Sargon vs Stockfish 12, or Stockfish 12 vs any much weaker engine, Stockfish 12 will win all the games, but that has nothing to do with the openings. Just an example.

The objective is to give fair openings....

But why should I play known inferior openings just to force a win, like TCEC? I will not knowingly do this in my testing.