CCC has serious hardware update!

mwyoung · Post by **mwyoung** » Wed Dec 30, 2020 4:50 am

mwyoung wrote: ↑Wed Dec 30, 2020 3:25 am
AndrewGrant wrote: ↑Tue Dec 29, 2020 11:24 pm
mwyoung wrote: ↑Tue Dec 29, 2020 10:59 pm And it seems you need even less then 16 cores to play high quality chess with NNUE. Current standings with a 10x NPS advantage and a 2 core vs. 32 thread advantage. The Elo difference right now is only 35 ELO!! after 30 games. And this would be about the best case. Since Stockfish is only splitting the search into 2 threads.
Garbage results, which lead to poor understanding of how engines scale. Misinformation at best, deceit at worst.
Unless you are saying that you are giving the 2 cores 10x the thinking time? Which if you are, ignore my entire post.

Typically speaking, a core doubling from 1 to 2 starts at about +70 elo for an engine like Stockfish, using a balanced opening book. As you continue the doubling, that begins to drop. I am not sure exactly of the rate, but it appears logarithmic. However, there was a test with 192 cores vs 384 threads. Note that this is not a typical core doubling, but making use of the hyperthreads on a core. IE, it is not as good as doubling the number of cores. This resulted in about +20 elo after a few thousand games. This test is on fishtest somewhere, and is public record.

So you start your initial core doubling at +70 per doubling, and by the end a doubling is worth at least 20 elo. log2(256) = 8. Hyperthreads are being used. So it can be derived from this knowledge that 1 core SF vs 256 thread SF is roughly 7 * ((70 + 20) / 2) + 20, which is 300+ elo.

Will you see this over the board? Probably not. Elo differences at those extremes are poorly defined by the elo curve, which is well adjusted for similarly skilled opponents, but fails at the extremes due to the nature of the games it is employed in. The issue in this case being that SF is already so strong that, if there were a skill cap, SF is closer to it than any other entity. The result is more draws, which dampen your ability to exploit hundreds of elo advantages, and compresses the elo curve.
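Spelled out, the arithmetic behind the quoted estimate looks roughly like this. It is only a sketch of the back-of-the-envelope calculation: the variable names are illustrative, and the +70 and +20 figures are the ones given in the quote, not new measurements.

```python
# Figures taken from the quoted post; names are illustrative only.
first_doubling_elo = 70   # quoted gain for the 1 -> 2 core doubling
last_doubling_elo = 20    # quoted gain for 192 cores -> 384 threads
doublings = 8             # log2(256)

# Seven doublings valued at the average of the two endpoints,
# plus one final hyperthread doubling valued at the low end.
average_gain = (first_doubling_elo + last_doubling_elo) / 2
estimate = (doublings - 1) * average_gain + last_doubling_elo

print(estimate)  # 335.0
```

That is where the "300+ elo" figure comes from; it assumes the per-doubling gain falls off evenly between the two endpoints.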
Your post is B.S. The data is here for all to see, and not just from me but from other testers. You just refuse to accept what is obvious. With NNUE you can ignore your scaling formula, because it does not hold water.

You can trash me, CCRL, and others because you do not like the facts, but the facts about how NNUE scales are not going away.

Let's look not at my data but at CCRL blitz data, and see how NNUE behaves compared to a typical A/B engine like Ethereal.

1. Ethereal 12.75 64-bit 8CPU 3537 +20 −20 39.9% +61.4 57.2% 683
2. Ethereal 12.75 64-bit 3377 +22 −22 64.8% −86.2 56.5% 600

CCRL rating difference for 1 to 8 cores. 160 Elo

1. Stockfish 12 64-bit 8CPU 3692 +15 −14 71.5% −147.4 54.5% 1532
2. Stockfish 12 64-bit 3639 +15 −14 79.0% −226.6 36.1% 2054

CCRL rating difference from 1 to 8 cores. 53 Elo
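To compare these against a per-doubling figure, the two rating gaps can be divided by the number of core doublings. This is a minimal sketch using only the four CCRL ratings listed above; the function name is hypothetical, and it ignores the listed error margins (about ±20 and ±15) and any effect of the opponent pool.

```python
import math

def elo_per_doubling(rating_many_cores, rating_one_core, cores):
    """Average Elo gained per core doubling between 1 core and `cores` cores."""
    doublings = math.log2(cores)  # 8 cores -> 3 doublings
    return (rating_many_cores - rating_one_core) / doublings

# Ratings copied from the CCRL blitz lines above.
ethereal = elo_per_doubling(3537, 3377, 8)
stockfish = elo_per_doubling(3692, 3639, 8)

print(round(ethereal, 1), round(stockfish, 1))  # 53.3 17.7
```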

"So you start your initial core doubling at +70 per doubling"

Chess Match Stockfish 280920 (2 cores, No TB) vs Dragon(32 Threads) (TC=3m+2s)

DESKTOP-CORSAIR, Blitz 3.0min+2.0sec  0

                                         
1   Dragon by Komodo Chess 64-bit   +10  +5/=64/-3 51.39%   37.0/72
2   Stockfish 280920                -10  +3/=64/-5 48.61%   35.0/72

Milos · Post by **Milos** » Wed Dec 30, 2020 6:33 am

AndrewGrant wrote: ↑Tue Dec 29, 2020 11:24 pm
mwyoung wrote: ↑Tue Dec 29, 2020 10:59 pm And it seems you need even less then 16 cores to play high quality chess with NNUE. Current standings with a 10x NPS advantage and a 2 core vs. 32 thread advantage. The Elo difference right now is only 35 ELO!! after 30 games. And this would be about the best case. Since Stockfish is only splitting the search into 2 threads.
Garbage results, which lead to poor understanding of how engines scale. Misinformation at best, deceit at worst.
Unless you are saying that you are giving the 2 cores 10x the thinking time? Which if you are, ignore my entire post.

Typically speaking, a core doubling from 1 to 2 starts at about +70 elo for an engine like Stockfish, using a balanced opening book. As you continue the doubling, that begins to drop. I am not sure exactly of the rate, but it appears logarithmic. However, there was a test with 192 cores vs 384 threads. Note that this is not a typical core doubling, but making use of the hyperthreads on a core. IE, it is not as good as doubling the number of cores. This resulted in about +20 elo after a few thousand games. This test is on fishtest somewhere, and is public record.

So you start your initial core doubling at +70 per doubling, and by the end a doubling is worth at least 20 elo. log2(256) = 8. Hyperthreads are being used. So it can be derived from this knowledge that 1 core SF vs 256 thread SF is roughly 7 * ((70 + 20) / 2) + 20, which is 300+ elo.

Will you see this over the board? Probably not. Elo differences at those extremes are poorly defined by the elo curve, which is well adjusted for similarly skilled opponents, but fails at the extremes due to the nature of the games it is employed in. The issue in this case being that SF is already so strong that, if there were a skill cap, SF is closer to it than any other entity. The result is more draws, which dampen your ability to exploit hundreds of elo advantages, and compresses the elo curve.

That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...

mwyoung · Post by **mwyoung** » Wed Dec 30, 2020 7:21 am

Milos wrote: ↑Wed Dec 30, 2020 6:33 am
AndrewGrant wrote: ↑Tue Dec 29, 2020 11:24 pm
mwyoung wrote: ↑Tue Dec 29, 2020 10:59 pm And it seems you need even less then 16 cores to play high quality chess with NNUE. Current standings with a 10x NPS advantage and a 2 core vs. 32 thread advantage. The Elo difference right now is only 35 ELO!! after 30 games. And this would be about the best case. Since Stockfish is only splitting the search into 2 threads.
Garbage results, which lead to poor understanding of how engines scale. Misinformation at best, deceit at worst.
Unless you are saying that you are giving the 2 cores 10x the thinking time? Which if you are, ignore my entire post.

Typically speaking, a core doubling from 1 to 2 starts at about +70 elo for an engine like Stockfish, using a balanced opening book. As you continue the doubling, that begins to drop. I am not sure exactly of the rate, but it appears logarithmic. However, there was a test with 192 cores vs 384 threads. Note that this is not a typical core doubling, but making use of the hyperthreads on a core. IE, it is not as good as doubling the number of cores. This resulted in about +20 elo after a few thousand games. This test is on fishtest somewhere, and is public record.

So you start your initial core doubling at +70 per doubling, and by the end a doubling is worth at least 20 elo. log2(256) = 8. Hyperthreads are being used. So it can be derived from this knowledge that 1 core SF vs 256 thread SF is roughly 7 * ((70 + 20) / 2) + 20, which is 300+ elo.

Will you see this over the board? Probably not. Elo differences at those extremes are poorly defined by the elo curve, which is well adjusted for similarly skilled opponents, but fails at the extremes due to the nature of the games it is employed in. The issue in this case being that SF is already so strong that, if there were a skill cap, SF is closer to it than any other entity. The result is more draws, which dampen your ability to exploit hundreds of elo advantages, and compresses the elo curve.
That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...

Not that this is any comfort to the A/B engines, as NNUE crushes them even when the A/B engines have a massive hardware advantage.

Chess Match Stockfish 251220 (2 Cores) vs Stockfish 11(16 Cores) (TC=3m+2s)

Current standings +4 =11 -0 +95 Elo!! for NNUE 2 cores vs A/B 16 cores.

Live Stream:

AndrewGrant · Post by **AndrewGrant** » Wed Dec 30, 2020 8:27 am

mwyoung wrote: ↑Wed Dec 30, 2020 3:25 am You can trash me, CCRL, and others. Because you do not like the facts. But the facts are not going away with how NNUE scales.

Lets look not at my data, but CCRL blitz data. And lets see how NNUE behaves to a typical A/B engine like Ethereal.

1. Ethereal 12.75 64-bit 8CPU 3537 +20 −20 39.9% +61.4 57.2% 683
2. Ethereal 12.75 64-bit 3377 +22 −22 64.8% −86.2 56.5% 600

CCRL rating difference for 1 to 8 cores. 160 Elo

1. Stockfish 12 64-bit 8CPU 3692 +15 −14 71.5% −147.4 54.5% 1532
2. Stockfish 12 64-bit 3639 +15 −14 79.0% −226.6 36.1% 2054

CCRL rating difference from 1 to 8 cores. 53 Elo

"So you start your initial core doubling at +70 per doubling"

There were many threads addressing the issues with CCRL's methods and how they compress Elo and distort values due to the opponent pool. If you have data that contradicts the typical scaling, then you should post it. You need thousands of games in your sample. Share the conditions and the opening book, and make sure you understand cores vs threads.
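To put the sample-size point in numbers, here is a rough sketch of how the 95% error margin on a measured Elo difference shrinks as the number of games grows. It assumes independent games and a normal approximation; the function names are hypothetical, and the win and draw rates are simply those of the 72-game table posted earlier in the thread (+5 =64 -3), used for illustration only.

```python
import math

def elo_from_score(score):
    """Logistic Elo difference implied by an expected score in (0, 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def elo_margin_95(win_rate, draw_rate, games):
    """Approximate 95% half-width, in Elo, for a match of `games` games."""
    loss_rate = 1.0 - win_rate - draw_rate
    score = win_rate + 0.5 * draw_rate
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (win_rate * (1.0 - score) ** 2
                + draw_rate * (0.5 - score) ** 2
                + loss_rate * (0.0 - score) ** 2)
    std_err = math.sqrt(variance / games)
    low = elo_from_score(score - 1.96 * std_err)
    high = elo_from_score(score + 1.96 * std_err)
    return (high - low) / 2.0

# Illustrative rates from the 72-game table: 5 wins, 64 draws, 3 losses.
win_rate, draw_rate = 5 / 72, 64 / 72
for games in (72, 1000, 5000):
    print(games, round(elo_margin_95(win_rate, draw_rate, games), 1))
```

Under these assumptions the margin at 72 games is several times larger than at a few thousand, which is why small samples cannot separate differences of a few tens of Elo.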

AndrewGrant · Post by **AndrewGrant** » Wed Dec 30, 2020 9:35 am

Milos wrote: ↑Wed Dec 30, 2020 6:33 am That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...

I disagree, and I'm running the games now to show it. I will post a few sets of 1 vs 2, 2 vs 4, 4 vs 8, and 8 vs 16 cores, and then 16 cores vs 32 threads, with 1,000-game samples for each. I will upload the PGNs, the opening book, and the testing conditions passed to cutechess. There is a misunderstanding on this forum, and it's fueled by people looking at small samples and other data sets without considering alternative explanations.

IanKennedy · Post by **IanKennedy** » Wed Dec 30, 2020 11:28 am

AndrewGrant wrote: ↑Wed Dec 30, 2020 9:35 am
Milos wrote: ↑Wed Dec 30, 2020 6:33 am That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...
I disagree, and I'm running the games now to show it. Will make a posting of a few sets of 1 vs 2, 2 vs 4, 4 vs 8, 8 vs 16, for cores, and then again 16 cores vs 32 threads. 1,000 game samples for each. Will upload the PGNs, the opening book, and the testing conditions passed to cutechess. There is a misunderstanding on this forum and its fueled by people looking at small samples and other data sets without considering alternative explanations.

With what engines/versions?

AndrewGrant · Post by **AndrewGrant** » Wed Dec 30, 2020 11:38 am

IanKennedy wrote: ↑Wed Dec 30, 2020 11:28 am
AndrewGrant wrote: ↑Wed Dec 30, 2020 9:35 am
Milos wrote: ↑Wed Dec 30, 2020 6:33 am That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...
I disagree, and I'm running the games now to show it. Will make a posting of a few sets of 1 vs 2, 2 vs 4, 4 vs 8, 8 vs 16, for cores, and then again 16 cores vs 32 threads. 1,000 game samples for each. Will upload the PGNs, the opening book, and the testing conditions passed to cutechess. There is a misunderstanding on this forum and its fueled by people looking at small samples and other data sets without considering alternative explanations.
With what engines/versions?

I'll do it with Stockfish and Ethereal NNUE. A taste, which is entirely as expected:
Ethereal-NNUE-2C vs Ethereal-NNUE-1C: 314 - 73 - 613 [0.621] 1000
Ethereal-NNUE-4C vs Ethereal-NNUE-2C: 290 - 74 - 636 [0.608] 1000
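For readers who want these lines in Elo terms, the bracketed score can be converted with the usual logistic formula. A minimal sketch, assuming the three numbers are wins, losses, and draws in that order (which matches the bracketed scores); the function name is hypothetical.

```python
import math

def score_and_elo(wins, losses, draws):
    """Return (score fraction, implied Elo difference) for a W - L - D result."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    return score, elo

# Results copied from the two lines above.
results = {
    "2C vs 1C": (314, 73, 613),
    "4C vs 2C": (290, 74, 636),
}
for label, (wins, losses, draws) in results.items():
    score, elo = score_and_elo(wins, losses, draws)
    print(f"{label}: score {score:.4f}, Elo {elo:+.1f}")
```

This works out to roughly +85 Elo for 1 to 2 cores and roughly +76 Elo for 2 to 4 cores, before any error margin is applied.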

RubiChess · Post by **RubiChess** » Wed Dec 30, 2020 12:12 pm

AndrewGrant wrote: ↑Wed Dec 30, 2020 11:38 am
IanKennedy wrote: ↑Wed Dec 30, 2020 11:28 am
AndrewGrant wrote: ↑Wed Dec 30, 2020 9:35 am
Milos wrote: ↑Wed Dec 30, 2020 6:33 am That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...
I disagree, and I'm running the games now to show it. Will make a posting of a few sets of 1 vs 2, 2 vs 4, 4 vs 8, 8 vs 16, for cores, and then again 16 cores vs 32 threads. 1,000 game samples for each. Will upload the PGNs, the opening book, and the testing conditions passed to cutechess. There is a misunderstanding on this forum and its fueled by people looking at small samples and other data sets without considering alternative explanations.
With what engines/versions?
Ill do it with Stockfish and Ethereal NNUE. A taste, which is entirely as expected:
Ethereal-NNUE-2C vs Ethereal-NNUE-1C: 314 - 73 - 613 [0.621] 1000
Ethereal-NNUE-4C vs Ethereal-NNUE-2C: 290 - 74 - 636 [0.608] 1000

mwyoung will come back with the "use a serious time control" argument. And I will search for some popcorn...

mwyoung · Post by **mwyoung** » Wed Dec 30, 2020 12:27 pm

RubiChess wrote: ↑Wed Dec 30, 2020 12:12 pm
AndrewGrant wrote: ↑Wed Dec 30, 2020 11:38 am
IanKennedy wrote: ↑Wed Dec 30, 2020 11:28 am
AndrewGrant wrote: ↑Wed Dec 30, 2020 9:35 am
Milos wrote: ↑Wed Dec 30, 2020 6:33 am That's all true before NNUE. The endless threads about why LazySMP works on SF, widening trees, etc., etc.
Not the case any more. LazySMP scales abysmally with NNUE. Don't take my word for it, try it for yourself...
I disagree, and I'm running the games now to show it. Will make a posting of a few sets of 1 vs 2, 2 vs 4, 4 vs 8, 8 vs 16, for cores, and then again 16 cores vs 32 threads. 1,000 game samples for each. Will upload the PGNs, the opening book, and the testing conditions passed to cutechess. There is a misunderstanding on this forum and its fueled by people looking at small samples and other data sets without considering alternative explanations.
With what engines/versions?
Ill do it with Stockfish and Ethereal NNUE. A taste, which is entirely as expected:
Ethereal-NNUE-2C vs Ethereal-NNUE-1C: 314 - 73 - 613 [0.621] 1000
Ethereal-NNUE-4C vs Ethereal-NNUE-2C: 290 - 74 - 636 [0.608] 1000
mwyoung will come back with the "use a serious time control" argument. And I will search for some popcorn...

CCRL data only compressed NNUE engines. 53 Elo gain with 3 doublings of CPU cores.

And it is not a small sample size of games. Both my testing and CCRL testing between SF NNUE, and Dragon NNUE have thousands of games. All showing the scaling issue with NNUE.

And micro bullet is not even close to our testing. Conditions matter.

AndrewGrant · Post by **AndrewGrant** » Wed Dec 30, 2020 12:33 pm

RubiChess wrote: ↑Wed Dec 30, 2020 12:12 pm mwyoung will come back with the "use a serious time control" argument. And I will search for some popcorn...

Indeed. But, I warn them of doing that, for this has greater implications, which they do not yet understand ....
