Discussion of anything and everything relating to chess playing software and machines.
Moderators: hgm, Dann Corbit, Harvey Williamson
-
Jouni
- Posts: 2328
- Joined: Wed Mar 08, 2006 7:15 pm
Post
by Jouni » Tue Dec 29, 2020 8:30 am
CPUs: 2 x AMD EPYC 7H12
GPU: 2x A100 (40 GB GPU memory)
Cores: 256 cores (128 physical)
RAM: 512GB DIMM DDR4 2933 MHz (0.3 ns)
SSD: 2x Micron 5210 MTFD (2TB) in RAID1
OS: CentOS 8
What's TCEC's answer?
Jouni
-
mwyoung
- Posts: 2725
- Joined: Wed May 12, 2010 8:00 pm
Post
by mwyoung » Tue Dec 29, 2020 9:06 am
Jouni wrote: ↑Tue Dec 29, 2020 8:30 am
CPUs: 2 x AMD EPYC 7H12
GPU: 2x A100 (40 GB GPU memory)
Cores: 256 cores (128 physical)
RAM: 512GB DIMM DDR4 2933 MHz (0.3 ns)
SSD: 2x Micron 5210 MTFD (2TB) in RAID1
OS: CentOS 8
What's TCEC's answer?
Why do they need to answer? Both have massive hardware. Will it change any results?
If you notice, even on our much smaller hardware the ranking of the engines does not really change.
And this is also what I am seeing in my testing.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
-
Ras
- Posts: 1686
- Joined: Tue Aug 30, 2016 6:19 pm
- Full name: Rasmus Althoff
Post
by Ras » Tue Dec 29, 2020 9:38 am
Jouni wrote: ↑Tue Dec 29, 2020 8:30 am
OS: CentOS 8
That's probably not the best choice, given that the EOL date for CentOS 8 has just been moved up from 2029 to 2021, after which CentOS will become a rolling-release testbed. CentOS 7 is still supported until 2024, provided that IBM/Red Hat doesn't axe that, too.
-
brianr
- Posts: 496
- Joined: Thu Mar 09, 2006 2:01 pm
Post
by brianr » Tue Dec 29, 2020 11:14 am
As with the TCEC rig, the GPUs are very nice, but unfortunately the slow CPUs are a fairly severe handicap for GPU engines like Lc0, which do best with a small number of very fast CPUs. Of course, one could say that it is up to Lc0 to improve its CPU code to better utilize more CPUs, but this is proving to be extremely difficult. Apparently, it is a complex topic in that more CPUs can increase the nps, yet the playing strength goes down after N+1 CPUs where N is the number of GPUs. It has to do with assembling full batches of work for the GPUs. The experts on the Leela Discord can explain it better. Thus a few 5GHz CPUs for Lc0 are far better than 64+ slower ones. In any case, even if there were no significant hardware handicap, Leela is still currently far behind SF-NNUE.
-
Dann Corbit
- Posts: 11859
- Joined: Wed Mar 08, 2006 7:57 pm
- Location: Redmond, WA USA
Post
by Dann Corbit » Tue Dec 29, 2020 11:47 am
brianr wrote: ↑Tue Dec 29, 2020 11:14 am
As with the TCEC rig, the GPUs are very nice, but unfortunately the slow CPUs are a fairly severe handicap for GPU engines like Lc0, which do best with a small number of very fast CPUs. Of course, one could say that it is up to Lc0 to improve its CPU code to better utilize more CPUs, but this is proving to be extremely difficult. Apparently, it is a complex topic in that more CPUs can increase the nps, yet the playing strength goes down after N+1 CPUs where N is the number of GPUs. It has to do with assembling full batches of work for the GPUs. The experts on the Leela Discord can explain it better. Thus a few 5GHz CPUs for Lc0 are far better than 64+ slower ones. In any case, even if there were no significant hardware handicap, Leela is still currently far behind SF-NNUE.
TCEC uses four of these CPUs:
https://ark.intel.com/content/www/us/en ... 0-ghz.html
which run at 2.2 GHz,
whereas the 7H12:
https://www.amd.com/en/products/cpu/amd-epyc-7h12
runs at 2.6 GHz.
Hence, going by clock speed alone, the CCC cores are 2.6/2.2 * 100 = 118% of the speed, or 18% faster.
On the other hand, TCEC uses
4x V100 GPUs whereas CCC is using
2x A100 GPUs.
I do not know enough about the difference between the A100 and V100 to know which system has the advantage there.
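The clock-speed comparison above can be written as a small sketch. It assumes clock frequency is the only thing that matters (it ignores IPC, core count, and boost behaviour), and the function name `relative_speed` is just an illustrative choice:

```python
def relative_speed(clock_a_ghz: float, clock_b_ghz: float) -> float:
    """Return clock A expressed as a percentage of clock B."""
    return clock_a_ghz / clock_b_ghz * 100


# 7H12 base clock (2.6 GHz) against the TCEC Xeon (2.2 GHz), as quoted above.
ccc_vs_tcec = relative_speed(2.6, 2.2)
print(f"{ccc_vs_tcec:.0f}% of the speed, {ccc_vs_tcec - 100:.0f}% faster")
```

This is only a naive clock ratio, so it should be read as a first guess rather than a measurement.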
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
Raphexon
- Posts: 351
- Joined: Sun Mar 17, 2019 11:00 am
- Full name: Henk Drost
Post
by Raphexon » Tue Dec 29, 2020 11:52 am
Dann Corbit wrote: ↑Tue Dec 29, 2020 11:47 am
brianr wrote: ↑Tue Dec 29, 2020 11:14 am
As with the TCEC rig, the GPUs are very nice, but unfortunately the slow CPUs are a fairly severe handicap for GPU engines like Lc0, which do best with a small number of very fast CPUs. Of course, one could say that it is up to Lc0 to improve its CPU code to better utilize more CPUs, but this is proving to be extremely difficult. Apparently, it is a complex topic in that more CPUs can increase the nps, yet the playing strength goes down after N+1 CPUs where N is the number of GPUs. It has to do with assembling full batches of work for the GPUs. The experts on the Leela Discord can explain it better. Thus a few 5GHz CPUs for Lc0 are far better than 64+ slower ones. In any case, even if there were no significant hardware handicap, Leela is still currently far behind SF-NNUE.
TCEC uses four of these CPUs:
https://ark.intel.com/content/www/us/en ... 0-ghz.html
which run at 2.2 GHz,
whereas the 7H12:
https://www.amd.com/en/products/cpu/amd-epyc-7h12
runs at 2.6 GHz.
Hence, going by clock speed alone, the CCC cores are 2.6/2.2 * 100 = 118% of the speed, or 18% faster.
On the other hand, TCEC uses
4x V100 GPUs whereas CCC is using
2x A100 GPUs.
I do not know enough about the difference between the A100 and V100 to know which system has the advantage there.
The 7H12 has far higher IPC than the TCEC Xeon.
So it's not just 18% faster per core, it's more than that. (The exception is PEXT, which is microcoded on these cores, lol.)
A single A100 is roughly twice as fast as a V100.
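Taking that "roughly twice" figure at face value, the two GPU setups can be compared with a rough sketch. It assumes throughput scales linearly with the number of GPUs, which is an idealisation; the unit values and the name `total_throughput` are hypothetical:

```python
# Throughput in arbitrary units, with one V100 defined as 1.0.
V100_UNITS = 1.0
A100_UNITS = 2.0 * V100_UNITS  # assumption: "roughly twice as fast as a V100"


def total_throughput(gpu_count: int, per_gpu_units: float) -> float:
    """Idealised total throughput, assuming perfect linear scaling."""
    return gpu_count * per_gpu_units


tcec_total = total_throughput(4, V100_UNITS)  # TCEC: 4x V100
ccc_total = total_throughput(2, A100_UNITS)   # CCC: 2x A100
print(f"CCC / TCEC GPU throughput: {ccc_total / tcec_total:.2f}")
```

Under those assumptions the two rigs come out about even on the GPU side.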
-
Dann Corbit
- Posts: 11859
- Joined: Wed Mar 08, 2006 7:57 pm
- Location: Redmond, WA USA
Post
by Dann Corbit » Tue Dec 29, 2020 11:53 am
Actually, the CPU difference is smaller, since the GPU server uses its own CPU cores, which are:
CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz, 32 vcores
So they are:
2.5/2.6 * 100 = 96% of the speed, or about 4% slower than the AMD cores.
-
Dann Corbit
- Posts: 11859
- Joined: Wed Mar 08, 2006 7:57 pm
- Location: Redmond, WA USA
Post
by Dann Corbit » Tue Dec 29, 2020 11:54 am
My conclusion, then, is that the GPU systems are very nearly the same speed, perhaps within a few percent.
On the other hand, I think that the CPU version on CCC is probably stronger.
-
brianr
- Posts: 496
- Joined: Thu Mar 09, 2006 2:01 pm
Post
by brianr » Tue Dec 29, 2020 12:06 pm
The point is that on both CCC and TCEC the fast GPUs are being vastly underutilized because the CPUs run at such low clock speeds.
-
Jouni
- Posts: 2328
- Joined: Wed Mar 08, 2006 7:15 pm
Post
by Jouni » Tue Dec 29, 2020 12:15 pm
Ipman bench:
278.098.432 2x AMD EPYC 7742, 256 threads
190.384.961 4x Intel Xeon E5-4669 v4 @ 2.20GHz
+46% by my calculation!
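For anyone who wants to check the percentage, here is a minimal sketch using the two bench figures above. The variable names are mine, and the numbers are simply treated as scores where higher is better:

```python
# Bench figures as posted above, written without the thousands separators.
epyc_score = 278_098_432  # 2x AMD EPYC 7742, 256 threads
xeon_score = 190_384_961  # 4x Intel Xeon E5-4669 v4 @ 2.20GHz

# Relative gain of the EPYC system over the Xeon system, in percent.
gain_percent = (epyc_score / xeon_score - 1) * 100
print(f"+{gain_percent:.0f}%")
```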
Jouni