Why do they need to answer? Both have massive hardware. Will it change any results?
If you look, even on our much smaller hardware the ranking of the engines does not really change.
And this is also what I am seeing in my testing.
"The worst thing that can happen to a forum is a running wild attacking moderator(HGM) who is not corrected by the community." - Ed Schröder
But my words like silent raindrops fell. And echoed in the wells of silence.
Jouni wrote: ↑Tue Dec 29, 2020 9:30 am
OS: CentOS 8
That's probably not the best choice, given that the EOL date for CentOS 8 has just been moved up from 2029 to 2021, after which CentOS will become a rolling-release testbed. CentOS 7 is still supported until 2024, provided that IBM/Red Hat doesn't axe that, too.
As with the TCEC rig, the GPUs are very nice, but unfortunately the slow CPUs are a fairly severe handicap for GPU engines like Lc0, which do best with a small number of very fast CPUs. Of course, one could say that it is up to Lc0 to improve its CPU code to better utilize more CPUs, but this is proving to be extremely difficult. Apparently, it is a complex topic in that more CPUs can increase the nps, yet the playing strength goes down after N+1 CPUs where N is the number of GPUs. It has to do with assembling full batches of work for the GPUs. The experts on the Leela Discord can explain it better. Thus a few 5GHz CPUs for Lc0 are far better than 64+ slower ones. In any case, even if there were no significant hardware handicap, Leela is still currently far behind SF-NNUE.
brianr wrote: ↑Tue Dec 29, 2020 12:14 pm
As with the TCEC rig, the GPUs are very nice, but unfortunately the slow CPUs are a fairly severe handicap for GPU engines like Lc0, which do best with a small number of very fast CPUs. Of course, one could say that it is up to Lc0 to improve its CPU code to better utilize more CPUs, but this is proving to be extremely difficult. Apparently, it is a complex topic in that more CPUs can increase the nps, yet the playing strength goes down after N+1 CPUs where N is the number of GPUs. It has to do with assembling full batches of work for the GPUs. The experts on the Leela Discord can explain it better. Thus a few 5GHz CPUs for Lc0 are far better than 64+ slower ones. In any case, even if there were no significant hardware handicap, Leela is still currently far behind SF-NNUE.
Hence, going by clock speed alone, the CCC cores run at 2.6/2.2 * 100 = 118% of the speed of the TCEC cores, or 18% faster.
On the other hand, TCEC uses 4x V100 GPUs whereas CCC is using 2x A100 GPUs.
I do not know enough about the difference between the A100 and V100 to know which system has the advantage there.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
The 7H12 has far higher IPC than the TCEC Xeon.
It's not just 18% faster, it's faster than that (with the exception of PEXT, because that instruction is microcoded, lol).
Actually, the CPU difference is smaller, since the GPU server uses its own CPU cores, which are:
CPU: Intel(R) Xeon(R) Platinum 8163 CPU @ 2.50GHz, 32 vcores
So they run at 2.5/2.6 * 100 = 96% of the speed, or about 4% slower than the AMD cores.
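For anyone who wants to check the clock-ratio arithmetic quoted in this thread, here is a minimal Python sketch. The function name and the variable labels are my own and purely illustrative; the GHz figures are the ones given in the posts. It compares clock speed only, so it says nothing about IPC, which as noted above can make the real gap larger.

```python
def relative_clock_speed(clock_ghz: float, baseline_ghz: float) -> float:
    """Return clock_ghz as a percentage of baseline_ghz (clock ratio only)."""
    if baseline_ghz <= 0:
        raise ValueError("baseline clock must be positive")
    return clock_ghz / baseline_ghz * 100.0


# Clock figures quoted in the thread, in GHz. The labels are an assumption
# about which machine each number refers to.
ccc_amd_ghz = 2.6          # CCC AMD cores
tcec_xeon_ghz = 2.2        # TCEC Xeon cores
gpu_server_xeon_ghz = 2.5  # Xeon Platinum 8163 in the GPU server

# CCC cores relative to TCEC cores
print(f"{relative_clock_speed(ccc_amd_ghz, tcec_xeon_ghz):.0f}%")        # 118%

# GPU-server cores relative to the AMD cores
print(f"{relative_clock_speed(gpu_server_xeon_ghz, ccc_amd_ghz):.0f}%")  # 96%
```

Treat the output as a rough lower bound on the difference, not a measurement: two CPUs at the same clock can still differ a lot in nodes per second.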
My conclusion, then, is that the GPU systems are very nearly the same speed, within a few percent perhaps.
On the other hand, I think that the CPU version on CCC is probably stronger.