Grace Hopper Superchip

Dann Corbit · Post by **Dann Corbit** » Wed Apr 06, 2022 3:55 pm

https://www.tomshardware.com/news/nvidi ... -epyc-rome

Ras · Post by **Ras** » Thu Apr 07, 2022 7:57 am

So they are comparing their non-released chip targeted for next year with AMD's old Zen2 Rome chips from 2019. Next year's computer chips will be faster and more efficient than those from three years ago? Who would have thought!

They don't compare with Zen3 Milan from 2021 or even Milan-X from this year, and by the time Nvidia actually releases the chip, it will have to face Genoa. That will not only be Zen4, but also be produced in 5nm instead of 7nm, which will improve energy efficiency.

The old Rome chips are what Nvidia is currently using in their DGX A100, so this comparison does make some sense, but it's also highly misleading at the same time.

Vinvin · Post by **Vinvin** » Thu Apr 07, 2022 6:55 pm

More numbers here : "NVIDIA Grace 144 Core ARM CPU Is 14% Slower Than Dual 128 Core AMD EPYC 7763 CPUs In Spec Integer Benchmark" https://wccftech.com/nvidia-grace-144-c ... benchmark/
But price and power consumption would also be interesting to compare.

dangi12012 · Post by **dangi12012** » Thu Apr 07, 2022 8:48 pm

Playing the "big number game" yields no improvements.
What brings computer chess forward are algorithmic improvements or hardware improvements like new intrinsics or switching architecture.

Scaling to more cores is just a numbers game.
I am advocating for additional engine measurements:
performance/Watt
performance/Dollar

This could allow more direct comparisons. For example, AVX reduces clocks and increases wattage, so in some cases it is bad to enable (AVX512). Then you could compare exotic hardware, homebrew, FPGA, GPU and CPU engines more fairly.
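A minimal sketch of how the two proposed measurements could be computed and used to rank configurations. Everything here is an assumption made for illustration: the `EngineRun` type, its field names, the use of nodes per second as "performance", and all numbers are placeholders, not measurements.

```python
from dataclasses import dataclass


@dataclass
class EngineRun:
    """One engine/hardware configuration (hypothetical record layout)."""
    name: str
    performance: float  # e.g. nodes per second; any consistent unit works
    watts: float        # average power draw during the run
    dollars: float      # hardware price


def per_watt(run: EngineRun) -> float:
    return run.performance / run.watts


def per_dollar(run: EngineRun) -> float:
    return run.performance / run.dollars


# Placeholder numbers only: a wider SIMD build that is faster in absolute
# terms but draws more power on the same (equally priced) machine.
runs = [
    EngineRun("cpu_avx2", performance=2_000_000, watts=100, dollars=500),
    EngineRun("cpu_avx512", performance=2_200_000, watts=150, dollars=500),
]

best_per_watt = max(runs, key=per_watt)
best_per_dollar = max(runs, key=per_dollar)

for run in runs:
    print(f"{run.name}: {per_watt(run):.1f} per W, {per_dollar(run):.1f} per $")
```

With these made-up inputs the faster build wins on performance/Dollar but loses on performance/Watt, which is the kind of case the two extra columns are meant to expose.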

smatovic · Post by **smatovic** » Fri Apr 08, 2022 8:01 am

dangi12012 wrote: ↑Thu Apr 07, 2022 8:48 pm [...]
I am advocating for additional engine measurements:
performance/Watt
performance/Dollar
[...]

Hmm, if you mean Elo/Dollar, then SF13 on a Pentium III from the 90s would be the best shot I guess. I like what CCRL, TCEC, CCC do: put some hardware (CPU, GPU, xy) into a box, and then the programmer has to squeeze the juice out of it....

The Nvidia CPU only makes sense in the context of NVLink with unified/coherent memory across additional Nvidia GPUs.

edit:
Would be interesting to look at the new SVE2 SIMD units for NNUE and the like.

edit:

An interesting metric IMO would be something like: Elo/Transistorcount*Frequency

--
Srdja

smatovic · Post by **smatovic** » Fri Apr 08, 2022 8:40 am

smatovic wrote: ↑Fri Apr 08, 2022 8:01 am edit:

An interesting metric IMO would be something like: Elo/Transistorcount*Frequency

An interesting metric IMO would be something like: Elo/(Transistorcount*Frequency)

--
Srdja
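The correction matters because, under the usual left-to-right evaluation of `/` and `*`, the unparenthesized form multiplies by the frequency instead of dividing by it. A small sketch with placeholder values (the numbers are arbitrary and the variable names are made up for illustration):

```python
# Placeholder values, not data for any real engine or chip.
elo = 3000.0
transistors = 1.0e9
frequency_hz = 3.0e9

# Evaluated left to right: (elo / transistors) * frequency_hz
unparenthesized = elo / transistors * frequency_hz

# The intended metric: Elo divided by the product of both hardware terms
intended = elo / (transistors * frequency_hz)

print(unparenthesized)  # grows with frequency
print(intended)         # shrinks with frequency
```

The first expression rewards a higher clock, while the second penalizes it, so only the parenthesized form measures how much Elo a program gets out of a given amount of hardware.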

benb · Post by **benb** » Sat Apr 09, 2022 8:08 am

There are several benchmark programs that may be pertinent. Dhrystone has been around a long time, and looking around there are several other more recent ones. I give a few Wikipedia links below.

It seems that Elo/Dhrystone or Elo/some-benchmark-program would be useful for comparing engines, and the benchmarks themselves good for comparing hardware.

Some of these measure (integer) CPU, floating point (Whetstone is an older one), and GPU performance separately, or only one of these; others may give a single score for some sort of average of all of these. The integer/general CPU performance appears most pertinent to chess engines. The third link actually mentions chess programs.

https://en.wikipedia.org/wiki/Dhrystone
https://en.wikipedia.org/wiki/Coremark
https://en.wikipedia.org/wiki/Standard_ ... orporation
https://en.wikipedia.org/wiki/Geekbench

Vinvin · Post by **Vinvin** » Wed Apr 13, 2022 1:20 am

smatovic wrote: ↑Fri Apr 08, 2022 8:40 am
smatovic wrote: ↑Fri Apr 08, 2022 8:01 am edit:

An interesting metric IMO would be something like: Elo/Transistorcount*Frequency
An interesting metric IMO would be something like: Elo/(Transistorcount*Frequency)

--
Srdja

Why use "Transistorcount" ?

smatovic · Post by **smatovic** » Wed Apr 13, 2022 6:05 am

Vinvin wrote: ↑Wed Apr 13, 2022 1:20 am
smatovic wrote: ↑Fri Apr 08, 2022 8:40 am
An interesting metric IMO would be something like: Elo/(Transistorcount*Frequency)

--
Srdja
Why use "Transistorcount" ?

Cos you cannot compare different hardware architectures by instruction throughput alone. CPU, GPU, TPU, FPGA, ASIC: these were all used over history for computer chess. You can use transistors for scalar ALUs, or vector ALUs, or matrix ALUs, or caches, or branch prediction/speculative execution. The question here is which program makes the most out of the transistors present in a hardware architecture. With

Elo/(Transistorcount*Frequency)

you could compare engines for the 6502 with 68k with x64 with GPU, anything which implements a chess engine via transistors (or relays and tubes) over history, even Belle.

Elo/Watt does not fit, cos the new, smaller fabrication-processes consume less power.

Elo/Dollar does not fit, cos a used PC from the 90s would outperform anything from today.

Elo/MIPS is closer, but does not fit for comparing general purpose CPUs with TPUs/FPGAs/ASICs.

You can also add used SRAM and DRAM into this metric.

--
Srdja
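A rough sketch of how Elo/(Transistorcount*Frequency) could be computed across very different machines. The function name, the optional `memory_transistors` term (one possible reading of "add used SRAM and DRAM", simply summed with the logic transistors), and every number below are assumptions for illustration, not figures for any real engine or chip.

```python
def elo_per_transistor_hz(elo: float,
                          transistors: float,
                          frequency_hz: float,
                          memory_transistors: float = 0.0) -> float:
    """Elo divided by (total transistor count * clock frequency).

    memory_transistors is an assumed way to fold SRAM/DRAM into the
    metric: it is added to the logic transistor count.
    """
    total = transistors + memory_transistors
    return elo / (total * frequency_hz)


# Placeholder entries: (label, elo, transistors, frequency in Hz)
entries = [
    ("old_8bit_engine", 1500.0, 4.0e3, 1.0e6),
    ("modern_cpu_engine", 3500.0, 1.0e10, 4.0e9),
]

scores = {
    label: elo_per_transistor_hz(elo, transistors, freq)
    for label, elo, transistors, freq in entries
}

# Highest score first
for label, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label}: {score:.3e}")
```

With these made-up inputs the small old machine scores many orders of magnitude higher, so in practice the values would probably be compared on a logarithmic scale.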

smatovic · Post by **smatovic** » Wed Apr 13, 2022 8:18 am

Another thing is that there is some kind of space-time tradeoff going on: opening books are precomputed, EGTBs are precomputed, neural networks are also precomputed, PSQTs are precomputed by tuning... You could compare a 32-men EGTB and a mega neural network with perfect play by space or by time for precomputation, or the like.

--
Srdja
