GPU rumors 2021
Moderators: hgm, chrisw, Rebel
-
- Posts: 1927
- Joined: Thu Sep 18, 2008 10:24 pm
Re: GPU rumors 2021
Seems like next-gen Nvidia will be 2025
-
- Posts: 3020
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: GPU rumors 2021
A chess engine for the Nvidia Grace Hopper Superchip (CPU+GPU+HBM) would be fun though.
NVIDIA Grace Hopper Superchip
https://www.nvidia.com/en-us/data-cente ... superchip/
--
Srdja
-
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: GPU rumors 2021
Don't even need to wait for new hardware. You could RIGHT NOW sit down and be the first to leverage Nvidia Cutlass 1-bit matrix math, which runs at many PetaOps/s. Not accessible from cuda - you need cutlass.
IMO you don't even need a supercomputer to beat current chess engines with the right approach. You need to mix the power of incremental networks with the 60x higher unconditional integer performance of a 40xx compared to your run-of-the-mill 16-core 7950X.
I tried on the 30xx series, but the 8x8 gemmShape is not supported. You would need to work on 4 bitboards at once or interleave the bits in the right shape. Here is a random sample of what that looks like. cutlass::uint1b_t is the keyword.
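Code:
#include <cstdint>
#include "cutlass/gemm/device/gemm.h"

// Element types are not given in the post; 32-bit integers are an assumption.
using ElementOutput = int32_t;
using ElementAccumulator = int32_t;
using ElementCompute = int32_t;

using Gemm = cutlass::gemm::device::Gemm<
    cutlass::uint1b_t, cutlass::layout::RowMajor,     // A: 1-bit, row-major
    cutlass::uint1b_t, cutlass::layout::ColumnMajor,  // B: 1-bit, column-major
    ElementOutput, cutlass::layout::RowMajor,         // C/D: row-major
    ElementAccumulator, cutlass::arch::OpClassTensorOp, cutlass::arch::Sm75,
    cutlass::gemm::GemmShape<128, 256, 512>,          // threadblock tile
    cutlass::gemm::GemmShape<64, 64, 512>,            // warp tile
    cutlass::gemm::GemmShape<8, 8, 128>,              // instruction shape
    cutlass::epilogue::thread::LinearCombination<
        ElementOutput, 128 / cutlass::sizeof_bits<ElementOutput>::value,
        ElementAccumulator, ElementCompute>,
    cutlass::gemm::threadblock::GemmIdentityThreadblockSwizzle<>, 2, 128, 128,
    false, cutlass::arch::OpXorPopc>;                 // XOR + popcount multiply-add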
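For anyone wondering what the XOR + popcount operation in that template actually computes, here is a minimal CPU reference sketch in plain C++. It is not CUTLASS code and says nothing about performance. The names popcount64 and xor_popc_gemm are made up for illustration, and packing each row into 64-bit words (one bitboard per word) is an assumption about how you might lay out the data.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable population count: number of set bits in a 64-bit word.
inline int popcount64(std::uint64_t x) {
    int n = 0;
    while (x) {
        x &= x - 1;  // clear the lowest set bit
        ++n;
    }
    return n;
}

// Reference 1-bit GEMM with XOR + popcount accumulation (hypothetical helper):
//   C[i][j] = sum over k of popcount(A[i][k] ^ B[j][k])
// A holds M rows of K 64-bit words (row-major).
// B holds N rows of K 64-bit words; each row is one column of the logical
// right-hand matrix, which corresponds to a column-major B operand.
// C is M x N, row-major, with 32-bit integer accumulators.
std::vector<std::int32_t> xor_popc_gemm(const std::vector<std::uint64_t>& A,
                                        const std::vector<std::uint64_t>& B,
                                        std::size_t M, std::size_t N,
                                        std::size_t K) {
    std::vector<std::int32_t> C(M * N, 0);
    for (std::size_t i = 0; i < M; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            std::int32_t acc = 0;
            for (std::size_t k = 0; k < K; ++k) {
                acc += popcount64(A[i * K + k] ^ B[j * K + k]);
            }
            C[i * N + j] = acc;
        }
    }
    return C;
}
```

With K = 1, each row is a single bitboard, so C[i][j] is simply the number of squares on which bitboard i of A and bitboard j of B differ. The tensor-core version does the same arithmetic on much larger tiles; the instruction shape above consumes 128 bits of K at a time, i.e. two such 64-bit words per row.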
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
-
- Posts: 3020
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: GPU rumors 2021
Looking forward to the Gigantua on a GPU chess engine!
--
Srdja
-
- Posts: 1927
- Joined: Thu Sep 18, 2008 10:24 pm
Re: GPU rumors 2021
smatovic wrote: Thu Sep 07, 2023 2:35 pm
A chess engine for the Nvidia Grace Hopper Superchip (CPU+GPU+HBM) would be fun though.
NVIDIA Grace Hopper Superchip
https://www.nvidia.com/en-us/data-cente ... superchip/
--
Srdja
Yep, this could be fun.
It has now come out that the soon-to-be-released 96-core Threadripper CPU maxes out at 3.2 GHz all-core turbo, 200 MHz slower than the top EPYC, so it'll be interesting to see what OS Nvidia's CPU runs on.
-
- Posts: 3020
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: GPU rumors 2021
I guess it is now about wattage (TDP) on CPUs. The Nvidia Grace Hopper chip operates from 450 to 1000 watts, depending on air or water cooling I guess, and the Zen 4 Epyc goes up to 400 watts... I've read you can cool up to 500 W with air cooling; beyond that, liquid/water cooling is needed...
The Nvidia DGX series used a customized Ubuntu Linux OS in the past, dunno about running Ubuntu on the ARM v9 architecture...
https://en.wikipedia.org/wiki/Nvidia_DGX
--
Srdja
-
- Posts: 1927
- Joined: Thu Sep 18, 2008 10:24 pm
Re: GPU rumors 2021
smatovic wrote: Thu Sep 14, 2023 11:08 am
I guess it is now about wattage (TDP) on CPUs. The Nvidia Grace Hopper chip operates from 450 to 1000 watts, depending on air or water cooling I guess, and the Zen 4 Epyc goes up to 400 watts... I've read you can cool up to 500 W with air cooling; beyond that, liquid/water cooling is needed...
The Nvidia DGX series used a customized Ubuntu Linux OS in the past, dunno about running Ubuntu on the ARM v9 architecture...
https://en.wikipedia.org/wiki/Nvidia_DGX
--
Srdja
From Tom's Hardware, maybe this doesn't affect Lc0 usage?
"However, while the Grace chips are ultra-performant and efficient in some workloads, Nvidia isn't aiming them at the general-purpose server market. Instead, the company has tailored the chips for specific use cases, like AI and cloud workloads that favor superior single-threaded and memory processing performance in tandem with excellent power efficiency."
https://www.tomshardware.com/news/nvidi ... ace-delay
-
- Posts: 3020
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: GPU rumors 2021
Idk for sure, but to make use of a CPU with a high-bandwidth GPU interconnect and HBM, you might want to consider an engine somewhere in between Lc0 and Stockfish.
--
Srdja
-
- Posts: 3020
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: GPU rumors 2021
Straggler...
Amazon Graviton 4 with 96 cores and SVE2 vector-unit is available in the cloud:
https://en.wikipedia.org/wiki/AWS_Graviton#Graviton4
Chinese GPUs up and running, Moore Threads:
https://en.wikipedia.org/wiki/Moore_Threads
Chinese Moore Threads Unveils Chunxiao GPU: 4,096 SPs, GDDR6, PCIe Gen 5
https://www.tomshardware.com/news/moore ... unxiao-gpu
Chinese-Made PCIe 5.0 Gaming GPU Benchmarks Emerge (Updated)
https://www.tomshardware.com/news/chine ... rks-shared
--
Srdja
-
- Posts: 12097
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: GPU rumors 2021
smatovic wrote: Sun Dec 03, 2023 7:16 am
Straggler...
Amazon Graviton 4 with 96 cores and SVE2 vector-unit is available in the cloud:
https://en.wikipedia.org/wiki/AWS_Graviton#Graviton4
Chinese GPUs up and running, Moore Threads:
https://en.wikipedia.org/wiki/Moore_Threads
Chinese Moore Threads Unveils Chunxiao GPU: 4,096 SPs, GDDR6, PCIe Gen 5
https://www.tomshardware.com/news/moore ... unxiao-gpu
Chinese-Made PCIe 5.0 Gaming GPU Benchmarks Emerge (Updated)
https://www.tomshardware.com/news/chine ... rks-shared
--
Srdja
Very interesting.
Slightly off topic (about an SoC rather than GPUs), but Huawei have found a way to create an SoC with modern performance using old fabrication technology. The Kirin 9000S is comparable to the Snapdragon 888 (comparison link: to avoid confusion, HiSilicon is owned by Huawei), which was launched 2 years and 9 months earlier. This isn't going to lead to higher performance, but it could lead to greater supply, and hence lower prices.
Article about how Huawei made this high-performance SoC with old fabrication technology: https://www.ft.com/content/327414d2-fe1 ... 3cdb94c7e1
Want to attract exceptional people? Be exceptional.