GPU rumors 2021

smatovic
Posts: 2832
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: GPU rumors 2021

Post by smatovic »

Werewolf wrote: Wed May 22, 2024 11:54 pm [...]
Hehe, back to the 90s with SPARC... IIRC the PTL/FH-Wedel university had a Sun UltraSPARC running in the data center, with 4 sockets at 200 MHz, 64-bit. It must have been fun to code a chess engine for that machine back then...
https://www.chessprogramming.org/SPARC#Chess_Programs

AFAIK, Stockfish has no SPARC-optimized code, at least not for the SPARC "VIS" SIMD unit with regard to NNUE inference, and I have never seen SF benchmarks for these machines.

As far as I understand, the Fujitsu SPARC64 XII is the last CPU to support Sun Solaris; think of Oracle DB legacy machines for finance and insurance companies:
https://www.fujitsu.com/global/products ... enchmarks/

12 cores, SMT8, 4.25GHz, up to 2 sockets per node with up to 4 nodes:
https://www.fujitsu.com/global/products ... up/m12-2s/

--
Srdja
smatovic
Posts: 2832
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: GPU rumors 2021

Post by smatovic »

AMD announced Zen 5 for release in July. AMD claims a 16% IPC increase over Zen 4, with 6 instead of 4 ALUs per core, and the AVX-512 unit now seems to be a full 512 bits wide (dunno if it will clock down under load). Desktop Ryzen comes with 6 to 16 cores, mobile Ryzen with up to 8 cores and the XDNA 2 AI engine (NPU/TPU), the server version with up to 128 cores, and Zen 5c (less cache, lower freq.) with up to 192 cores.
Zen 4 introduced AVX-512 instructions. AVX-512 capabilities have been expanded with Zen 5 with a doubling of the floating point pipe width to 512-bit. Additionally, there is greater bfloat16 throughput which is beneficial for AI workloads.
https://en.wikipedia.org/wiki/Zen_5

--
Srdja
Werewolf
Posts: 1860
Joined: Thu Sep 18, 2008 10:24 pm

Re: GPU rumors 2021

Post by Werewolf »

smatovic wrote: Tue Jun 04, 2024 9:10 pm AMD announced Zen 5 for release in July. AMD claims a 16% IPC increase over Zen 4, with 6 instead of 4 ALUs per core, and the AVX-512 unit now seems to be a full 512 bits wide (dunno if it will clock down under load). Desktop Ryzen comes with 6 to 16 cores, mobile Ryzen with up to 8 cores and the XDNA 2 AI engine (NPU/TPU), the server version with up to 128 cores, and Zen 5c (less cache, lower freq.) with up to 192 cores.
Zen 4 introduced AVX-512 instructions. AVX-512 capabilities have been expanded with Zen 5 with a doubling of the floating point pipe width to 512-bit. Additionally, there is greater bfloat16 throughput which is beneficial for AI workloads.
https://en.wikipedia.org/wiki/Zen_5

--
Srdja
Slight shame there's no raise from 16 cores, given that Turin Threadripper won't be out in 2024.
smatovic
Posts: 2832
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: GPU rumors 2021

Post by smatovic »

NVLink, InfinityFabric, CXL for GPU-GPU interconnect, now:

Everyone Except Nvidia Forms Ultra Accelerator Link (UALink) Consortium
https://www.hpcwire.com/2024/05/30/ever ... onsortium/
AMD, Broadcom, Cisco, Google, Hewlett Packard Enterprise (HPE), Intel, Meta, and Microsoft announced they have aligned to develop a new industry standard dedicated to advancing high-speed and low-latency communication for scale-up AI Accelerators.
Scale-up via UALink, scale-out via UEC.

I guess we end users will stick with PCIe in our home PCs.

--
Srdja
smatovic
Posts: 2832
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: GPU rumors 2021

Post by smatovic »

Werewolf wrote: Wed Jun 05, 2024 4:50 pm [...]
Slight shame there's no raise from 16 cores, given that Turin Threadripper won't be out in 2024.
Hmm, we need to wait for Zen 5 benchmarks with the new AVX-512 unit, but so far the Zen 3 Ryzen series offers the best price/performance ratio (nps per dollar) IMO.
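
The nps-per-dollar comparison can be sketched roughly like this in Python. The function name and the two example machines are hypothetical placeholders, not measured benchmarks or real prices:

```python
def nps_per_dollar(nodes_per_second: float, price_usd: float) -> float:
    """Search speed (nodes per second) bought per dollar of hardware."""
    if price_usd <= 0:
        raise ValueError("price must be positive")
    return nodes_per_second / price_usd


# Hypothetical placeholder figures: (engine nps, system price in USD).
# Replace with your own benchmark results and current prices.
candidates = {
    "cpu_a": (20_000_000, 400.0),
    "cpu_b": (30_000_000, 750.0),
}

# Rank machines from best to worst price/performance.
ranked = sorted(
    candidates,
    key=lambda name: nps_per_dollar(*candidates[name]),
    reverse=True,
)

for name in ranked:
    print(name, nps_per_dollar(*candidates[name]))
```

With these made-up numbers the slower but cheaper machine ranks first, which is the kind of result the ratio is meant to expose.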

As a used machine, the Intel Skylake series is still going strong; it was a really good architecture.

For another, non-chess project, I am still evaluating different server setups, scale-up and scale-out. Maybe the Ampere Altra as a single socket for scale-out; for scale-up with multiple sockets there are different vendors in the market, but AMD and ARM offer at most two CPU sockets per node.

--
Srdja
smatovic
Posts: 2832
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: GPU rumors 2021

Post by smatovic »

smatovic wrote: Thu Apr 11, 2024 9:36 pm
Werewolf wrote: Thu Apr 11, 2024 9:26 pm For Lc0 it looks like the 4090 will still be king for the next 12 months then.
Yep, I think the Turing RTX 2080 Ti is still a good shot with regard to FP16 performance.

--
Srdja
Nvidia RTX 5090 and RTX 5080 maybe this fall, for the X-mas shoppers:
We've spoken with some people as well, and the expectation is that we'll see at least the RTX 5090 and RTX 5080 by the time the holiday season kicks off in October or November.
https://www.tomshardware.com/pc-compone ... ng-we-know

AMD RDNA4 maybe fourth quarter of 2024:
https://www.digitaltrends.com/computing ... ce-rumors/

Intel Arc Battlemage maybe this fall:
https://www.tomshardware.com/pc-compone ... -this-fall

The smallest fab processes are in high demand; we might expect 2nm and 1+nm in the next couple of years. Maybe once the AI chip demand is satisfied we can expect bigger steps in consumer GPU horsepower and also lower prices.

--
Srdja
smatovic
Posts: 2832
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: GPU rumors 2021

Post by smatovic »

Hehe, IBM is back with IBM Telum II and IBM Spyre AI accelerator :)

Enhancing enterprise AI with the IBM Spyre Accelerator
https://research.ibm.com/blog/spyre-for-z
As the newest AI accelerator, Spyre shares a very similar architecture to that first prototype. Spyre has 32 individual accelerator cores onboard, and contains 25.6 billion transistors using 14 miles of wire. It will be produced using 5 nm node process technology, and each Spyre is mounted on a PCIe card. Cards can be clustered together — for example, a cluster of 8 cards adds 256 additional accelerator cores to a single IBM Z system.
It also opens up how IBM Z can make use of generative AI and watsonx, IBM’s AI and data platform. Spyre brings the ability to run products like watsonx Code Assistant, which allows businesses to modernize code bases on mainframes, with far greater efficacy. You can use generative AI to understand what code is doing in your application, and what needs to be updated, amended, or just removed.
New Telum II Processor and IBM Spyre Accelerator: Expanding AI on IBM Z
https://www.ibm.com/blog/announcement/telum-ii/

And chip legend Jim Keller is now CEO of Tenstorrent, which has its own "Tensix Cores":

TT-QuietBox
https://tenstorrent.com/hardware/tt-quietbox
The TT-QuietBox Liquid-Cooled Desktop Workstation is a great solution for developers running or testing AI models, or port and develop libraries for HPC. TT-QuietBox is equipped with four Tenstorrent Wormhole™ cards for a total of eight Wormhole™ Tensix Processors.

These processors are connected with a flexible, Ethernet-based mesh topology that can expand to achieve a 96GB memory pool. This empowers TT-QuietBox to run single user/single models up to approximately 80 billion parameters and single/multiple user, multiple models up to approximately 20 billion parameters.
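
As a back-of-the-envelope check on the quoted figures, here is a small Python sketch. It assumes decimal gigabytes and counts only the model weights (no activations or KV cache); the bytes-per-parameter values are my assumptions, not something Tenstorrent states in the quote:

```python
GB = 10**9  # decimal gigabytes (assumption; the quote does not say GB vs GiB)


def per_processor_memory_gb(pool_gb: float, processors: int) -> float:
    """Memory share of each processor if the pool is split evenly."""
    return pool_gb / processors


def weights_gb(params: int, bytes_per_param: int) -> float:
    """Size of the model weights alone, in gigabytes."""
    return params * bytes_per_param / GB


def fits(params: int, bytes_per_param: int, pool_gb: float) -> bool:
    """True if the weights alone fit into the memory pool."""
    return weights_gb(params, bytes_per_param) <= pool_gb


# Figures from the quote: 96GB pool, eight Wormhole processors, ~80B parameters.
print(per_processor_memory_gb(96, 8))   # memory per processor
print(weights_gb(80_000_000_000, 1))    # 8-bit weights (assumed)
print(fits(80_000_000_000, 1, 96))      # 8-bit weights vs. the pool
print(fits(80_000_000_000, 2, 96))      # 16-bit weights vs. the pool
```

Under these assumptions, 80 billion parameters fit into 96GB only at about one byte per parameter, so the quoted limit presumably refers to 8-bit weights.
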
Wormhole
https://tenstorrent.com/hardware/wormhole
The Wormhole™ n150 and n300 PCIe boards are flexible, scalable processors built with Tensix Cores. Each includes a compute unit, network-on-chip, local cache and “baby RISC-V” cores, coalescing in powerful data movement through the chip.

And SiFive announced a 256-core RISC-V CPU, the "P870-D", with an optional vector unit:

SiFive Performance P870-D
https://www.sifive.com/cores/performance-p870d
The highest performance P870-D is targeted for datacenter applications which benefit from parallelism, including storage, web servers and video streaming. The P870-D can be used standalone or in conjunction with a member of the SiFive Intelligence Processor Family.

P870-D is fully compliant with the RVA23 RISC-V Instruction Profile. It incorporates a shared cluster cache enabling up to 32 cores to be connected coherently. Use of a CHI bridge expands this to 256 CPU coherent cores, connected either in an SoC or as a set of discrete chiplets.
SiFive Intelligence X390
https://www.sifive.com/cores/intelligence-x390

Things are moving.

--
Srdja
towforce
Posts: 11777
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: GPU rumors 2021

Post by towforce »

smatovic wrote: Wed Aug 28, 2024 10:34 pm Things are moving.

The choice is now vast and complex, and the most pressing requirement here is very clear: an AI system to help people choose the technology for their AI building systems.
:?
The simple reveals itself after the complex has been exhausted.
Werewolf
Posts: 1860
Joined: Thu Sep 18, 2008 10:24 pm

Re: GPU rumors 2021

Post by Werewolf »

smatovic wrote: Wed Aug 28, 2024 10:34 pm

Things are moving.

--
Srdja
I'm hearing 5090 in early 2025 and Threadripper 9000 (Turin / Zen 5) sooner but with about 10% performance gain over last gen, which isn't that great.
towforce
Posts: 11777
Joined: Thu Mar 09, 2006 12:57 am
Location: Birmingham UK

Re: GPU rumors 2021

Post by towforce »

Werewolf wrote: Thu Aug 29, 2024 8:35 pm ...and Threadripper 9000 (Turin / Zen 5) sooner but with about 10% performance gain over last gen, which isn't that great.

That will be just 1.34e13 flops. Might just as well go back to the abacus. :)
The simple reveals itself after the complex has been exhausted.