Building a machine to run NN engines


Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Building a machine to run NN engines

Post by Zenmastur »

I'm getting ready to build a new computer for playing chess. Since NN engines seem to be all the rage, I started looking into what I would build if NN engines were the targeted application.

I ran across a dual-CPU 4U rack server that can host 10 double-wide GPUs in a single root complex. It was billed as a tensor deep learning machine. So I started looking at top-of-the-line GPUs. It turns out that AMD is slated to release their top-of-the-line GPUs (code name Navi 20) in November of this year. I didn't think much about it, as NVidia seems to be king of the hill, but I looked them up nonetheless. Internally, those who work on this GPU call it the "Nvidia Killer".

Its official name is "Radeon Instinct MI60" and it has some very impressive specifications: FP64 7.4 TFLOPS, FP32 14.7 TFLOPS, FP16 29.5 TFLOPS, INT8 59 TOPS, and INT4 118 TOPS. But it doesn't stop there: it has 32GB of HBM memory (4096-bit interface) delivering 1TB/sec of memory bandwidth to the GPU, and it has 2 Infinity Fabric links to the CPU for a bandwidth of 200GB/sec.

Now, if you put 10 of these in a single machine (and yes, I checked that they would fit and that the server has enough power to feed them; the 4U server mentioned above does), it would make an NN monster. Perfect for an advanced NN chess engine. It will be interesting to see what kind of price they put on these cards when they're announced. I suspect they'll be several thousand dollars apiece. Even so, for HPC computing I don't think NVidia has anything that can touch this card. It also has a little sister, the MI50.
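To put rough numbers on the 10-card idea, here is a minimal Python sketch that multiplies the per-card figures quoted above by the card count. It assumes ideal linear scaling, which real workloads will not reach, and the names `MI60` and `aggregate` are made up for this illustration.

```python
# Per-card figures as quoted in the post above (not independently verified).
MI60 = {
    "fp64_tflops": 7.4,
    "fp32_tflops": 14.7,
    "fp16_tflops": 29.5,
    "memory_gb": 32,
    "mem_bandwidth_tb_s": 1.0,
}


def aggregate(card, count):
    """Multiply every per-card figure by the card count (ideal scaling assumed)."""
    return {name: value * count for name, value in card.items()}


# Theoretical peak totals for a fully populated 10-GPU chassis.
totals = aggregate(MI60, 10)
for name, value in totals.items():
    print(f"{name}: {value:g}")
```

These are theoretical peaks only. Whether an engine can actually use them depends on how well it spreads work across GPUs.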

Regards

Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
marsell
Posts: 106
Joined: Tue Feb 07, 2012 11:14 am

Re: Building a machine to run NN engines

Post by marsell »

You will contribute to global warming. That's for 100% sure.
Zenmastur
Posts: 919
Joined: Sat May 31, 2014 8:28 am

Re: Building a machine to run NN engines

Post by Zenmastur »

There is no spoon.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Building a machine to run NN engines

Post by Laskos »

Run NN engines or use for training, machine learning in general? Is the budget an issue?

For pure NN engine play, and if the budget is of some consideration, even 4 GPUs seem like overkill until the Lc0 engine is fixed with regard to parallelization and its NPS bottleneck. What I would have for pure strength: an i9 9900K OC-ed to almost 5 GHz (doable with a good cooler), since CPU frequency is important for Lc0, and 2 RTX 2080 Ti GPUs, again OC-ed by some 15%, with a good, large cooler there too. Add a large case, a big power supply, 2 large fans and 3-4 smaller ones.

You will have a stable 80+ kNPS with a T40 20b net, almost hitting the Lc0 NPS bottleneck, and 50+ kNPS with a T60 24b net. The NPS is as high as that of a non-OC-ed 4-GPU setup on a slower CPU, and parallelization above 2 GPUs is bad for now not only NPS-wise but also strength-wise at the same NPS. So, until the Lc0 team solves these issues, your machine would probably be close to the best strength-wise using Lc0. With late T40 nets, it will beat even, say, a 192-core Stockfish in regular matches. T60 nets are expected to be even stronger on such hardware at time controls longer than bullet, but we have to wait 2-3 months before they reach top strength (JHorthos 24b net is already very strong).

3 CPU threads would be enough (in fact, more would probably harm strength). Don't forget some 64 GB of fast RAM and an SSD for system cache. All in all, below $4000, if it matters to you.
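A two-GPU, three-thread setup like the one described could be launched roughly as follows. This is only a sketch: the flag names (`--weights`, `--threads`, `--backend=multiplexing`, `--backend-opts`) and the `cudnn-fp16` backend are as I recall them from Lc0 builds of that period, so check them against `lc0 --help` on your version. The helper `lc0_command` and the weights file name are hypothetical.

```python
import shlex


def lc0_command(weights_path, gpu_ids=(0, 1), threads=3, backend="cudnn-fp16"):
    """Build an lc0 argument list that spreads evaluation across several GPUs."""
    # One "(backend=...,gpu=N)" group per GPU, joined by commas.
    per_gpu = ",".join(f"(backend={backend},gpu={gpu})" for gpu in gpu_ids)
    return [
        "lc0",
        f"--weights={weights_path}",
        f"--threads={threads}",
        "--backend=multiplexing",
        f"--backend-opts={per_gpu}",
    ]


# "network.pb.gz" is a placeholder; point this at the net you actually use.
args = lc0_command("network.pb.gz")
print(shlex.join(args))
```

The list form can be passed straight to `subprocess.Popen`, and the printed form can be pasted into a shell or a GUI's engine configuration.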
Last edited by Laskos on Sat Aug 10, 2019 10:24 am, edited 1 time in total.
smatovic
Posts: 2645
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: Building a machine to run NN engines

Post by smatovic »

Hmm, the Radeon Instinct MI60 and MI50 were already released in 2018, but I haven't seen them in the wild. They are 7 nm but based on the Vega GCN architecture. The upcoming AMD GPUs in autumn are supposed to be based on the Navi RDNA architecture, and afaik these still have no counterpart to Nvidia's Tensor Cores, so they probably won't rock as much as you think...

Furthermore, I agree with Laskos: LC0 had, or has, issues with more than 2 GPUs.

--
Srdja
dragontamer5788
Posts: 201
Joined: Thu Jun 06, 2019 8:05 pm
Full name: Percival Tiglao

Re: Building a machine to run NN engines

Post by dragontamer5788 »

smatovic wrote: Sat Aug 10, 2019 10:23 am Hmm, the Radeon Instinct MI60 and MI50 were already released in 2018, but I haven't seen them in the wild. They are 7 nm but based on the Vega GCN architecture. The upcoming AMD GPUs in autumn are supposed to be based on the Navi RDNA architecture, and afaik these still have no counterpart to Nvidia's Tensor Cores, so they probably won't rock as much as you think...

Furthermore, I agree with Laskos: LC0 had, or has, issues with more than 2 GPUs.

--
Srdja
I agree 100%.

Note that the RTX 2080 Ti achieves 110 FP16 TFLOPS when using 4x4 tensor operations, which is far greater than the 29.5 TFLOPS of the MI60. The MI60's packed FP16, INT8, and INT4 operations are hard to use in practice, while the RTX 2080 Ti already has library support built into TensorFlow.

I think the MI60 is the better general compute card, with huge single-precision and double-precision floating-point throughput. Memory bandwidth is useful in general compute as well. But if you're just doing tensor ops for neural nets, you really want NVidia's tensor cores.
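For a sense of scale, the FP16 figures quoted in this thread can be compared directly. The sketch below uses only those two quoted peak numbers. The function names are invented for the example, and peak TFLOPS say nothing about what a chess engine actually sustains.

```python
import math

# Peak FP16 figures as quoted in this thread (vendor peaks, not measurements).
RTX_2080_TI_TENSOR_FP16_TFLOPS = 110.0
MI60_FP16_TFLOPS = 29.5


def peak_ratio(a_tflops, b_tflops):
    """How many times larger peak a is than peak b."""
    return a_tflops / b_tflops


def cards_to_match(target_tflops, per_card_tflops):
    """Smallest whole number of cards whose combined peak reaches the target."""
    return math.ceil(target_tflops / per_card_tflops)


ratio = peak_ratio(RTX_2080_TI_TENSOR_FP16_TFLOPS, MI60_FP16_TFLOPS)
# MI60 cards needed to equal the peak of the two-card RTX 2080 Ti build.
needed = cards_to_match(2 * RTX_2080_TI_TENSOR_FP16_TFLOPS, MI60_FP16_TFLOPS)

print(f"One RTX 2080 Ti has about {ratio:.2f}x the peak FP16 of one MI60")
print(f"MI60 cards needed to match two RTX 2080 Ti on paper: {needed}")
```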