Cerebras - but honey, can it play chess?

smatovic · Post by **smatovic** » Sat Sep 04, 2021 9:38 pm

CS2 with 2.4 petabyte RAM module, able to run GPT-3 model with 120 trillion weights, planned for Q4:

https://www.zdnet.com/article/cerebras- ... -networks/

--
Srdja

AdminX · Post by **AdminX** » Sat Sep 04, 2021 9:55 pm

Wow!

smatovic · Post by **smatovic** » Sun Sep 05, 2021 9:42 am

Haha, size matters

--
Srdja

AdminX · Post by **AdminX** » Sun Sep 05, 2021 9:55 am

smatovic wrote: ↑Sun Sep 05, 2021 9:42 am Haha, size matters
--
Srdja

Haha, that's what she said. Reminds me of an old Richard Pryor joke: 'If you had two more inches of ...'

***WARNING ADULT AUDIO CONTENT***

Check the 1:48 minute mark:

smatovic · Post by **smatovic** » Sun Sep 05, 2021 10:14 am

AdminX wrote: ↑Sun Sep 05, 2021 9:55 am
smatovic wrote: ↑Sun Sep 05, 2021 9:42 am Haha, size matters
--
Srdja
...
That's what she said.
...

Haha

--
Srdja

towforce · Post by **towforce** » Sun Sep 05, 2021 5:06 pm

smatovic wrote: ↑Sat Sep 04, 2021 9:38 pm CS2 with 2.4 petabyte RAM module, able to run GPT-3 model with 120 trillion weights, planned for Q4:

https://www.zdnet.com/article/cerebras- ... -networks/

--
Srdja

There are already petascale computers you can buy - link. There are computers beyond that, but you cannot buy them "off the shelf".

You asked whether it can play chess: I've been looking everywhere, and I've found a program I think it's powerful enough to run - link.

smatovic · Post by **smatovic** » Sun Sep 05, 2021 8:27 pm

towforce wrote: ↑Sun Sep 05, 2021 5:06 pm ...

Did you read the article? This machine was designed for one purpose: to train and run inference on large neural networks. It competes with clusters of thousands of GPUs and, according to the PR, does so successfully: it moves the bar from 1.6 trillion neural network parameters to 120 trillion, ~100x. If the DoE says it rocks, then it does.

--
Srdja
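As a quick sanity check on the figures quoted in this thread, here is a minimal Python sketch. It assumes a decimal petabyte (10**15 bytes) and takes the vendor numbers at face value; the function names are made up for illustration. On those assumptions the jump from 1.6 trillion to 120 trillion parameters works out closer to 75x than 100x, and the 2.4 PB module leaves about 20 bytes per weight.

```python
# Figures quoted in the thread (vendor PR numbers, not measurements).
MEMORY_BYTES = 2.4e15  # 2.4 petabytes, ASSUMING decimal PB (10**15 bytes)
NEW_PARAMS = 120e12    # 120 trillion weights
OLD_PARAMS = 1.6e12    # 1.6 trillion parameters


def scale_factor(new_params: float, old_params: float) -> float:
    """How many times larger the new parameter count is."""
    return new_params / old_params


def bytes_per_weight(memory_bytes: float, params: float) -> float:
    """Memory available per weight if the whole module held only the model."""
    return memory_bytes / params


if __name__ == "__main__":
    print(f"scale-up: {scale_factor(NEW_PARAMS, OLD_PARAMS):.0f}x")
    print(f"memory per weight: {bytes_per_weight(MEMORY_BYTES, NEW_PARAMS):.0f} bytes")
```

The per-weight budget is only a rough upper bound: it ignores whatever else has to live in that memory during training.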

towforce · Post by **towforce** » Sun Sep 05, 2021 9:53 pm

smatovic wrote: ↑Sun Sep 05, 2021 8:27 pm Did you read the article? This machine was designed for one purpose: to train and run inference on large neural networks. It competes with clusters of thousands of GPUs and, according to the PR, does so successfully: it moves the bar from 1.6 trillion neural network parameters to 120 trillion, ~100x. If the DoE says it rocks, then it does.

--
Srdja

It's a cluster of 192 computers, whereas the Nvidia DGX A100 is a single computer. They can improve the efficiency of the clustering, however, by taking advantage of sparse weights (hence sparse gradients):

smatovic · Post by **smatovic** » Sun Sep 05, 2021 10:10 pm

towforce wrote: ↑Sun Sep 05, 2021 9:53 pm It's a cluster of 192 computers, whereas the Nvidia DGX A100 is a single computer. They can improve the efficiency of the clustering, however, by taking advantage of sparse weights (hence sparse gradients):
...

Maybe dig deeper into data/model/layer parallelism, how the WSE runs matrices on GB of SRAM fed by the main memory module. A100 has sparsity acceleration too AFAIK.

***edit***

A cluster of ~256 DGX2 (8xA100 each) runs a ~1 trillion model, a CS-2 setup with 192 WSE-2 runs 120 trillion, which one do you prefer?

--
Srdja

smatovic · Post by **smatovic** » Sun Sep 05, 2021 11:07 pm

Followup: the Nvidia DGX2 has 16xV100, the Nvidia DGX A100 has 8xA100.

--
Srdja
