CS2 with 2.4 petabyte RAM module, able to run GPT-3 model with 120 trillion weights, planned for Q4:
https://www.zdnet.com/article/cerebras- ... -networks/
--
Srdja
Cerebras - but honey, can it play chess?
Moderator: Ras
-
- Posts: 3331
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
-
- Posts: 6363
- Joined: Mon Mar 13, 2006 2:34 pm
- Location: Acworth, GA
Re: Cerebras - but honey, can it play chess?



"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
-
- Posts: 3331
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Cerebras - but honey, can it play chess?
Haha, size matters 

--
Srdja


-
- Posts: 6363
- Joined: Mon Mar 13, 2006 2:34 pm
- Location: Acworth, GA
Re: Cerebras - but honey, can it play chess?

***WARNING ADULT AUDIO CONTENT***
Check the 1:48 minute mark:
"Good decisions come from experience, and experience comes from bad decisions."
__________________________________________________________________
Ted Summers
-
- Posts: 3331
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Cerebras - but honey, can it play chess?
Haha

--
Srdja
-
- Posts: 12514
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: Cerebras - but honey, can it play chess?
smatovic wrote: ↑Sat Sep 04, 2021 9:38 pm CS2 with 2.4 petabyte RAM module, able to run GPT-3 model with 120 trillion weights, planned for Q4:
https://www.zdnet.com/article/cerebras- ... -networks/
--
Srdja
There are already petascale computers you can buy - link. There are computers beyond that, but you cannot buy them "off the shelf".
You asked whether it can play chess: I've been looking everywhere, and I've found a program I think it's powerful enough to run - link.

Human chess is partly about tactics and strategy, but mostly about memory
-
- Posts: 3331
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Cerebras - but honey, can it play chess?
Did you read the article? This machine was designed for one purpose: to train and run inference on large neural networks. It competes with clusters of thousands of GPUs and, according to the PR, does so successfully; it moves the bar from 1.6 trillion neural network parameters to 120 trillion, ~100x. If the DoE says it rocks, then it does.

--
Srdja
-
- Posts: 12514
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
- Full name: Graham Laight
Re: Cerebras - but honey, can it play chess?
smatovic wrote: ↑Sun Sep 05, 2021 8:27 pm Did you read the article? This machine was designed for one purpose: to train and run inference on large neural networks. It competes with clusters of thousands of GPUs and, according to the PR, does so successfully; it moves the bar from 1.6 trillion neural network parameters to 120 trillion, ~100x. If the DoE says it rocks, then it does.
--
Srdja
It's a cluster of 192 computers, whereas the Nvidia DGX A100 is a single computer. They can improve the efficiency of the clustering, however, by taking advantage of sparse weights (hence sparse gradients):
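To make the "sparse weights, hence sparse gradients" point concrete, here is a minimal pure-Python sketch. It is only a generic illustration of a masked linear layer, not how Cerebras or Nvidia actually implement sparsity; the function names, the fixed 0/1 mask, and the toy numbers are all assumptions made up for the example. For y = (W * M) x, the chain rule gives dL/dW[i][j] = dy[i] * M[i][j] * x[j], so every weight that is masked out also gets a zero gradient.

```python
# Hypothetical toy example: a linear layer whose weights are masked by a
# fixed 0/1 sparsity pattern. All names and values are illustrative only.

def masked_linear_forward(weights, mask, x):
    """Compute y = (W * M) x, where * is the elementwise product."""
    return [
        sum(w * m * x_j for w, m, x_j in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]

def masked_linear_weight_grad(mask, x, dy):
    """Gradient of the loss w.r.t. W, given the upstream gradient dy.

    By the chain rule, dL/dW[i][j] = dy[i] * M[i][j] * x[j], so the
    gradient is zero wherever the mask is zero.
    """
    return [
        [dy_i * m * x_j for m, x_j in zip(m_row, x)]
        for dy_i, m_row in zip(dy, mask)
    ]

def count_nonzero(matrix):
    """Number of nonzero entries in a list-of-lists matrix."""
    return sum(1 for row in matrix for v in row if v != 0)

weights = [[1.0, 5.0, 5.0, 2.0],
           [5.0, 5.0, 1.0, 5.0]]
mask = [[1, 0, 0, 1],
        [0, 0, 1, 0]]          # only 3 of 8 weights are active
x = [1.0, 2.0, 3.0, 4.0]       # layer input
dy = [0.5, -1.0]               # made-up upstream gradient

y = masked_linear_forward(weights, mask, x)
grad = masked_linear_weight_grad(mask, x, dy)

print("output:", y)
print("nonzero weights:  ", count_nonzero(mask))
print("nonzero gradients:", count_nonzero(grad))
```

In this toy case the gradient has exactly as many nonzero entries as the mask, which is why hardware that skips zero weights can in principle skip the matching gradient work as well.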

Human chess is partly about tactics and strategy, but mostly about memory
-
- Posts: 3331
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Cerebras - but honey, can it play chess?
Maybe dig deeper into data/model/layer parallelism and how the WSE processes matrices in gigabytes of SRAM fed by the main memory module. The A100 has sparsity acceleration too, AFAIK.
***edit***
A cluster of ~256 DGX2 (8xA100 each) runs a ~1 trillion parameter model; a CS-2 setup with 192 WSE-2 runs 120 trillion. Which one do you prefer?
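As a rough sanity check on the figures quoted in this thread, here is a back-of-the-envelope sketch in Python. It uses only the numbers from the posts (120 trillion and 1.6 trillion parameters, 192 WSE-2, ~256 nodes with 8 A100 each, a 2.4 petabyte memory module). Treating a petabyte as 10^15 bytes and taking the approximate counts as exact are assumptions, and the variable and function names are made up for illustration.

```python
# Back-of-the-envelope arithmetic on the numbers quoted in the thread.
# Assumption: decimal units (1 petabyte = 10**15 bytes), "~" values taken as exact.

TRILLION = 10**12

cs2_params = 120 * TRILLION                # CS-2 setup, as quoted from the article
previous_bar_params = 1_600_000_000_000    # the 1.6 trillion figure in the thread
gpu_cluster_params = 1 * TRILLION          # ~1 trillion on the GPU cluster
gpu_count = 256 * 8                        # ~256 nodes x 8 A100 each
wse_count = 192                            # WSE-2 chips in the CS-2 setup
memory_bytes = 2_400_000_000_000_000       # 2.4 petabyte memory module

def summarize():
    """Return the derived ratios as a dict of floats."""
    params_per_wse = cs2_params / wse_count
    params_per_gpu = gpu_cluster_params / gpu_count
    return {
        "scale_up_vs_previous_bar": cs2_params / previous_bar_params,
        "params_per_wse": params_per_wse,
        "params_per_gpu": params_per_gpu,
        "per_device_ratio": params_per_wse / params_per_gpu,
        "bytes_per_weight": memory_bytes / cs2_params,
    }

for name, value in summarize().items():
    print(f"{name}: {value:,.0f}")
```

On these inputs, 120 trillion over 1.6 trillion comes to 75x, so the "~100x" in the thread is a generous rounding, and the memory module works out to 20 bytes per weight. The per-device comparison is only as good as the approximate cluster sizes quoted above.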
--
Srdja
-
- Posts: 3331
- Joined: Wed Mar 10, 2010 10:18 pm
- Location: Hamburg, Germany
- Full name: Srdja Matovic
Re: Cerebras - but honey, can it play chess?
Follow-up: the Nvidia DGX2 has 16xV100, while the Nvidia DGX A100 has 8xA100.
--
Srdja