AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Sat Dec 09, 2017 4:20 pm

Guenther wrote:
Ras wrote:
Lyudmil Tsvetkov wrote:Would not SF on the very same hardware, if adapted, be still 400 elos stronger?
No, it wouldn't, for the same reason that Stockfish doesn't harness the power of available GPUs. These TPUs are quite a different design than CPUs. TPUs are a bit like GPUs modified in hardware to be more efficient in neural networks instead of graphics.

The big thing here is that they have a truckload of simple modules that can perform identical operations on different data at the same time. Perfect for neural networks - and for graphics, which is why graphics cards have long been used for neural network purposes.

By contrast, conventional CPUs are good at performing different and complex operations on input data at a high rate. That is good for if/then/else-branching, which chess engines essentially do.

Software that was designed for CPUs cannot take advantage of TPUs. The TPU hardware would be useless for Stockfish. The other way round is also not that promising: trying to run a neural network on a CPU doesn't yield performant results.
Rasmus do you really think he will understand that after all he said before?

I did not even really read your comment.
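
The contrast Ras draws in the quoted post can be sketched in a few lines. This is a minimal illustration in plain Python, not code from Stockfish or AlphaZero; `dense_layer` and `alphabeta` are hypothetical names. The first function applies the same multiply-add to every element, which is the pattern GPUs and TPUs parallelise. The second decides what to compute next from the values seen so far, which is the branching pattern CPUs handle well.

```python
def dense_layer(weights, inputs):
    """Neural-network style work: identical multiply-adds, no data-dependent branches."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def alphabeta(node, alpha, beta, maximizing):
    """Alpha-beta on a nested-list game tree; leaves are plain numbers (static scores)."""
    if not isinstance(node, list):
        return node
    if maximizing:
        best = float("-inf")
        for child in node:
            best = max(best, alphabeta(child, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:
                break  # cutoff: the remaining siblings are never examined
        return best
    best = float("inf")
    for child in node:
        best = min(best, alphabeta(child, alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:
            break  # cutoff: which work is skipped depends on earlier results
    return best


# Every row does the same arithmetic, so rows could run in parallel.
print(dense_layer([[1, 2], [3, 4]], [1, 1]))  # [3, 7]

# The 9 in the second subtree is never visited because of a cutoff.
tree = [[3, 5], [2, 9], [0, 1]]
print(alphabeta(tree, float("-inf"), float("inf"), True))  # 3
```

The cutoffs are what make alpha-beta efficient on a CPU, and they are also why the work is hard to spread over thousands of identical arithmetic units.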

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Sat Dec 09, 2017 4:24 pm

Ras wrote:
Guenther wrote:Rasmus do you really think he will understand that after all he said before?
I've never seen Lyudmil posting derailing stuff. The main point isn't even technical understanding, it's the perspective that equates computing with doing stuff on x86-like CPUs (or ARMs or whatever). It takes a while to let it sink in that this is a completely different way of computing.

I'm also somewhat astonished about the whole discussion whether the hardware was fair or not. As if Stockfish's defeat had been the point here. We're looking at nothing less than a revolution in computing, and people discuss whether some change here and there might have given a handful of Elo more to Stockfish.

Those are 2 completely different things: revolution in computing and revolution in artificial intelligence.
You rightly state this is a revolution in computing, but why do they only stress the AI part then?

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Sat Dec 09, 2017 4:27 pm

EvgeniyZh wrote:
Lyudmil Tsvetkov wrote:
EvgeniyZh wrote:
Adam Hair wrote:
EvgeniyZh wrote: You claimed a TPU is 180 TOPS, while it is 180 TFLOPS per pod of four TPUs, thus your evaluation is 4 times higher than it should be.
From what I can find, Google seems to be calling a motherboard containing 4 ASICs a Cloud TPU, and a TPU pod is 64 Cloud TPUs.
Yeah, well, per device with 4 TPUs.

BTW, new-gen NVIDIA GPUs are claimed to have 100+ FP16 TFLOPS, which would make AlphaZero on a consumer PC a reality. 2x Titan V will match 4 TPUs; however, that needs some testing of course.
Would not SF on the very same hardware, if adapted, be still 400 elos stronger?
Why are you mentioning Alpha at all then to me?

Just tell, hey, guys, we have NEW HARDWARE.

They want to sell their intelligence, which they actually don't have.
Once again, comparable performance is available, if not for consumers, then for universities and labs. Stockfish just cannot be adapted to this new hardware. In fact, a modern GPU has around an order of magnitude more FLOPS than a CPU. However, nobody has yet ported chess engines to the GPU, because it is very hard, if possible at all. And yes, Google solved the problem; by the way, chess isn't the purpose of their not-so-new hardware.

Lyudmil Tsvetkov wrote:
BeyondCritics wrote:
cdani wrote:
It will be very interesting to know the typical depth reached by AlphaZero. I bet it was much less than Stockfish's.
You lost that bet
AlphaZero uses MCTS https://en.wikipedia.org/wiki/Monte_Carlo_tree_search. From this source https://www.arxiv-vanity.com/papers/1712.01815v1/:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf.
...
At the end of the game, the terminal position is scored according to the rules of the game to compute the game outcome -1 for a loss, 0 for a draw, and +1 for a win.
This is done for training, not for play, so it is not actually a search, but a tuning method.
Why does everyone confuse tuning with search?
They are still doing alpha-beta.
They are not doing alpha-beta.

It should not be that hard, when you have the blueprint.
Not harder than building an H-bomb, anyway.

MCTS is not a search technique, but a tuning method.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Sat Dec 09, 2017 4:34 pm

lkaufman wrote:
Lyudmil Tsvetkov wrote:
lkaufman wrote:
Lyudmil Tsvetkov wrote:
lkaufman wrote:
EvgeniyZh wrote:
Milos wrote:
clumma wrote:
Milos wrote:4 hours my ass (pardon my french).
Far fewer transistors and joules were used training AlphaZero than have been used training Stockfish. You can soon rent those TPUs on Google's cloud, or apply for free access now, so stop complaining. Furthermore it's an experimental project in early days and performance is obviously not optimal, so all the 'but-but-but 30 Elo because they used SF 8 instead of SF 8.00194' sounds really dumb.

Days of alpha-beta engines have come to an abrupt end.

-Carl
Sorry, that is a pretty childish rant.
Google is obviously comparing apples and oranges and again doing a marketing stunt, and people are falling for it.
Days of Alpha0 on normal hardware are years away. But keep on dreaming, no one can take that from you.

P.S. Just as a small comparison. leelazero open source project trying to replicate alpha0 in Go, took 1 month to get the same games as AG0 got in 3 hours, that with constant 1000 volunteers.
For chess it would take even more.
Training AlphaZero would take tons of time. Just like creating SF from 0. However, running it took 4 TPUs, which is comparable to what's available to (rich) consumers - you can get 6-8 NVIDIA V100s, which would get you similar performance.
To me this is the most informative post in the whole thread, assuming it is accurate (I know nothing about TPUs). The only reasonable comparison I can think of between the AlphaZero hardware and the Stockfish hardware is cost of equivalent machines. It doesn't matter to me how much hardware was used to reach the current level of strength for both engines, just whether the playing conditions were fair. You seem to be implying that comparable hardware to the 4 TPUs would cost no more (maybe much less?) than the sixty-four core machine used by SF. Is this correct? I'm asking to learn, not making a claim myself either way.

The other conditions were of course not "fair", but reasonable given that AlphaZero only trained for a few hours. I suppose if Stockfish used a good book, was allowed to use its time management as if the time limit were pure increment, and used the latest dev. version, the match would have been much closer, but probably (judging by the infinite win to loss ratio and the actual games) SF would have still lost. The games were amazing.

Bottom line, assuming the comparable cost claim is accurate: If Google wants to optimize the software for a few weeks and sell it, rent it, or give it away, we have a revolution in computer chess. But my guess is that they won't do this, in which case the revolution may be delayed a couple years or so.
Larry, what kind of revolution, this is 30/1 hardware advantage.
Alpha is currently at 2850 level.
Based on the estimated $60k price for equivalent hardware vs. maybe $20k for the 64 core SF machine (my guess) it would be 3 to 1, not 30. The actual hardware used by Alpha would be useless for SF, so you can't compare the hardware any other way than by price, I think. It sounds like the cost of the type of hardware needed for Alpha is expected to plummet while the cost of normal CPUs just trends slightly lower. We already see the same thing in GO. Leela (top GO engine for pc, and free like SF) plays a stone or so stronger with even a cheap GPU than without one. So while Alpha might only play at 2850 level on your laptop, it might be super-strong in a year or so on something many people could afford. But if Google doesn't release it, that won't happen.
Calculating performance by price? Seems like a very unscientific approach.
Maybe 60k is the price for just one TPU, while it used 4.
Why do you leave SMP inefficiencies aside?
What about the much bigger memory available to Alpha?
What about the book it had based on human games?
What about the fixed TC?
And the hash?

It is not a single thing, but 10 things that favour Alpha, so no one can convince me the real advantage was lower than 30/1.

No, they won't release it, because they will not make much more progress in the future.

And all those, who will be able to buy such hardware, will certainly not be chess players or even programmers, so how could that benefit chess or programming in any way?
I agree that the test was far from fair; another issue that hurt SF was that it was apparently run on 64 threads on a 32 core machine, but it would play much better running just 32 threads on that machine, since there is almost no SMP benefit to offset the roughly 5 to 3 slowdown. Also the cost ratio was more than I said; I was thinking it was a 64 core machine, but a 32 core machine costs about half what I guessed. So maybe all in all 30 to 1 isn't so bad an estimate. On the other hand I think you underestimate how much it could be further improved.

As for usefulness, it is probable that suitable hardware (maybe not fully equal, but reasonably close) for AlphaZero will be available for just a few thousand dollars in the near future, maybe no more than an i9 machine costs now. So I think it would be a big deal if Google released it after further tuning, but that's not likely I think.

Milos estimated the price of 4TPUs currently at over $250 000, and I would be surprised if it costs less.
Well, we will talk in a year's time again, to see how much they have advanced.
So far, in 70 years of computer chess development, the only paradigm for improvement has been, excluding hardware, more sophisticated search and more sophisticated eval. What makes you think this will change in the future?
From a human standpoint, a person who has a more refined evaluation and a more sophisticated search should also perform better.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Sat Dec 09, 2017 4:37 pm

Uri Blass wrote:
Lyudmil Tsvetkov wrote:
tmokonen wrote:
Lyudmil Tsvetkov wrote:Eelco, how can you buy into the SCAM too?
The hardware advantage was 50/1.
It plays 1850-elo chess on a single core.

Above diagram is already way way won for black; Stockfish blundered already in the opening with Nce5, this is already lost.

Alpha beating me? Gosh, I will shred it to pieces.
It understands absolutely nothing of closed positions, no such were encountered in the sample.

It is all about the hardware, 2 or 3 beautiful games, with the d4-e5-f6 chain outperforming a whole black minor piece, one great attack on the bare SF king and one more, all the rest is just exceeding computations.
Nothing special about its eval.
Just another crappy, misinformed, false bravado post from a guy who quit his job to pursue the quixotic dream of finding the perfect old-school end point evaluation for an alpha beta searcher. He can't accept the fact that his years of painstaking effort have been rendered moot by a project that was just a "meh, let's spend a few hours and see what happens" lark by the team that already conquered Go, a much more complex game than chess.
http://davidsmerdon.com/?p=1970

Go is much simpler than chess, Go's evaluation patterns are exponentially fewer(1000/1) than those in chess.
So you are really very bad at basic knowledge and etiquette, lad.

Alpha is 1850 currently, and will stay like that, a weak engine running on tremendous hardware.

My project will still conquer the world.
An 1850 engine cannot beat Stockfish regardless of hardware.

Even if you assume 100 Elo per doubling, you need to be 1000 times faster just to reach 2850, which is clearly weaker than the top programs, and Alpha certainly did not have a 1000:1 hardware advantage.

Was not Deep Blue a 2600 engine?

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Sat Dec 09, 2017 4:41 pm

IanO wrote:
Lyudmil Tsvetkov wrote:
BeyondCritics wrote:
cdani wrote:
It will be very interesting to know the typical depth reached by AlphaZero. I bet it was much less than Stockfish's.
You lost that bet
AlphaZero uses MCTS https://en.wikipedia.org/wiki/Monte_Carlo_tree_search. From this source https://www.arxiv-vanity.com/papers/1712.01815v1/:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf.
...
At the end of the game, the terminal position is scored according to the rules of the game to compute the game outcome -1 for a loss, 0 for a draw, and +1 for a win.
This is done for training, not for play, so it is not actually a search, but a tuning method.
Why does everyone confuse tuning with search?
They are still doing alpha-beta.
Wrong. Unless stated otherwise, the player is identical to the trainer. That is one of the reasons this is such an interesting advance, both a new eval and tuning method and a previously discarded search method.

Your statement is just based on their claims.
Could you explain this MCTS in a bit more detail to me, so I can tell you why it is simple alpha-beta?
Alpha-beta is synonymous with picking the best move; an approach that does not pick the best move is simply impossible theoretically.
Again, why don't you think a bit: 2800 elo on single core, what kind of a breakthrough is that?
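
Since the post above asks how MCTS works, here is a minimal sketch, assuming plain UCT with random playouts on a toy take-away game (take 1 to 3 stones, whoever takes the last stone wins). This is not AlphaZero's variant, which guides the search with a neural network instead of random playouts; the names `Node`, `uct_child`, `rollout`, and `mcts_best_move` are hypothetical. It shows the four steps being run at move-selection time, with no alpha-beta bounds involved.

```python
import math
import random


class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones   # stones left; the player to move takes 1 to 3
        self.parent = parent
        self.move = move       # move that led from the parent to this node
        self.children = []
        self.untried = [m for m in (1, 2, 3) if m <= stones]
        self.visits = 0
        self.wins = 0.0        # wins for the player who moved INTO this node


def uct_child(node, c=1.4):
    """Pick the child with the best mean result plus exploration bonus."""
    return max(node.children,
               key=lambda ch: ch.wins / ch.visits
               + c * math.sqrt(math.log(node.visits) / ch.visits))


def rollout(stones, rng):
    """Play random moves; return True if the player to move at the start wins."""
    starter_to_move = True
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return starter_to_move  # whoever just took the last stone wins
        starter_to_move = not starter_to_move
    return False  # no stones left at the start: the previous mover already won


def mcts_best_move(stones, iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: walk down while the node is fully expanded.
        while not node.untried and node.children:
            node = uct_child(node)
        # 2. Expansion: add one untried move as a new child.
        if node.untried:
            move = node.untried.pop()
            child = Node(node.stones - move, parent=node, move=move)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout from the new position.
        won = not rollout(node.stones, rng)  # from the view of who moved into node
        # 4. Backpropagation: flip the result at each level on the way up.
        while node is not None:
            node.visits += 1
            node.wins += 1.0 if won else 0.0
            won = not won
            node = node.parent
    # The move actually played is the most-visited root child.
    return max(root.children, key=lambda ch: ch.visits).move


print(mcts_best_move(5))
```

With 5 stones the sketch should settle on taking 1, leaving a multiple of 4, which is the known winning strategy for this toy game. The tree grows unevenly toward promising moves by sampling and averaging, rather than by pruning with alpha and beta bounds.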

Uri Blass · Post by **Uri Blass** » Sat Dec 09, 2017 5:54 pm

Ras wrote:
Lyudmil Tsvetkov wrote:Would not SF on the very same hardware, if adapted, be still 400 elos stronger?
No, it wouldn't, for the same reason that Stockfish doesn't harness the power of available GPUs. These TPUs are quite a different design than CPUs. TPUs are a bit like GPUs modified in hardware to be more efficient in neural networks instead of graphics.

The big thing here is that they have a truckload of simple modules that can perform identical operations on different data at the same time. Perfect for neural networks - and for graphics, which is why graphics cards have long been used for neural network purposes.

By contrast, conventional CPUs are good at performing different and complex operations on input data at a high rate. That is good for if/then/else-branching, which chess engines essentially do.

Software that was designed for CPUs cannot take advantage of TPUs. The TPU hardware would be useless for Stockfish. The other way round is also not that promising: trying to run a neural network on a CPU doesn't yield performant results.

Is there a problem with emulating AlphaZero on a CPU (the same algorithm on a CPU)?

I do not care if it is going to be 10 times slower; most engines cannot beat Stockfish even with a 10:1 time handicap.

Uri Blass · Post by **Uri Blass** » Sat Dec 09, 2017 5:57 pm

Lyudmil Tsvetkov wrote:
Uri Blass wrote:
Lyudmil Tsvetkov wrote:
tmokonen wrote:
Lyudmil Tsvetkov wrote:Eelco, how can you buy into the SCAM too?
The hardware advantage was 50/1.
It plays 1850-elo chess on a single core.

Above diagram is already way way won for black; Stockfish blundered already in the opening with Nce5, this is already lost.

Alpha beating me? Gosh, I will shred it to pieces.
It understands absolutely nothing of closed positions, no such were encountered in the sample.

It is all about the hardware, 2 or 3 beautiful games, with the d4-e5-f6 chain outperforming a whole black minor piece, one great attack on the bare SF king and one more, all the rest is just exceeding computations.
Nothing special about its eval.
Just another crappy, misinformed, false bravado post from a guy who quit his job to pursue the quixotic dream of finding the perfect old-school end point evaluation for an alpha beta searcher. He can't accept the fact that his years of painstaking effort have been rendered moot by a project that was just a "meh, let's spend a few hours and see what happens" lark by the team that already conquered Go, a much more complex game than chess.
http://davidsmerdon.com/?p=1970

Go is much simpler than chess, Go's evaluation patterns are exponentially fewer(1000/1) than those in chess.
So you are really very bad at basic knowledge and etiquette, lad.

Alpha is 1850 currently, and will stay like that, a weak engine running on tremendous hardware.

My project will still conquer the world.
An 1850 engine cannot beat Stockfish regardless of hardware.

Even if you assume 100 Elo per doubling, you need to be 1000 times faster just to reach 2850, which is clearly weaker than the top programs, and Alpha certainly did not have a 1000:1 hardware advantage.
Was not Deep Blue a 2600 engine?

Deep Blue was special hardware designed for chess, so it was not an engine.

I believe that Deep Blue, at 200,000,000 nodes per second, would lose today against Stockfish even if Stockfish ran on very slow hardware and searched only 200,000 nodes per second.
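
The rule of thumb used in this exchange can be written as gain = k * log2(speed ratio). A minimal sketch, assuming a constant 100 Elo per doubling as in the quoted post (in practice the gain per doubling is not constant); `elo_gain` and `speed_ratio_needed` are hypothetical helper names.

```python
import math


def elo_gain(speed_ratio, elo_per_doubling=100.0):
    """Elo gained from being `speed_ratio` times faster, under a fixed gain per doubling."""
    return elo_per_doubling * math.log2(speed_ratio)


def speed_ratio_needed(elo_gap, elo_per_doubling=100.0):
    """Speed advantage needed to close `elo_gap` under the same assumption."""
    return 2 ** (elo_gap / elo_per_doubling)


print(round(elo_gain(1000)))       # 997
print(speed_ratio_needed(1000))    # 1024.0
```

Under this assumption a 1000:1 speed advantage is worth about 997 Elo, so an 1850 engine would land near 2850. The 200,000,000 versus 200,000 nodes-per-second comparison above is the same 1000:1 ratio.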

Jesse Gersenson · Post by **Jesse Gersenson** » Sat Dec 09, 2017 6:04 pm

Uri Blass wrote:[snip]Alpha certainly did not have 1000:1 hardware advantage.[/snip]

What are you basing that on?

A dual-socket 16-core Xeon E5-26xx v4 is about 1-1.5 teraflops.
1 TPU is 180 teraflops. AlphaZero used 4 TPUs and a CPU.

Regardless, a stunning accomplishment - a sharp turn towards the future...there will be many similar turns in the coming years.
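
The ratio implied by these figures can be worked out directly. A rough sketch, assuming peak teraflops are comparable across chips (they are not a measure of playing strength) and using only numbers quoted in this thread; `ratio_range` is a hypothetical helper. It evaluates both readings that appear in the thread: 180 teraflops per TPU, so 720 for four, and EvgeniyZh's reading of 180 teraflops for the whole four-TPU device.

```python
def ratio_range(accel_tflops, cpu_low=1.0, cpu_high=1.5):
    """Return (min, max) raw throughput ratio of accelerator to CPU."""
    return accel_tflops / cpu_high, accel_tflops / cpu_low


# Reading 1: 180 TFLOPS per TPU, four TPUs.
print(ratio_range(4 * 180))  # (480.0, 720.0)

# Reading 2: 180 TFLOPS for the four-TPU device as a whole.
print(ratio_range(180))      # (120.0, 180.0)
```

Either way the raw arithmetic ratio is in the hundreds, not 1000:1, and it says nothing about how much of that throughput a chess search could actually use.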

Ras · Post by **Ras** » Sat Dec 09, 2017 6:14 pm

Lyudmil Tsvetkov wrote:That is why I said 'adapted'.

Only that "adapted" would mean "complete rewrite", and then it wouldn't be Stockfish anymore. The revolution in computing is both having the self-learning software framework and the hardware to meaningfully run it.
