In chess, AlphaZero outperformed Stockfish after just 4 hours

From the document - In chess, AlphaZero outperformed Stockfish after just 4 hours. How believable is that?

I believe it as written: 37 votes (54%)
I am sceptical: 21 votes (30%)
I don't (can't) believe it: 8 votes (12%)
I am undecided: 3 votes (4%)

Total votes: 69

Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

In chess, AlphaZero outperformed Stockfish after just 4 hours

Post by Rebel »

I don't care that SF lost; that is totally irrelevant in light of the huge claim by DeepMind: the alleged 4 hours of self-play and, quoting the document again, "without any additional domain knowledge except the rules of the game".

Have you let what is stated here sink in yet?

No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?

What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6 etc., and how would that lead to anything for the second self-play game?

And so I voted for option 3.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Milos »

Rebel wrote:I don't care that SF lost; that is totally irrelevant in light of the huge claim by DeepMind: the alleged 4 hours of self-play and, quoting the document again, "without any additional domain knowledge except the rules of the game".

Have you let what is stated here sink in yet?

No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?

What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6 etc., and how would that lead to anything for the second self-play game?

And so I voted for option 3.
OK, I'll repeat it, because you don't seem to have read it.
The 4 hours of training time is more PR BS from Google.
That is 4h on 5000 TPUs, where each TPU is roughly equivalent to 2 new GV100s or 10 1080 Tis, currently the top-of-the-range graphics card a normal individual can afford. So those 4h of training time are like over 30 years of training on a single 1080 Ti. I really wonder why they used only 5000 TPUs; they could have used 50000 instead and claimed only 24 minutes of training time. Or does Google actually own only 5000 TPUs in total, and they used all the available TPU resources of the 500-billion-dollar company to train a chess engine?
My guess is that they actually used far fewer than 5000 TPUs and much more time, and simply extrapolated the training time to what it would have been had they used all of their existing TPUs, just to make an impressive statement. They obviously succeeded, judging by the wrong impression you got from it.
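For anyone who wants to check the conversion, here is a rough back-of-the-envelope sketch in Python. It takes the figures from the post at face value (5000 TPUs, one TPU counted as 10 1080 Ti cards, 4 hours of wall-clock time) and assumes perfectly linear scaling, which is an idealization. The function names are made up for illustration. With exactly these inputs the result is roughly 23 years, somewhat below the "over 30 years" quoted but of the same order of magnitude.

```python
HOURS_PER_YEAR = 24 * 365


def single_gpu_years(num_tpus, gpus_per_tpu, wall_clock_hours):
    """Convert a parallel run into years on one GPU (assumes linear scaling)."""
    gpu_hours = num_tpus * gpus_per_tpu * wall_clock_hours
    return gpu_hours / HOURS_PER_YEAR


def rescaled_minutes(wall_clock_hours, tpus_used, tpus_hypothetical):
    """Wall-clock minutes if the same work were spread over more TPUs."""
    return wall_clock_hours * 60 * tpus_used / tpus_hypothetical


if __name__ == "__main__":
    # Figures as stated in the post: 5000 TPUs, 10 GPUs per TPU, 4 hours.
    print(f"{single_gpu_years(5000, 10, 4):.1f} years on one 1080 Ti")
    # The "50000 TPUs" scenario from the post.
    print(f"{rescaled_minutes(4, 5000, 50000):.0f} minutes on 50000 TPUs")
```

The second function reproduces the 24-minute figure for the hypothetical 50000-TPU run.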
Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Michael Sherwin »

There are so many contradictory claims in the pre-release paper. No domain-specific knowledge, no human games database, only self-play. And yet they trained against the most common positions in a human database, at 100 games per position, and they did it against SF. And we are supposed to believe it was after the main 100-game match. For this press release, why mention training games supposedly played after the main match? What purpose does it serve except to cloud the issue and cast doubt? In RomiChess it was demonstrated many times over the years that Romi could win 100-game matches against the top engines when each game started from the same position. And here they are doing the exact same thing. But of course that was after the reported match, when it could serve no purpose as far as that match was concerned. It just looks fishy! :lol:

And then there is MC or no MC, or MC in training but no MC in the playing algorithm. Were the self-play games MC? If so, how many simulations did they run per move? Were the MC simulations really random, or did the NN decide which move to play? If it was the NN, then it was not MC, because MC by definition is random. There are so many variables left unspecified that we have no idea what the hell they did.
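On the "random or NN" question: as far as the AlphaGo Zero and AlphaZero papers describe it, the tree search does not use random playouts. Instead, each step inside the tree picks the move that maximizes a PUCT-style score combining the network's prior with the average backed-up value. Below is a minimal Python sketch of that selection rule only. The `Child` class, the `c_puct` value of 1.5 and all the statistics are made up for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class Child:
    move: str
    prior: float            # P(s, a): network's prior probability for the move
    visits: int = 0         # N(s, a): how often the search tried the move
    value_sum: float = 0.0  # sum of backed-up evaluations for the move

    @property
    def q(self):
        """Mean backed-up value Q(s, a); 0.0 for an unvisited move."""
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(child, parent_visits, c_puct=1.5):
    """Q(s, a) plus an exploration bonus that shrinks as visits grow."""
    bonus = c_puct * child.prior * math.sqrt(parent_visits) / (1 + child.visits)
    return child.q + bonus


def select_child(children, c_puct=1.5):
    """Deterministically pick the child with the highest PUCT score."""
    parent_visits = sum(c.visits for c in children)
    return max(children, key=lambda c: puct_score(c, parent_visits, c_puct))


if __name__ == "__main__":
    children = [
        Child("e4", prior=0.5, visits=10, value_sum=6.0),
        Child("d4", prior=0.3),
        Child("a3", prior=0.2, visits=2, value_sum=0.2),
    ]
    print(select_child(children).move)
```

Nothing in this rule draws a random number, so whether it still deserves the name "Monte Carlo" is a fair terminology question.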
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Ozymandias »

The training phase... didn't it consist of 44 million games or something like that? If that's the case, I don't see how they could be played in just four hours.
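A quick way to see what that would require is to compute the throughput. The sketch below assumes the 44 million figure as given above, the 4-hour window, and the 5000 devices mentioned earlier in the thread, with the games spread evenly over devices. All of these are assumptions taken from the thread, not verified numbers, and the function names are made up for illustration.

```python
def games_per_second(total_games, hours):
    """Overall self-play throughput needed to finish in the given time."""
    return total_games / (hours * 3600)


def seconds_per_game_per_device(total_games, hours, devices):
    """Time budget for one game on one device, assuming an even split."""
    return devices * hours * 3600 / total_games


if __name__ == "__main__":
    total_games = 44_000_000  # "44 million games or something like that"
    hours = 4                 # claimed training time
    devices = 5000            # TPU count quoted earlier in the thread

    print(f"{games_per_second(total_games, hours):.1f} games/s overall")
    budget = seconds_per_game_per_device(total_games, hours, devices)
    print(f"{budget:.2f} s per game per device")
```

Under these assumptions each device would have to finish a complete game roughly every 1.6 seconds.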
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Henk »

The current Skipper implementation needs about two seconds to play a random game, so there is much work to do. Also, a forward pass through a neural network is not that cheap, and it is difficult to optimize.

Evaluation used to be the bottleneck. Now it is the neural network.
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Vinvin »

Rebel wrote:...
What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6 etc., and how would that lead to anything for the second self-play game?
...
Sure, the first games are pretty random. Then good results come from a good setup; you keep that setup, then find an even better setup because of the better results, and loop this a lot of times ...
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Rebel »

Milos wrote:
Rebel wrote:I don't care that SF lost; that is totally irrelevant in light of the huge claim by DeepMind: the alleged 4 hours of self-play and, quoting the document again, "without any additional domain knowledge except the rules of the game".

Have you let what is stated here sink in yet?

No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?

What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6 etc., and how would that lead to anything for the second self-play game?

And so I voted for option 3.
OK, I'll repeat it, because you don't seem to have read it.
The 4 hours of training time is more PR BS from Google.
That is 4h on 5000 TPUs, where each TPU is roughly equivalent to 2 new GV100s or 10 1080 Tis, currently the top-of-the-range graphics card a normal individual can afford. So those 4h of training time are like over 30 years of training on a single 1080 Ti. I really wonder why they used only 5000 TPUs; they could have used 50000 instead and claimed only 24 minutes of training time. Or does Google actually own only 5000 TPUs in total, and they used all the available TPU resources of the 500-billion-dollar company to train a chess engine?
My guess is that they actually used far fewer than 5000 TPUs and much more time, and simply extrapolated the training time to what it would have been had they used all of their existing TPUs, just to make an impressive statement. They obviously succeeded, judging by the wrong impression you got from it.
It's not about that, it's about the self-play from scratch:

No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?

What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6 etc., and how would that lead to anything for the second self-play game?


Not even a million years could do the job.
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Rebel »

Vinvin wrote:
Rebel wrote:...
What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6 etc., and how would that lead to anything for the second self-play game?
...
Sure, the first games are pretty random. Then good results come from a good setup; you keep that setup, then find an even better setup because of the better results, and loop this a lot of times ...
Hint, have you considered why there is no 8-man TB yet?
Henk
Posts: 7216
Joined: Mon May 27, 2013 10:31 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Henk »

Probably, when many simulations are run, endgame positions get evaluated correctly first.
So in self-play the player that evaluates the endgame best wins. And once the endgame is evaluated correctly, the next stage is the phase just before the endgame, and so forth until the opening stage is reached.

Each time the neural network improves, so the games get less random.
Last edited by Henk on Mon Dec 18, 2017 1:02 pm, edited 1 time in total.
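The bootstrapping described in the post above (positions near the end get evaluated correctly first, and that knowledge then spreads backwards) can be tried on a toy game. The sketch below is not chess and not AlphaZero's method: it swaps in a tiny Nim variant (take 1 to 3 stones, whoever takes the last stone wins) and a plain lookup table in place of a neural network and tree search. The function names, the game and the parameters (`episodes`, `epsilon`, `max_stones`) are all hypothetical choices for illustration. The only knowledge given is the legal moves and who won.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    """The only 'domain knowledge': you may take 1, 2 or 3 stones."""
    return [m for m in (1, 2, 3) if m <= stones]


def train(episodes=20000, max_stones=10, epsilon=0.2, seed=0):
    """Tabular self-play; q[(stones, move)] is the mover's mean result."""
    rng = random.Random(seed)
    q = defaultdict(float)     # every unseen move starts at 0.0
    visits = defaultdict(int)
    for _ in range(episodes):
        stones = rng.randint(1, max_stones)
        history = []           # (stones, move) pairs, players alternating
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < epsilon:
                move = rng.choice(moves)  # occasional exploration
            else:
                move = max(moves, key=lambda m: q[(stones, m)])
            history.append((stones, move))
            stones -= move
        result = 1.0           # the player who took the last stone won
        for key in reversed(history):
            visits[key] += 1
            q[key] += (result - q[key]) / visits[key]  # running mean
            result = -result   # flip sign for the other player's moves
    return q


def best_move(q, stones):
    """Greedy move according to the learned table."""
    return max(legal_moves(stones), key=lambda m: q[(stones, m)])


if __name__ == "__main__":
    table = train()
    for stones in range(1, 8):
        print(stones, "stones -> take", best_move(table, stones))
```

In this toy the positions with 1 to 3 stones are settled almost immediately, and the values for larger piles only become meaningful once the smaller ones are right, which is the backwards progression the post describes. Whether the same thing scales to chess in a given amount of time is a separate question that this sketch does not address.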
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Ovyron »

Rebel wrote:No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?
Yup, I think true Artificial Intelligence has finally arrived, and it can do things like this and others that I would have never imagined to be possible.

Some examples of similar AIs:

AI can extract the style of a photo and turn another photo into that style
AI can learn how to make paintings of any artist of history and use any image to show how that artist would have painted it.
AI takes text as input and creates new photorealistic images indistinguishable from actual photos.
AI learns how human lips move when talking, so it can sync a video of anybody to any audio of someone talking.
AI learns what celebrities look like and can invent new, fake faces that look real.
AI learns what art looks like, so it can turn your doodles into works of art.
AI learns how video works, so it can predict the future and create videos from still images
AI learns how images become pixelated when you scale them down and manages to reverse the process, turning pixelated messes into High Resolution images.
AI learns how visual expressions work and can swap the expressions of two people.
AI can turn your sketches into photo realistic images.
AI learns how to play non-deterministic video games just like humans.

And a lot more things...

Frankly, I find some of these much more impressive than a chess engine with 3500 Elo, which we knew was coming eventually.

Coming from the AI field, I can say I find nothing strange about AI learning things from scratch: you just teach it what it can do (say, the rules of chess) and the output you want (say, winning the game), and the AI learns a way to do it.

I expect that soon you will be able to invent new games, teach your AI the rules, and soon get the equivalent of 3500 Elo in that game. I haven't seen A0 lose a single chess game, so who knows whether it already plays perfectly.

We're still early on this, though. In the future we might have AIs able to learn to write books with useful info, or write the next part of a trilogy written by a human. What about an AI that can generate movies? You could feed it all the Disney classics before Toy Story, and it could output a brand new classic that some other AI can't tell apart from the originals.

Down the road, this chess playing AI will look like peanuts.
Last edited by Ovyron on Mon Dec 18, 2017 1:06 pm, edited 1 time in total.