In chess, AlphaZero outperformed Stockfish after just 4 hours

From the document - In chess, AlphaZero outperformed Stockfish after just 4 hours. How believable is that?

I believe it as written: 37 votes (54%)
I am sceptic: 21 votes (30%)
I don't (can't) believe it: 8 votes (12%)
I am undecided: 3 votes (4%)

Total votes: 69

Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Henk »

Is there already an AI program generating pop music? For instance, has it created a no. 1 hit?

Lately I found a video where someone was trying to make a neural network play Bach. But the task wasn't that easy. Sometimes it played fantastically, but most of the time horribly. And before it reached that stage, it sounded jazzy.
Last edited by Henk on Mon Dec 18, 2017 1:11 pm, edited 2 times in total.
Ovyron
Posts: 4556
Joined: Tue Jul 03, 2007 4:30 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Ovyron »

Henk wrote: Is there already an AI program generating pop music? For instance, has it created a no. 1 hit?
Not yet. But there's an AI that generates new music in the style of Bach, and unless you're familiar with Bach's work, you may not be able to tell a real piece from a fake one.
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by kranium »

Rebel wrote: I don't care that SF lost; it's totally irrelevant in the light of the huge claim by DeepMind, the alleged 4 hours of self-play, quoting the document again: without any additional domain knowledge except the rules of the game.

Have you already let it sink in what is stated here?

No mobility, no king safety, no passed pawn evaluation, no castling knowledge, not even piece values?

What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6, etc.? And how would that lead to anything for the second self-play game?

And so I voted for option 3.
Monte Carlo search does not use a traditional eval as we know it, so mobility, king safety, etc. are irrelevant.

It uses a struct to hold info such as wins, losses, draws, win %, etc.,
then simply references the accumulated data for the current position to select the move with the highest probability of winning.

Ivanhoe has a Montecarlo search implementation (with which I'm fairly familiar) and it works quite well.

The default implementation uses a sort of 'searchmoves' algorithm:
go montecarlo cpus 8 min -25 max 325 length 40 depth 10 moves c2c4 d2d4 e2e4 g1f3

Years ago I experimented with a version that would obtain the root move list from the current position and actually play a strong game.
If you send it all 20 possible moves from the traditional start position, you'd be amazed how quickly the potential move choices are narrowed down... and it usually plays 1. c4 or 1. e4.
I still have it if anyone is interested (but it does crash once in a while).
Last edited by kranium on Mon Dec 18, 2017 1:12 pm, edited 2 times in total.
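
For anyone who wants to see the idea in the post above in code, here is a minimal Python sketch. It is not Ivanhoe's implementation: `MoveStats`, `monte_carlo_pick`, and the `playout` callable are hypothetical names, and counting a draw as half a win is an assumption. `playout(move, rng)` stands for "play the root move, finish the game with fast random play, return +1, 0 or -1 from the mover's point of view".

```python
import random
from dataclasses import dataclass


@dataclass
class MoveStats:
    """Accumulated playout results for one root move."""
    wins: int = 0
    losses: int = 0
    draws: int = 0

    @property
    def games(self) -> int:
        return self.wins + self.losses + self.draws

    @property
    def win_pct(self) -> float:
        # Assumption: a draw counts as half a win.
        if self.games == 0:
            return 0.0
        return (self.wins + 0.5 * self.draws) / self.games

    def record(self, result: int) -> None:
        # result is +1 (win), 0 (draw) or -1 (loss) for the side making the move.
        if result > 0:
            self.wins += 1
        elif result < 0:
            self.losses += 1
        else:
            self.draws += 1


def monte_carlo_pick(root_moves, playout, n_playouts, rng):
    """Run n_playouts per root move and return (best move, all stats)."""
    stats = {move: MoveStats() for move in root_moves}
    for _ in range(n_playouts):
        for move in root_moves:
            stats[move].record(playout(move, rng))
    best = max(root_moves, key=lambda move: stats[move].win_pct)
    return best, stats
```

Restricting `root_moves` to a few candidates corresponds to the `moves c2c4 d2d4 e2e4 g1f3` part of the command line quoted above; passing all 20 legal first moves corresponds to the experiment described in the post.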
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Henk »

OK, then somebody else has already succeeded.
kranium
Posts: 2129
Joined: Thu May 29, 2008 10:43 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by kranium »

Ozymandias wrote:The training phase... didn't it consist of 44 million games or something like that? If that's the case, I don't see how they could be played in just four hours.
Like Milos said:

"4h on 5000TPUs where each TPU is equivalent to roughly 2 new GV100 or 10 1080Ti which is currently the top of the range graphics card normal individuals can afford. So those 4h of training time is like over 30 years of training on 1080Ti."

This is an enormous resource... self-play usually involves lightning games, sometimes as fast as 1 sec + 1 ms inc.
Just do the math and one can see how it's possible.
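
The math, spelled out as a rough Python sketch. The inputs are the figures quoted in this thread (about 44 million games, 5000 TPUs, 4 hours, 1 TPU taken as roughly 10 1080Ti cards); treating each TPU as playing one game at a time is an assumption made only to get a per-game time.

```python
GAMES = 44_000_000   # "44 million games or something like that", as quoted above
TPUS = 5000          # from the quoted post
HOURS = 4
GPUS_PER_TPU = 10    # quoted rough equivalence: 1 TPU ~ 10 x 1080Ti

# Throughput each TPU would have to sustain.
games_per_tpu_hour = GAMES / (TPUS * HOURS)
# Assumption: one game at a time per TPU.
seconds_per_game = 3600 / games_per_tpu_hour

# The same compute expressed as time on a single 1080Ti.
gpu_hours = TPUS * GPUS_PER_TPU * HOURS
gpu_years = gpu_hours / (24 * 365)

print(f"{games_per_tpu_hour:.0f} games per TPU per hour")
print(f"{seconds_per_game:.2f} seconds per game")
print(f"{gpu_hours} 1080Ti-hours, about {gpu_years:.1f} years on one card")
```

That works out to 2200 games per TPU per hour, or one game roughly every 1.6 seconds, which is in the range of the lightning time controls mentioned above. With the quoted 10:1 ratio, the single-card figure comes to about 23 years, the same order of magnitude as the number in the quote.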
Henk
Posts: 7218
Joined: Mon May 27, 2013 10:31 am

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Henk »

One main problem is getting enough training examples. If you need a great many training examples but can't create them automatically, then it becomes almost impossible to solve the problem.
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Vinvin »

Rebel wrote:
Vinvin wrote:
Rebel wrote:...
What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6, etc.? And how would that lead to anything for the second self-play game?
...
Sure, the first games are pretty random. Then some setups give good results; those setups are kept, even better setups are found because of the better results, and this loop repeats many times...
Hint, have you considered why there is no 8-man TB yet?
If I understand you correctly, you mean "because they are too big"?
TBs store exact values, but neural networks store "shapes" (they sometimes work great and sometimes work badly).
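
The loop described in the reply above (random first games, keep what scored well, repeat many times) can be sketched on a toy game. This is not AlphaZero's method, which trains a neural network guided by tree search. It is a tabular stand-in, and everything in it is an assumption for illustration: the game is a small Nim variant (take 1 to 3 stones, whoever takes the last stone wins), and `choose`, `self_play_game`, and `train` are hypothetical names.

```python
import random
from collections import defaultdict


def legal_moves(stones):
    """Toy Nim variant: take 1-3 stones; taking the last stone wins."""
    return [m for m in (1, 2, 3) if m <= stones]


def choose(stats, stones, rng, explore):
    """Random move with probability `explore`, else best observed win rate."""
    moves = legal_moves(stones)
    if rng.random() < explore:
        return rng.choice(moves)

    def win_rate(move):
        wins, visits = stats[(stones, move)]
        return wins / visits if visits else 0.5  # untried moves count as a coin flip

    return max(moves, key=win_rate)


def self_play_game(stats, start, rng, explore):
    """Play one game against itself and fold the result back into the table."""
    history = {0: [], 1: []}
    stones, player = start, 0
    while stones > 0:
        move = choose(stats, stones, rng, explore)
        history[player].append((stones, move))
        stones -= move
        player = 1 - player
    winner = 1 - player  # the side that just took the last stone
    for side, played in history.items():
        for key in played:
            stats[key][1] += 1          # one more visit
            if side == winner:
                stats[key][0] += 1      # one more win


def train(games, start=10, explore=0.2, seed=0):
    """Run `games` self-play games; returns (stones, move) -> [wins, visits]."""
    rng = random.Random(seed)
    stats = defaultdict(lambda: [0, 0])
    for _ in range(games):
        self_play_game(stats, start, rng, explore)
    return stats
```

With an empty table every move looks like a coin flip, so the first games are close to arbitrary. Each finished game shifts the win rates, and later games lean on them through `choose`. After `stats = train(2000)`, calling `choose(stats, 10, random.Random(1), explore=0.0)` plays the move with the best accumulated record.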
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by yurikvelo »

Please clarify something about A0.

Can it analyze an arbitrary FEN position, or is its learn-tree based on games of strong engines?
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: In chess,AlphaZero outperformed Stockfish after just 4 h

Post by Rebel »

Vinvin wrote:
Rebel wrote:
Vinvin wrote:
Rebel wrote:...
What would that first self-play game look like? Something like 1.a3 a6 2.a4 a5 3.b3 b6, etc.? And how would that lead to anything for the second self-play game?
...
Sure, the first games are pretty random. Then some setups give good results; those setups are kept, even better setups are found because of the better results, and this loop repeats many times...
Hint, have you considered why there is no 8-man TB yet?
If I understand you correctly, you mean "because they are too big"?
TBs store exact values, but neural networks store "shapes" (they sometimes work great and sometimes work badly).
Yep, size.

https://en.wikipedia.org/wiki/Shannon_number

https://en.wikipedia.org/wiki/Solving_chess
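
To put a rough number on the size argument, here is a small Python sketch. It is a deliberately crude count, not how tablebase sizes are actually computed: `placements` is a hypothetical helper that counts ordered placements of n distinct men on 64 squares for a single material configuration. It ignores symmetry, illegal positions, side to move, and the growing number of material combinations, so only the growth rate is meaningful.

```python
import math


def placements(men: int) -> int:
    """Ways to put `men` distinct pieces on distinct squares of a 64-square board."""
    return math.perm(64, men)


for men in range(5, 9):
    print(f"{men} men: {placements(men):.3e} placements")

# Each extra man multiplies the count by the number of squares still free.
growth_7_to_8 = placements(8) // placements(7)
print(f"7 -> 8 men: x{growth_7_to_8}")
```

Going from 7 to 8 men multiplies this count by 57, before even counting the additional material combinations that an extra man brings.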