Maybe this thread should be merged to:
http://www.talkchess.com/forum/viewtopi ... w=&start=0
As the topic is the same.
AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
-
- Posts: 1010
- Joined: Thu Sep 01, 2011 2:49 pm
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
kranium wrote:
> Lyudmil Tsvetkov wrote:
> > - Alpha had considerable hardware advantage
> > - SF played with version 8
> > - what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
> As Daniel explains: no hard-coded evaluation (software)... its game play is based on learning (experience) from previous self-play games applied to a neural network
> 5,000 first-generation TPUs to generate self-play games
> and 64 second-generation TPUs to train the neural networks
> The hardware advantage is not such an important factor during gameplay as one would imagine.
The hardware advantage is 16/1, so simply ridiculous.
I could not accept such a test in any way.
But then, what have Google done right?
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
kranium wrote:
> Table S4: Evaluation speed (positions/second) of AlphaZero, Stockfish, and Elmo in chess, shogi and Go.
> Program    Chess    Shogi    Go
> AlphaZero  80k      40k      16k
> Stockfish  70,000k
> Elmo                35,000k
This makes no sense at all.
Alpha 1000 times slower?
This would mean 1000 times more evaluation terms in its code!
What kind of terms?
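For reference, the "1000 times" figure can be checked against the chess column of the quoted table. The snippet below is only a back-of-the-envelope sketch; the constant and function names are made up for illustration, and the ratio says nothing by itself about how many "terms" either program evaluates per position.

```python
# Chess figures from Table S4 as quoted in this thread (positions per second).
ALPHAZERO_POSITIONS_PER_SEC = 80_000        # "80k"
STOCKFISH_POSITIONS_PER_SEC = 70_000_000    # "70,000k"


def speed_ratio(faster: int, slower: int) -> float:
    """Return how many times more positions per second `faster` evaluates."""
    return faster / slower


ratio = speed_ratio(STOCKFISH_POSITIONS_PER_SEC, ALPHAZERO_POSITIONS_PER_SEC)
print(f"Stockfish evaluates about {ratio:.0f}x more positions per second")
```

So the gap in the table is a bit under three orders of magnitude, which is where the rough "1000 times" comes from.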
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
kranium wrote:
> Milos wrote:
> > kranium wrote:
> > > Lyudmil Tsvetkov wrote:
> > > > - Alpha had considerable hardware advantage
> > > > - SF played with version 8
> > > > - what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
> > > As Daniel explains: no hard-coded evaluation (software)... its game play is based on learning (experience) from previous self-play games applied to a neural network
> > > 5,000 first-generation TPUs to generate self-play games
> > > and 64 second-generation TPUs to train the neural networks
> > > The hardware advantage is not such an important factor during gameplay as one would imagine.
> > It actually is: instead of the 4 TPUs required to run Alpha0 so far, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN speed (80k patterns evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would exponentially slow down the calculation.
> AlphaZero very selectively evaluating 80k vs Stockfish's 70,000k positions/sec, probably achieving tremendous depths at such speeds,
> but I'd guess it's the deep (learned) positional eval which is primarily adding strength...
There is no deep eval, eval is static.
How could an entity have 1000 times more terms than SF?
-
- Posts: 4190
- Joined: Wed Nov 25, 2009 1:47 am
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Lyudmil Tsvetkov wrote:
> kranium wrote:
> > Table S4: Evaluation speed (positions/second) of AlphaZero, Stockfish, and Elmo in chess, shogi and Go.
> > Program    Chess    Shogi    Go
> > AlphaZero  80k      40k      16k
> > Stockfish  70,000k
> > Elmo                35,000k
> This makes no sense at all.
> Alpha 1000 times slower?
> This would mean 1000 times more evaluation terms in its code!
> What kind of terms?
Just read the paper, not only 1000 times more, much more than that, there are 4,672 planes just for possible pieces/move to/side to move.
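For reference, the 4,672 figure matches the move encoding described in the paper: an 8x8x73 stack of planes covering 4,672 possible moves, i.e. 73 move types for each of the 64 from-squares. The arithmetic can be sketched as follows; the variable names are illustrative only and are not taken from any AlphaZero code.

```python
# Move-encoding arithmetic as described in the AlphaZero paper (8 x 8 x 73).
BOARD_SQUARES = 8 * 8            # square the piece is picked up from

QUEEN_MOVE_PLANES = 7 * 8        # 1..7 squares along each of 8 directions
KNIGHT_MOVE_PLANES = 8           # the 8 possible knight jumps
UNDERPROMOTION_PLANES = 3 * 3    # 3 pawn directions x {knight, bishop, rook}

planes_per_square = (
    QUEEN_MOVE_PLANES + KNIGHT_MOVE_PLANES + UNDERPROMOTION_PLANES
)
total_move_slots = BOARD_SQUARES * planes_per_square

print(planes_per_square, total_move_slots)
```

Read this way, the number counts the move slots the network assigns probabilities to, rather than hand-written evaluation terms.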
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Daniel Shawul wrote:
> What is different is that AlphaZero's evaluation selects the features of eval by itself (via a neural network), while in the standard approach the programmer selects the features (e.g. passed pawns, king safety, rook on open file etc.) and just tunes the weights. The downside of the neural-network approach is that you may not understand why it does what it does.
> Daniel
Tell me the precise code, that says nothing to me.
How does it select features, based on what?
Playing 100 000 games, many wins with a pawn on d4 or e5, so this is good, or interpreting 100 000 games from some large human database, an e5 pawn is more common in winning games than a d5 pawn, so increase its value.
But that has its limits.
What about mobility, how do they figure out mobility from human games?
More importantly, what would an evaluation pattern consist of?
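One way to picture the distinction Daniel draws is the toy contrast below. It is a sketch under heavy assumptions, not AlphaZero's code: the feature names, weights, and network size are invented, and the tiny network is untrained (random weights). The only point is that the first function scores named features chosen by a programmer, while the second takes a raw position encoding and whatever its hidden units respond to plays the role of "features".

```python
import random

# Standard approach: the programmer chooses the features; tuning only
# adjusts the weights. These weights are made up for illustration.
HAND_WEIGHTS = {"passed_pawns": 0.5, "king_safety": 0.8, "rook_on_open_file": 0.25}


def handcrafted_eval(features):
    """Weighted sum over programmer-chosen, named features."""
    return sum(HAND_WEIGHTS[name] * value for name, value in features.items())


def make_tiny_net(n_inputs, n_hidden, seed=0):
    """Random (untrained) one-hidden-layer network; training would set these."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w1, w2


def net_eval(net, raw_inputs):
    """Score a raw position encoding; hidden units act as unnamed features."""
    w1, w2 = net
    hidden = [max(0.0, sum(w * x for w, x in zip(row, raw_inputs))) for row in w1]
    return sum(w * h for w, h in zip(w2, hidden))


hand_score = handcrafted_eval(
    {"passed_pawns": 2, "king_safety": -1, "rook_on_open_file": 1}
)
net_score = net_eval(make_tiny_net(n_inputs=4, n_hidden=3), [1.0, 0.0, -1.0, 0.5])
print(hand_score, net_score)
```

In the second case there is no line of code that says "mobility" or "passed pawn"; any such notion would exist only implicitly in the trained weights, which is also why it can be hard to say why the network prefers a move.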
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Milos wrote:
> Lyudmil Tsvetkov wrote:
> > kranium wrote:
> > > Table S4: Evaluation speed (positions/second) of AlphaZero, Stockfish, and Elmo in chess, shogi and Go.
> > > Program    Chess    Shogi    Go
> > > AlphaZero  80k      40k      16k
> > > Stockfish  70,000k
> > > Elmo                35,000k
> > This makes no sense at all.
> > Alpha 1000 times slower?
> > This would mean 1000 times more evaluation terms in its code!
> > What kind of terms?
> Just read the paper, not only 1000 times more, much more than that, there are 4,672 planes just for possible pieces/move to/side to move.
I browsed over it, but did not pay attention to it.
So, evaluation will depend not only on where the piece lands, but also on where it comes from.
That is ridiculous, that is already not static evaluation, but indeed working like some kind of very sophisticated book.
-
- Posts: 1010
- Joined: Thu Sep 01, 2011 2:49 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Lyudmil Tsvetkov wrote:
> kranium wrote:
> > Milos wrote:
> > > kranium wrote:
> > > > Lyudmil Tsvetkov wrote:
> > > > > - Alpha had considerable hardware advantage
> > > > > - SF played with version 8
> > > > > - what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
> > > > As Daniel explains: no hard-coded evaluation (software)... its game play is based on learning (experience) from previous self-play games applied to a neural network
> > > > 5,000 first-generation TPUs to generate self-play games
> > > > and 64 second-generation TPUs to train the neural networks
> > > > The hardware advantage is not such an important factor during gameplay as one would imagine.
> > > It actually is: instead of the 4 TPUs required to run Alpha0 so far, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN speed (80k patterns evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would exponentially slow down the calculation.
> > AlphaZero very selectively evaluating 80k vs Stockfish's 70,000k positions/sec, probably achieving tremendous depths at such speeds,
> > but I'd guess it's the deep (learned) positional eval which is primarily adding strength...
> There is no deep eval, eval is static.
> How could an entity have 1000 times more terms than SF?
Come on Lyudmil, read the paper. AlphaZero doesn't use eval functions.
It probably doesn't know about a Rook being 5.0 or a pawn being 1.0 the way most engines do.
It is using MCTS, and doing a lot of probability checks with 1-0, 0.5, 0-1 at the end of the
search, then choosing the path/route which has more wins or draws and the fewest losses.
edit: least=fewest
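The "choose the route with more wins or draws and the fewest losses" idea can be sketched as below. This is a deliberately simplified illustration with hypothetical moves and results, not AlphaZero's actual search: as the paper describes it, the search is guided by the network's move probabilities and value estimates rather than by a plain average of finished games.

```python
def best_move(results_by_move):
    """Pick the move whose recorded outcomes have the highest average score.

    `results_by_move` maps a move to a list of outcomes scored from the
    mover's point of view: 1.0 for a win, 0.5 for a draw, 0.0 for a loss.
    """
    def mean_score(move):
        outcomes = results_by_move[move]
        return sum(outcomes) / len(outcomes)

    return max(results_by_move, key=mean_score)


# Hypothetical statistics gathered during a search.
stats = {
    "e4": [1.0, 0.5, 0.5, 0.0],   # average 0.5
    "d4": [1.0, 1.0, 0.5, 0.0],   # average 0.625
}
choice = best_move(stats)
print(choice)
```

No piece values appear anywhere in this selection step; the preference comes only from how the explored continuations turned out.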
Last edited by MikeGL on Wed Dec 06, 2017 5:27 pm, edited 1 time in total.
-
- Posts: 546
- Joined: Sat Aug 17, 2013 12:36 am
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
> Tell me the precise code
I lol'd.
-
- Posts: 6052
- Joined: Tue Jun 12, 2012 12:41 pm
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
MikeGL wrote:
> Maybe this thread should be merged to:
> http://www.talkchess.com/forum/viewtopi ... w=&start=0
> As the topic is the same.
We might as well compete over which will get longer.