What is different is that AlphaZero's evaluation selects its features by itself (via a neural network), while in the standard approach the programmer selects the features (e.g. passed pawns, king safety, rook on open file, etc.) and only tunes the weights. The downside of the neural-network approach is that you may not understand why it does what it does.
Daniel
AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
As Daniel explains: no hard-coded evaluation (software); its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
5,000 first-generation TPUs to generate self-play games
and 64 second-generation TPUs to train the neural networks
The hardware advantage is not such an important factor during gameplay as one would imagine.
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Table S4: Evaluation speed (positions/second) of AlphaZero, Stockfish, and Elmo in chess, shogi and Go.
Code: Select all

Program     Chess      Shogi      Go
AlphaZero   80k        40k        16k
Stockfish   70,000k    -          -
Elmo        -          35,000k    -
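Taken at face value, the table implies a raw throughput gap of nearly three orders of magnitude; a quick back-of-envelope check using the figures above:

```python
# Speed figures from Table S4 above (positions per second).
alphazero_nps = 80_000        # "80k"
stockfish_nps = 70_000_000    # "70,000k"

ratio = stockfish_nps / alphazero_nps
print(f"Stockfish searches roughly {ratio:.0f}x more positions per second")  # roughly 875x
```

So AlphaZero gives up a ~875x deficit in raw positions per second and makes it back with a far more selective search and a richer learned evaluation.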
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
It actually is. Instead of the 4 TPUs required to run Alpha0, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN evaluation speed (80k positions evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would severely slow down the calculation.
kranium wrote:
As Daniel explains: no hard-coded evaluation (software); its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
5,000 first-generation TPUs to generate self-play games
and 64 second-generation TPUs to train the neural networks
The hardware advantage is not such an important factor during gameplay as one would imagine.
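The sub-matrix point refers to ordinary blocked (tiled) matrix multiplication. A minimal numpy sketch of the idea, with shapes and block size made up for illustration (this is not AlphaZero's actual kernel); note the total arithmetic is unchanged by tiling, so the slowdown on weaker hardware comes from lower utilization and extra memory traffic rather than extra FLOPs:

```python
import numpy as np

def blocked_matmul(A, B, block=64):
    """Compute A @ B by accumulating products of block x block tiles,
    the way a large NN layer would be split across limited hardware."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    C = np.zeros((n, m))
    for i in range(0, n, block):
        for j in range(0, m, block):
            for p in range(0, k, block):
                # Accumulate one tile-product into the output tile.
                C[i:i+block, j:j+block] += A[i:i+block, p:p+block] @ B[p:p+block, j:j+block]
    return C

A = np.random.rand(128, 256)
B = np.random.rand(256, 96)
assert np.allclose(blocked_matmul(A, B), A @ B)  # same result as one big multiply
```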
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Milos wrote:
It actually is. Instead of the 4 TPUs required to run Alpha0, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN evaluation speed (80k positions evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would severely slow down the calculation.
kranium wrote:
As Daniel explains: no hard-coded evaluation (software); its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
5,000 first-generation TPUs to generate self-play games
and 64 second-generation TPUs to train the neural networks
The hardware advantage is not such an important factor during gameplay as one would imagine.
AlphaZero very selectively evaluates 80k vs Stockfish's 70,000k positions/sec, probably achieving tremendous depths at such speeds,
but I'd guess it's the deep (learned) positional eval which is primarily adding strength...
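For what it's worth, that selectivity comes from MCTS guided by the network's policy and value outputs: each of those 80k evaluations is spent where the search already looks promising. A toy sketch of the PUCT-style selection rule the AlphaZero paper describes (field names and the c_puct constant here are illustrative, not taken from the paper's code):

```python
import math

def puct_select(children, c_puct=1.5):
    """Pick the child maximizing Q + U: exploitation (average value so far)
    plus an exploration bonus favouring high-prior, rarely-visited moves."""
    total_visits = sum(ch["visits"] for ch in children)
    def score(ch):
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        u = c_puct * ch["prior"] * math.sqrt(total_visits) / (1 + ch["visits"])
        return q + u
    return max(children, key=score)

# Toy node: the network's priors steer visits toward a few candidate moves.
children = [
    {"prior": 0.6, "visits": 10, "value_sum": 4.0},  # heavily explored
    {"prior": 0.3, "visits": 2,  "value_sum": 1.3},  # promising, underexplored
    {"prior": 0.1, "visits": 0,  "value_sum": 0.0},  # low prior, unvisited
]
best = puct_select(children)  # the promising, underexplored move gets the next visit
```

This is why the node count comparison is apples and oranges: an alpha-beta searcher must visit positions to prune them, while MCTS with a policy prior rarely expands moves the network already considers bad.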
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Alpha0 is basically behaving like a huge, highly selective opening book.
kranium wrote:
AlphaZero very selectively evaluates 80k vs Stockfish's 70,000k positions/sec, probably achieving tremendous depths at such speeds,
but I'd guess it's the deep (learned) positional eval which is primarily adding strength...
However, besides hardware, other aspects of this work are highly questionable.
I guess people are a bit intimidated to ask questions because it is Google, but many things are fishy and unfavourable to SF.
One big disadvantage was the TC: 1 min/move means SF spent only 1 minute on each of the opening moves, while at a normal TC like 40/40 it would easily spend 5-10 minutes on each opening move. That made it much weaker, 20 or maybe even 30 Elo, since most of SF's losses already happen in the opening.
Second is the no-book play, where Alpha0 mainly forces openings and lines it spent most of its training time on, while SF had no help from a book whatsoever; to make it at least a bit fairer one should use a strong book such as Cerebellum to support SF.
Starting from 12 typical human openings (only 4 moves deep at most), the gap Alpha0 had over SF was reduced from 100 to 77 Elo, as can be seen from the paper.
Third, even though they used last year's TCEC winner, SF8 has untested behaviour on 64 cores, and on that hardware it is at least 30 Elo, if not more, weaker than the current SFdev.
So taking all this into consideration, it is pretty safe to assume that the latest Brainfish at a normal TC like 40/40 would be at least on par with, if not stronger than, Alpha0. And all that on much weaker hardware.
If they really wanted a fair comparison, instead of running Alpha0 on regular x64 one could also run SF on custom hardware where all the evaluation is handled by fully custom chips (as Deep Blue did), and then one would see how much weaker Alpha0 really is, when the comparison is not apples and oranges.
Last edited by Milos on Wed Dec 06, 2017 4:36 pm, edited 1 time in total.
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
Maybe this thread should be merged to:
http://www.talkchess.com/forum/viewtopi ... w=&start=0
As the topic is the same.
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
The hardware advantage is 16/1, so simply ridiculous.
kranium wrote:
As Daniel explains: no hard-coded evaluation (software); its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
5,000 first-generation TPUs to generate self-play games
and 64 second-generation TPUs to train the neural networks
The hardware advantage is not such an important factor during gameplay as one would imagine.
I could not accept such a test in any way.
But then, what have Google done right?
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
This makes no sense at all.
kranium wrote:
Table S4: Evaluation speed (positions/second) of AlphaZero, Stockfish, and Elmo in chess, shogi and Go.
Code: Select all

Program     Chess      Shogi      Go
AlphaZero   80k        40k        16k
Stockfish   70,000k    -          -
Elmo        -          35,000k    -
Alpha 1000 times slower?
That would mean 1000 times more evaluation terms in its code!
What kind of terms?
Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo
There is no deep eval; the eval is static.
kranium wrote:
Milos wrote:
It actually is. Instead of the 4 TPUs required to run Alpha0, on x64 hardware one would need around 2000 Haswell cores to achieve the same NN evaluation speed (80k positions evaluated per second). Since the NNs are huge, with smaller resources the matrix multiplication would have to be broken into smaller sub-matrices, which would severely slow down the calculation.
kranium wrote:
As Daniel explains: no hard-coded evaluation (software); its gameplay is based on learning (experience) from previous self-play games, applied to a neural network.
Lyudmil Tsvetkov wrote:
- Alpha had considerable hardware advantage
- SF played with version 8
- what was the code/software/evaluation base used for the first Alpha chess version, an advanced engine evaluation and search software or otherwise?
5,000 first-generation TPUs to generate self-play games
and 64 second-generation TPUs to train the neural networks
The hardware advantage is not such an important factor during gameplay as one would imagine.
AlphaZero very selectively evaluates 80k vs Stockfish's 70,000k positions/sec, probably achieving tremendous depths at such speeds,
but I'd guess it's the deep (learned) positional eval which is primarily adding strength...
How could an entity have 1000 times more terms than SF?