recent article on alphazero ... 12/11/2017 ...


peter
Posts: 3185
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: recent article on alphazero ... 12/11/2017 ...

Post by peter »

hgm wrote: No adjustment of the NN was done during the match.
...
It is all clearly described in the paper.
Can you or Ronald please point to the lines of the paper that say the NN was no longer adjusted during the games?

So far I have found only this:
We evaluated the fully trained instances of AlphaZero against Stockfish, Elmo and the previous version of AlphaGo Zero (trained for 3 days) in chess, shogi and Go respectively, playing 100 game matches at tournament time controls of one minute per move. AlphaZero and the previous AlphaGo Zero used a single machine with 4 TPUs. Stockfish and Elmo played at their strongest skill level using 64 threads and a hash size of 1GB. AlphaZero convincingly defeated all opponents, losing zero games to Stockfish and eight games to Elmo (see Supplementary Material for several example games), as well as defeating the previous version of AlphaGo Zero.

(In the paper, the performance table sits in the middle of this passage.)
Peter.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by duncan »

http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html

Does anyone know who is authorised to speak for AlphaZero, in case someone wants to email them on the slim chance that they will answer?
peter
Posts: 3185
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: recent article on alphazero ... 12/11/2017 ...

Post by peter »

Hi!
duncan wrote: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html

Does anyone know who is authorised to speak for AlphaZero, in case someone wants to email them on the slim chance that they will answer?
You could maybe also try contacting Demis Hassabis through his homepage:
http://demishassabis.com/
A contact address is given there.
Peter.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by duncan »

peter wrote: Hi!
duncan wrote: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html

Does anyone know who is authorised to speak for AlphaZero, in case someone wants to email them on the slim chance that they will answer?
You could maybe also try contacting Demis Hassabis through his homepage:
http://demishassabis.com/
A contact address is given there.
Thanks, I sent off an email.


https://www.theregister.co.uk/2017/12/1 ... ai_unfair/
“the work is being submitted for peer review and unfortunately we cannot say any more at this time.”
Eelco de Groot
Posts: 4561
Joined: Sun Mar 12, 2006 2:40 am

Re: recent article on alphazero ... 12/11/2017 ...

Post by Eelco de Groot »

peter wrote: Hi!
duncan wrote: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html

Does anyone know who is authorised to speak for AlphaZero, in case someone wants to email them on the slim chance that they will answer?
You could maybe also try contacting Demis Hassabis through his homepage:
http://demishassabis.com/
A contact address is given there.
If you contact him, please ask him on behalf of all our members whether the full set of games will be made available, or at least more than just ten, because I don't think we will see much more of AlphaZero, at least not in this form. With Deep Blue there were at least the Deep Thought games. I was going to ask at some point myself, but there is no point in all of us asking the same thing.
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
peter
Posts: 3185
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: recent article on alphazero ... 12/11/2017 ...

Post by peter »

duncan wrote:
peter wrote: Hi!
duncan wrote: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html

Does anyone know who is authorised to speak for AlphaZero, in case someone wants to email them on the slim chance that they will answer?
You could maybe also try contacting Demis Hassabis through his homepage:
http://demishassabis.com/
A contact address is given there.
Thanks, I sent off an email.


https://www.theregister.co.uk/2017/12/1 ... ai_unfair/
“the work is being submitted for peer review and unfortunately we cannot say any more at this time.”
Thank you too!
For me, the question of whether A0 was still learning during the games is crucial.

If it really wasn't, the small number of opening lines matters less; it would still matter, but not as much.
If it did still learn from the games played during the match, that would make the result meaningless to me.

Maybe Harm Geert and Ronald are right, and
"We evaluated the fully trained instances of AlphaZero against Stockfish"
can be read to mean that the NN was no longer changed during the games, but to me this one sentence alone does not establish that for certain.

And yes, I also support the request to make the rest of the games public, which Eelco wrote about in his post just now.
Peter.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by duncan »

Eelco de Groot wrote:
peter wrote: Hi!
duncan wrote: http://www0.cs.ucl.ac.uk/staff/d.silver/web/Home.html

Does anyone know who is authorised to speak for AlphaZero, in case someone wants to email them on the slim chance that they will answer?
You could maybe also try contacting Demis Hassabis through his homepage:
http://demishassabis.com/
A contact address is given there.
If you contact him, please ask him on behalf of all our members whether the full set of games will be made available, or at least more than just ten, because I don't think we will see much more of AlphaZero, at least not in this form. With Deep Blue there were at least the Deep Thought games. I was going to ask at some point myself, but there is no point in all of us asking the same thing.
I just asked him (as an individual) whether he can give any kind of timescale for when the work will be published and peer reviewed, as it seems he does not want to give out any information until then.

I cannot speak on behalf of talkchess and am not qualified to. If you feel there is one all-important question that talkchess members 'need' answered (maybe name-drop Lai), or a reasonable and easy request, then it is best to email him yourself.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by duncan »

Eelco de Groot wrote: If you contact him, please ask him on behalf of all our members whether the full set of games will be made available, or at least more than just ten, because I don't think we will see much more of AlphaZero, at least not in this form.
That looks like a reasonable request, and I can think of no reason why he should not want to. Best to try emailing him.
Eelco de Groot
Posts: 4561
Joined: Sun Mar 12, 2006 2:40 am

Re: recent article on alphazero ... 12/11/2017 ...

Post by Eelco de Groot »

Okay, thanks Duncan! Maybe he will read it here, or we will learn more from the final publication. To them the games are not really relevant, just an illustration, but I can't imagine I'm the only one who would like to know. It all adds to a sort of myth-making, which is okay for their PR but not for science :)
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by syzygy »

peter wrote:
hgm wrote: No adjustment of the NN was done during the match.
...
It is all clearly described in the paper.
Can you or Ronald please point to the lines of the paper that say the NN was no longer adjusted during the games?

So far I have found only this:
We evaluated the fully trained instances of AlphaZero against Stockfish, Elmo and the previous version of AlphaGo Zero (trained for 3 days) in chess, shogi and Go respectively, ...
"fully trained". Adjusting the NN is called "training".

There are many other clear statements:
Starting from random play, and given no domain knowledge except the game rules, AlphaZero achieved within 24 hours a superhuman level of play in the games of chess and shogi (Japanese chess) as well as Go, and convincingly defeated a world-champion program in each case.
Recently, the AlphaGo Zero algorithm achieved superhuman performance in the game of Go, by representing Go knowledge using deep convolutional neural networks (22, 28), trained solely by reinforcement learning from games of self-play (29). In this paper, we apply a similar but fully generic algorithm, which we call AlphaZero, ...
AlphaZero learns these move probabilities and value estimates entirely from self-play; these are then used to guide its search.
The parameters θ of the deep neural network in AlphaZero are trained by self-play reinforcement learning, starting from randomly initialised parameters θ.
We trained a separate instance of AlphaZero for each game. Training proceeded for 700,000 steps (mini-batches of size 4,096) starting from randomly initialised parameters, using 5,000 first-generation TPUs (15) to generate self-play games and 64 second-generation TPUs to train the neural networks.
The games against SF were played by the (fully trained) AlphaZero on a machine with 4 TPUs, a completely different setup from the one used for adjusting the neural network parameters.
During training, each MCTS used 800 simulations.
When playing SF8 for 100 games, the MCTS performed 80,000 simulations per second for 1 minute on each move.
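
To put the figures quoted above side by side, here is a back-of-the-envelope sketch in Python. It uses only the numbers cited from the paper (700,000 steps, mini-batches of 4,096, 800 simulations per training search, 80,000 simulations per second at one minute per move). The variable names are mine, and it assumes the full minute is spent on every match move, so the per-move figure is an upper bound.

```python
# Figures as quoted from the paper in this post; names are illustrative only.
TRAINING_STEPS = 700_000
MINI_BATCH_SIZE = 4_096
TRAINING_SIMS_PER_SEARCH = 800
MATCH_SIMS_PER_SECOND = 80_000
MATCH_SECONDS_PER_MOVE = 60  # assumption: the whole minute is used on every move

# Training samples processed overall: steps times mini-batch size.
training_samples = TRAINING_STEPS * MINI_BATCH_SIZE

# Upper bound on simulations spent on a single move in the match.
match_sims_per_move = MATCH_SIMS_PER_SECOND * MATCH_SECONDS_PER_MOVE

# How much larger a match search is than a training search.
search_ratio = match_sims_per_move // TRAINING_SIMS_PER_SEARCH

print(f"training samples processed:     {training_samples:,}")
print(f"simulations per match move:     {match_sims_per_move:,}")
print(f"match search / training search: {search_ratio}x")
```

Under that assumption a single match search is up to about 6,000 times larger than a training search, which fits the picture of training and match play as two separate setups.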