recent article on alphazero ... 12/11/2017 ...

syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by syzygy »

peter wrote:The point is that the question remains whether the big "learning success" amounted to anything other than this: learning to beat a crippled opponent in its crippled openings only.
How can you seriously call SF8 at 1 minute/move on 64 threads crippled?

Things would be very different if AlphaZero had been trained just to beat SF8 in that particular configuration. But there is no evidence of that whatsoever. AlphaZero was trained by self-play for four hours, had never been shown a move played by Stockfish, then beat Stockfish convincingly.
peter
Posts: 3186
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: recent article on alphazero ... 12/11/2017 ...

Post by peter »

syzygy wrote:The supposed "crippling" is totally irrelevant if you care about AlphaZero going from nothing to >2500 all by itself. And this is where the scientific breakthrough lies (well, perhaps not a "breakthrough" after doing the same in Go).
Do you really think Google set out to show that A0 had gone to 2501 "Elo" only?
That would mean it had to fight a 2500 Elo player for the edge, wouldn't it?

It wouldn't be the "breakthrough" that the "paper" and all the fora now present it as, and that it probably is in Go.

I wouldn't care about Celo at all if it weren't the measurement that has normally been used for years and years in computer chess, at least by all the people who now all of a sudden say they don't count in that match.
:)


A0 was tested against SF, the TC was bogus, and as long as we haven't at least seen all 100 games played, the suspicion remains that A0 only had to learn 3 opening lines in the match in order to beat SF in these few over and over.
Peter.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by syzygy »

peter wrote:
syzygy wrote:The supposed "crippling" is totally irrelevant if you care about AlphaZero going from nothing to >2500 all by itself. And this is where the scientific breakthrough lies (well, perhaps not a "breakthrough" after doing the same in Go).
Do you really think Google set out to show that A0 had gone to 2501 "Elo" only?
It would still have been a revolution, since all existing engines playing at GM level are based on alpha-beta search and a relatively simple evaluation function.

But it would not have annoyed you and so many others nearly as much as what they have shown now.
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: recent article on alphazero ... 12/11/2017 ...

Post by hgm »

I guess it is all the fault of the Stockfish team: they should have attached a more clear disclaimer to it. Something like:
*** WARNING ***
This program is only intended for participating in the TCEC tournament. Any alteration of the parameters, even if technically supported, will totally cripple it. The authors cannot be held responsible for the abominable play that will result under conditions other than those of TCEC. This holds in particular for any alterations in time control, opening book or end-game tables.
Then perhaps the AlphaZero team would have thought twice before using Stockfish as an opponent, and would have selected a more robust one.
Joerg Oster
Posts: 937
Joined: Fri Mar 10, 2006 4:29 pm
Location: Germany

Re: recent article on alphazero ... 12/11/2017 ...

Post by Joerg Oster »

hgm wrote:I guess it is all the fault of the Stockfish team: they should have attached a more clear disclaimer to it. Something like:
*** WARNING ***
This program is only intended for participating in the TCEC tournament. Any alteration of the parameters, even if technically supported, will totally cripple it. The authors cannot be held responsible for the abominable play that will result under conditions other than those of TCEC. This holds in particular for any alterations in time control, opening book or end-game tables.
Then perhaps the AlphaZero team would have thought twice before using Stockfish as an opponent, and would have selected a more robust one.
:lol: :lol: :lol:
Jörg Oster
peter
Posts: 3186
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: recent article on alphazero ... 12/11/2017 ...

Post by peter »

syzygy wrote: Things would be very different if AlphaZero had been trained just to beat SF8 in that particular configuration. But there is no evidence of that whatsoever. AlphaZero was trained by self-play for four hours, had never been shown a move played by Stockfish, then beat Stockfish convincingly.
At least it had 100 games to learn to play SF at that somewhat crippled TC (which cripples the engine's opening repertoire even more), and probably in only very few different opening lines.

Out of these 100, in 10 games there were only 3 different opening lines; maybe in all 100 there were 4 or 5, who knows?
Google knows, so why not at least show all 100 games
:?:
Peter.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by syzygy »

peter wrote:
syzygy wrote:Things would be very different if AlphaZero had been trained just to beat SF8 in that particular configuration. But there is no evidence of that whatsoever. AlphaZero was trained by self-play for four hours, had never been shown a move played by Stockfish, then beat Stockfish convincingly.
At least it had 100 games to learn to play SF at that somewhat crippled TC (which cripples the engine's opening repertoire even more), and probably in only very few different opening lines.
No, it was not "learning" when playing SF.
peter
Posts: 3186
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: recent article on alphazero ... 12/11/2017 ...

Post by peter »

syzygy wrote:
peter wrote:
syzygy wrote:Things would be very different if AlphaZero had been trained just to beat SF8 in that particular configuration. But there is no evidence of that whatsoever. AlphaZero was trained by self-play for four hours, had never been shown a move played by Stockfish, then beat Stockfish convincingly.
At least it had 100 games to learn to play SF at that somewhat crippled TC (which cripples the engine's opening repertoire even more), and probably in only very few different opening lines.
No, it was not "learning" when playing SF.
Where is this to be read?
And how do you turn off learning in a machine that gets its moves from MCTS?
I mean, SF's hash will have been cleared between the games; what would be the matching part in A0's computation?
Why do you think anything A0 had already learned or computed was deleted between the games
:?:
Peter.
syzygy
Posts: 5563
Joined: Tue Feb 28, 2012 11:56 pm

Re: recent article on alphazero ... 12/11/2017 ...

Post by syzygy »

peter wrote:
syzygy wrote:
peter wrote:
syzygy wrote:Things would be very different if AlphaZero had been trained just to beat SF8 in that particular configuration. But there is no evidence of that whatsoever. AlphaZero was trained by self-play for four hours, had never been shown a move played by Stockfish, then beat Stockfish convincingly.
At least it had 100 games to learn to play SF at that somewhat crippled TC (which cripples the engine's opening repertoire even more), and probably in only very few different opening lines.
No, it was not "learning" when playing SF.
Where is this to be read?
And how do you turn off learning in a machine that gets its moves from MCTS?
In the papers. If you have not read them, why are we discussing?
hgm
Posts: 27790
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: recent article on alphazero ... 12/11/2017 ...

Post by hgm »

MCTS does not automatically learn. To learn, the NN must be altered, and this was done in the training phase by the 24 generation-2 TPUs, after they received the training games played with the previous setting from the 5000 generation-1 TPUs.

No adjustment of the NN was done during the match. It was just playing on 4 gen-2 TPUs. The MCTS process had no memory other than keeping the sub-tree for the two played moves (its own and the opponent's response) from the tree of the previous move, and using that as the starting point for the next move. This is similar to keeping the TT hash from one move to the next in a conventional engine, rather than clearing the hash for every move. For every new game there was nothing to start with, as the tree of the final move of the previous game would not contain the initial position as a node.

It is all clearly described in the paper.
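
For anyone who wants the tree-reuse point above spelled out, here is a minimal Python sketch of just that bookkeeping. The names `Node`, `advance_root` and `new_game_root` are made up for illustration and are not taken from the paper or from any engine; the sketch assumes a tree keyed by move strings and leaves out the search itself. It only shows that the sub-tree for the two played moves survives to the next move, and that a new game starts from an empty tree.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One position in the search tree (hypothetical minimal structure)."""
    visits: int = 0
    value_sum: float = 0.0
    children: Dict[str, "Node"] = field(default_factory=dict)


def advance_root(root: Node, own_move: str, reply: str) -> Node:
    """Return the sub-tree reached by our move and the opponent's reply.

    Everything else in the old tree is dropped. If the search never
    expanded that line, we simply start from an empty node.
    """
    child = root.children.get(own_move)
    if child is None:
        return Node()
    grandchild = child.children.get(reply)
    return grandchild if grandchild is not None else Node()


def new_game_root() -> Node:
    """A new game starts with an empty tree: nothing carries over."""
    return Node()


# Usage: a tiny hand-built tree standing in for a finished search.
searched = Node(visits=100)
searched.children["e2e4"] = Node(visits=60)
searched.children["e2e4"].children["e7e5"] = Node(visits=42)

next_root = advance_root(searched, "e2e4", "e7e5")   # reuses the 42-visit sub-tree
fresh_root = advance_root(searched, "e2e4", "c7c5")  # line was never expanded
```

Note that the network weights appear nowhere in this: in the sketch, as in the description above, the only thing carried between moves is search statistics, so this kind of reuse is not learning.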