Historic Milestone: AlphaZero

Milos · Post by **Milos** » Mon Dec 11, 2017 3:35 pm

hgm wrote:It just took the rules, and after 4 hours of thinking about them it reached a conclusion of how to best play the game described by those rules, that was good enough to perform at 3000+ Elo level.

It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.
What we don't know is what the starting positions of those 20 million self-played games were.
Dirichlet noise and temperature=1 are far from enough for good exploration of openings, so I strongly doubt they only used the standard starting position for those self-played games (but of course we will never get a confirmation or proof of anything from Google).
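
For readers unfamiliar with the two exploration mechanisms mentioned here, a minimal Python sketch may help. It is an illustration only, not DeepMind's code: the function names are hypothetical, and the defaults `alpha=0.3` and `eps=0.25` are the chess values as I recall them from the AlphaZero preprint, so treat them as assumptions. Noise is mixed into the move priors at the root of the search, and the move actually played is sampled in proportion to visit counts raised to 1/temperature.

```python
import random

def dirichlet(alpha, n, rng):
    """Sample a symmetric Dirichlet(alpha) vector of length n via normalized gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    if total == 0.0:
        # Numerical corner case: fall back to a uniform vector.
        return [1.0 / n] * n
    return [d / total for d in draws]

def noisy_root_priors(priors, rng, alpha=0.3, eps=0.25):
    """Mix move priors with Dirichlet noise: (1 - eps) * p + eps * noise.

    alpha and eps are assumed values, not taken from this thread.
    """
    noise = dirichlet(alpha, len(priors), rng)
    return [(1 - eps) * p + eps * n for p, n in zip(priors, noise)]

def sample_move(visit_counts, rng, temperature=1.0):
    """Pick a move index with probability proportional to N ** (1 / temperature)."""
    weights = [n ** (1.0 / temperature) for n in visit_counts]
    return rng.choices(range(len(visit_counts)), weights=weights, k=1)[0]

rng = random.Random(0)
mixed = noisy_root_priors([0.7, 0.2, 0.1], rng)   # still sums to 1
move = sample_move([120, 60, 20], rng)            # temperature=1: proportional to visits
```

With `eps=0.25` every move keeps at least 75% of its original prior, so a strongly preferred opening move stays strongly preferred. That is the sense in which these two mechanisms alone give only limited opening variety.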

Milos · Post by **Milos** » Mon Dec 11, 2017 3:38 pm

mclane wrote:That’s not the point.
AZ plays more human-like than Stockfish etc.

Although these programs were made by men.
The grandmasters and IMs and kibitzers around the world are all completely happy to see Steinitz, Nimzowitsch and Tal being executed against Stockfish.
While Stockfish stands there paralyzed and with a bunch of undeveloped pieces en bloc, not able to move a piece, not capable of declining the sacrificed piece or pawn.

Stockfish plays machine chess.
While AZ plays human chess.

But AZ is the machine's machine. While Stockfish is the human machine design.

That’s really a paradox.

Well, you are human, but you make comments like a bot (machine). For me that is the real paradox.

mclane · Post by **mclane** » Mon Dec 11, 2017 3:47 pm

Try to concentrate on content.

Milos · Post by **Milos** » Mon Dec 11, 2017 4:03 pm

mclane wrote:Try to concentrate on content.

Sorry, but you offer none, nothing to concentrate on, just the same kind of ramblings as Anil. You clearly don't understand how Alpha0 works, how it is trained, and why it plays the way it does, and you just behave like primitive humans who attributed godlike characteristics to any phenomenon they couldn't understand.

mclane · Post by **mclane** » Mon Dec 11, 2017 4:16 pm

Milos wrote:
mclane wrote:Try to concentrate on content.
Sorry, but you offer none, nothing to concentrate on, just the same kind of ramblings as Anil. You clearly don't understand how Alpha0 works, how it is trained, and why it plays the way it does, and you just behave like primitive humans who attributed godlike characteristics to any phenomenon they couldn't understand.

As far as I see it, humans all over the world understand much better how AZ won than the Stockfish community does.

hgm · Post by **hgm** » Mon Dec 11, 2017 4:16 pm

Milos wrote:It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.

That counts as thinking (i.e. information processing without any exchange of information with the outside world). It is up to the machine to decide what it thinks about.

Henk · Post by **Henk** » Mon Dec 11, 2017 4:47 pm

No wonder that A0 plays more human-like, since a neural network is supposed to be a model of the human brain.

Milos · Post by **Milos** » Mon Dec 11, 2017 4:55 pm

mclane wrote:
Milos wrote:
mclane wrote:Try to concentrate on content.
Sorry, but you offer none, nothing to concentrate on, just the same kind of ramblings as Anil. You clearly don't understand how Alpha0 works, how it is trained, and why it plays the way it does, and you just behave like primitive humans who attributed godlike characteristics to any phenomenon they couldn't understand.
As far as I see it, humans all over the world understand much better how AZ won than the Stockfish community does.

Yeah sure, humans without specific knowledge of DCNN, MCTS, etc., who are 700-1500 Elo weaker than Alpha0 (whose evaluation is a black box that no one could actually understand), understand it better than chess programmers. Keep dreaming. If it helps your ego to think you are smarter because you think you can understand why the machine played something, while at the same time you have no clue how the machine works, so be it...

Milos · Post by **Milos** » Mon Dec 11, 2017 4:59 pm

hgm wrote:
Milos wrote:It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.
That counts as thinking (i.e. information processing without any exchange of information with the outside world). It is up to the machine to decide what it thinks about.

The machine doesn't think. It follows the algorithm set by humans, in this case reinforcement learning, based on the reward that comes from the outcomes of self-played games.
Based on this output from the games, humans, again, train the machine using algorithms.
Finally, the machine plays the games, following an (almost 100%) deterministic algorithm that includes the black-box evaluation that it "learned".
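
To make the three steps concrete, here is a toy sketch of the same loop. It is a deliberately tiny stand-in, not AlphaZero: the game (a Nim variant where you take 1-3 stones and whoever takes the last stone wins), the value table used in place of a neural network, and all names and parameters are my own assumptions for illustration. Self-play produces outcomes, the outcomes update the evaluation, and afterwards play is a deterministic lookup into that learned evaluation.

```python
import random

def train_self_play(start=10, games=2000, lr=0.1, explore=0.2, seed=0):
    """Learn value[s]: estimated win chance for the side to move with s stones left."""
    rng = random.Random(seed)
    value = [0.5] * (start + 1)
    value[0] = 0.0  # no stones left: the side to move has already lost
    for _ in range(games):
        stones, visited = start, []
        while stones > 0:
            visited.append(stones)
            moves = [m for m in (1, 2, 3) if m <= stones]
            if rng.random() < explore:
                move = rng.choice(moves)  # occasional random move for exploration
            else:
                # Leave the opponent in the position we currently rate worst.
                move = min(moves, key=lambda m: value[stones - m])
            stones -= move
        # The side that moved last took the final stone and won, so walking
        # backwards through the game the reward alternates 1, 0, 1, ...
        outcome = 1.0
        for s in reversed(visited):
            value[s] += lr * (outcome - value[s])
            outcome = 1.0 - outcome
    return value

def greedy_move(value, stones):
    """Deterministic play: always pick the move the learned table rates best."""
    moves = [m for m in (1, 2, 3) if m <= stones]
    return min(moves, key=lambda m: value[stones - m])

value = train_self_play()
print(greedy_move(value, 3))  # takes all three stones and wins
```

Nothing in the loop was told which positions are good. The table only ever sees who won each self-played game, which is the sense of "reward coming from the outcomes of self-played games" above.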

Albert Silver · Post by **Albert Silver** » Mon Dec 11, 2017 6:16 pm

Milos wrote:
hgm wrote:It just took the rules, and after 4 hours of thinking about them it reached a conclusion of how to best play the game described by those rules, that was good enough to perform at 3000+ Elo level.
It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.
What we don't know is what the starting positions of those 20 million self-played games were.
Dirichlet noise and temperature=1 are far from enough for good exploration of openings, so I strongly doubt they only used the standard starting position for those self-played games (but of course we will never get a confirmation or proof of anything from Google).

Did it really only play 20 million games? I'm amazed if that is true. I would have thought it would take a lot more. Or was that meant as an illustrative point for the argument?
