Historic Milestone: AlphaZero

mclane · Post by **mclane** » Mon Dec 11, 2017 4:16 pm

Milos wrote:
mclane wrote:Try to concentrate on content.
Sorry, but you offer none, nothing to concentrate on, just the same kind of ramblings as Anil. You clearly don't understand how Alpha0 works, how it is trained, or why it plays the way it plays, and you just behave like primitive humans who attributed godlike characteristics to any phenomenon they couldn't understand.

As far as I can see, humans all over the world understand how AZ won much better than the Stockfish community does.

hgm · Post by **hgm** » Mon Dec 11, 2017 4:16 pm

Milos wrote:It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.

That counts as thinking (i.e. information processing without any exchange of information with the outside world). It is up to the machine to decide what it thinks about.

Henk · Post by **Henk** » Mon Dec 11, 2017 4:47 pm

No wonder A0 plays more human-like, since a neural network is supposed to be a model of the human brain.

Milos · Post by **Milos** » Mon Dec 11, 2017 4:55 pm

mclane wrote:
Milos wrote:
mclane wrote:Try to concentrate on content.
Sorry, but you offer none, nothing to concentrate on, just the same kind of ramblings as Anil. You clearly don't understand how Alpha0 works, how it is trained, or why it plays the way it plays, and you just behave like primitive humans who attributed godlike characteristics to any phenomenon they couldn't understand.
As far as I can see, humans all over the world understand how AZ won much better than the Stockfish community does.

Yeah sure, humans without specific knowledge of DCNN, MCTS, etc., who are 700-1500 Elo weaker than Alpha0 (whose evaluation is a black box that no one can actually understand), understand it better than chess programmers. Keep dreaming. If it helps your ego to think you are smarter because you believe you can understand why the machine played something, when at the same time you have no clue how the machine works, so be it...

Milos · Post by **Milos** » Mon Dec 11, 2017 4:59 pm

hgm wrote:
Milos wrote:It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.
That counts as thinking (i.e. information processing without any exchange of information with the outside world). It is up to the machine to decide what it thinks about.

The machine doesn't think. It follows an algorithm set by humans, in this case reinforcement learning based on the reward that comes out of self-played games.
Based on this output from the games, humans, again, train the machine using algorithms.
Finally, the machine plays the games, following an (almost 100%) deterministic algorithm that includes the black-box evaluation that it "learned".

Albert Silver · Post by **Albert Silver** » Mon Dec 11, 2017 6:16 pm

Milos wrote:
hgm wrote:It just took the rules, and after 4 hours of thinking about them it reached a conclusion of how to best play the game described by those rules, that was good enough to perform at 3000+ Elo level.
It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.
What we don't know is what the starting positions of those 20 million self-played games were.
Dirichlet noise and temperature=1 are far from enough for good exploration of openings, so I strongly doubt they only used the starting position for those self-played games (but ofc we will never get a confirmation or proof of anything from Google).

Did it really only play 20 million games? I'm amazed if that is true. I would have thought it would take a lot more. Or was that meant as an illustrative point for the argument?

Milos · Post by **Milos** » Mon Dec 11, 2017 6:43 pm

Albert Silver wrote:
Milos wrote:It was not 4 hours of thinking, but 4 hours of (self-)playing 20 million games.
What we don't know is what the starting positions of those 20 million self-played games were.
Dirichlet noise and temperature=1 are far from enough for good exploration of openings, so I strongly doubt they only used the starting position for those self-played games (but ofc we will never get a confirmation or proof of anything from Google).
Did it really only play 20 million games? I'm amazed if that is true. I would have thought it would take a lot more. Or was that meant as an illustrative point for the argument?

In total, 44 million games were played over 9 hours.
Therefore, in the 4 hours it took to overtake SF, it played slightly fewer than 20 million games.
That is a relatively small number of games, but the total number of evaluated positions is quite large, since in each game they performed number_of_moves*800 evaluations. If the average chess game lasts 60 moves, they evaluated roughly 1 trillion positions and based their training on that.
Moreover, since SF played without a book, the openings actually played are exactly the ones Alpha0 trained on, and since the MCTS that was used is in essence non-deterministic, the real question is how strong Alpha0 is when it reaches positions far from those it trained on. My feeling is not that strong, but we will never know.
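
The arithmetic in the post above can be checked with a few lines of Python. This is only a sketch: the figures (44 million games, 9 hours, 800 evaluations per move, 60 moves per game) are the ones quoted in the post, the function names are made up for illustration, and it assumes games were generated at a constant rate over the 9 hours, which the thread does not confirm.

```python
# Figures quoted in the post; the constant-rate assumption is ours.
TOTAL_GAMES = 44_000_000      # self-play games over the whole run
TOTAL_HOURS = 9               # length of the whole run
HOURS_TO_PASS_SF = 4          # time after which it overtook Stockfish
EVALS_PER_MOVE = 800          # evaluations (simulations) per move
AVG_MOVES_PER_GAME = 60       # the post's guess for average game length


def games_after(hours: float) -> float:
    """Games played after `hours`, assuming a constant generation rate."""
    return TOTAL_GAMES / TOTAL_HOURS * hours


def evaluations_after(hours: float) -> float:
    """Position evaluations after `hours`: games * moves * evals per move."""
    return games_after(hours) * AVG_MOVES_PER_GAME * EVALS_PER_MOVE


if __name__ == "__main__":
    print(f"games in 4 h:       {games_after(HOURS_TO_PASS_SF):,.0f}")
    print(f"evaluations in 4 h: {evaluations_after(HOURS_TO_PASS_SF):.3e}")
```

Under these assumptions the 4-hour figure comes out at roughly 19.6 million games and a bit under 1 trillion evaluations, which matches the "slightly fewer than 20 million" and "roughly 1 trillion" estimates in the post.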

corres · Post by **corres** » Mon Dec 11, 2017 11:59 pm

[quote="Milos"]

Dirichlet noise and temperature=1 are far from enough for good exploration of openings, so I strongly doubt they only used the starting position for those self-played games (but ofc we will never get a confirmation or proof of anything from Google).

[/quote]
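
For readers unfamiliar with the two exploration mechanisms named in the quote, here is a minimal Python sketch. It is not DeepMind's code. The function names are hypothetical, and the default values `alpha=0.3` and `epsilon=0.25` are assumptions based on the published AlphaZero description rather than anything stated in this thread. Dirichlet noise is mixed into the move priors at the root of the search, and with temperature=1 the move actually played is sampled in proportion to the search visit counts.

```python
import random


def dirichlet(alpha, n, rng):
    """Sample a symmetric Dirichlet(alpha) vector of length n
    by normalizing independent Gamma(alpha, 1) draws."""
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(n)]
    total = sum(draws)
    return [d / total for d in draws]


def add_root_noise(priors, alpha=0.3, epsilon=0.25, rng=random):
    """Mix Dirichlet noise into the root priors:
    p' = (1 - epsilon) * p + epsilon * noise.
    alpha and epsilon are assumed values, not taken from the thread."""
    noise = dirichlet(alpha, len(priors), rng)
    return [(1 - epsilon) * p + epsilon * n for p, n in zip(priors, noise)]


def sample_move(visit_counts, temperature=1.0, rng=random):
    """Pick a move index with probability proportional to
    visits ** (1 / temperature). temperature must be > 0."""
    weights = [c ** (1.0 / temperature) for c in visit_counts]
    return rng.choices(range(len(visit_counts)), weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    print(add_root_noise([0.5, 0.3, 0.2], rng=rng))
    print(sample_move([400, 300, 100], temperature=1.0, rng=rng))
```

Both steps only perturb choices among moves the network already rates, which is the basis of the doubt expressed in the quote about how widely openings get explored.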

The start positions of AlphaGoZero were not empty boards, yet the learning of AlphaGoZero took much more time than the learning of AlphaZero.
In the case of AlphaZero it was emphasized that the start position for learning was the start position of chess. But the relatively few (44 million) learning games and the human-like playing style raise the possibility that, before self-learning, they used human games to pre-tune the NN.
Without this it is hard to believe that AlphaZero discovered everything that GMs developed over centuries.
A note:
Biologists talk about neurons and their networks, and IT specialists also talk about neurons and their networks. But human neurons and the neurons of an NN differ in many respects. The AlphaZero team wanted to build a chess-playing machine, not a machine that models how a human thinks about playing chess.

shrapnel · Post by **shrapnel** » Tue Dec 12, 2017 7:17 am

mclane wrote:AZ plays more human than stockfish etc.

Although these programs were made by men.
The grandmasters, IMs and Kiebitzes around the world are all completely happy to see Steinitz, Nimzowitsch and Tal being executed against Stockfish.
While Stockfish stands there paralyzed, with a bunch of undeveloped pieces en bloc, unable to move a piece, incapable of not eating the sacced piece or pawn.

Stockfish plays machine chess.
While AZ plays human chess.

But AZ is the machine's machine. While Stockfish is the human machine design.

That's really a paradox.

The machine teaches us how to play like a human against Stockfish.
By having active pieces and playing idealistically.

Fantastic post! Sums up the whole situation very nicely, whether Milos agrees or not.
The only thing is, unless AlphaZero gives some more demonstrations/information, the voices of its detractors will only grow stronger.

corres · Post by **corres** » Tue Dec 12, 2017 9:16 am

[quote="Henk"]

No wonder A0 plays more human-like, since a neural network is supposed to be a model of the human brain.

[/quote]

Our knowledge of how the human brain works is very, very limited.
There is no working circuit that can model the formation of even a very primitive human concept.
If the learning of AlphaZero started from absolutely zero knowledge about playing chess (apart, naturally, from the FIDE rules of play), then the human playing style of AlphaZero is a much more important result of the AlphaZero project than its win against Stockfish.
In that case I wonder why they emphasized the results of AlphaZero against Stockfish instead of the human style of their system.
