Historic Milestone: AlphaZero

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

User avatar
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: MCTS-NN vs alpha-beta

Post by Rebel »

Can't believe it without a press release from Google.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: Historic Milestone: AlphaZero

Post by corres »

maac wrote:
Take note that SF was at 64 cores!

Other VERY IMPORTANT notes:
1. SF used only a 1 GB (!!) hash table.
2. AlphaZero did not start from zero knowledge about chess,
because it was fed a lot of human games at start-up.
This explains why AlphaZero plays openings in such a human-like way.
I think it would be fairer if Stockfish got 64 GB of hash
and a good human opening book like the Fritz Power Book.
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Historic Milestone: AlphaZero

Post by duncan »

JJJ wrote: AlphaZero did beat Stockfish in a 100-game match, scoring 64/100, which is about 98 Elo stronger than Stockfish TCEC 2016. Not bad! I'd like to see it trained further, to see how far from perfect Stockfish really is!
And it did this after just 4 hours of training? Not sure why they did not train for at least a week to get an even better result.
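For reference, a 64% score corresponds to roughly a 100 Elo gap under the standard logistic model. A minimal Python sketch of that conversion (an illustration only, not the exact method the paper or the poster used):

[code]
# Minimal sketch: Elo difference implied by a match score,
# using the standard logistic (expected-score) model.
import math

def elo_diff(points, games):
    p = points / games              # scoring percentage
    return -400 * math.log10(1 / p - 1)

print(round(elo_diff(64, 100)))     # ~100 Elo for 64/100
[/code]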
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Historic Milestone: AlphaZero

Post by duncan »

maac wrote:https://arxiv.org/pdf/1712.01815.pdf

To say that I'm impressed is an understatement. Shocking.
Take note that SF was at 64 cores!
25 wins with White and only 3 with Black. Is such a big difference what you would expect?
Werewolf
Posts: 1796
Joined: Thu Sep 18, 2008 10:24 pm

Re: Historic Milestone: AlphaZero

Post by Werewolf »

corres wrote:
maac wrote:
Take note that SF was at 64 cores!
Other VERY IMPORTANT notes:
1. SF used only a 1 GB (!!) hash table.
2. AlphaZero did not start from zero knowledge about chess,
because it was fed a lot of human games at start-up.
This explains why AlphaZero plays openings in such a human-like way.
I think it would be fairer if Stockfish got 64 GB of hash
and a good human opening book like the Fritz Power Book.
THIS.

1 GB of hash allows them to say SF's nps is really high while hiding that they've weakened its search.

I would like to see a rematch at a better time control, with much more hash, all the tablebases, a tournament-quality opening book, and the latest asmFish.

I bet it’d be much closer then.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: MCTS-NN vs alpha-beta

Post by Lyudmil Tsvetkov »

MikeGL wrote:
kranium wrote:
Lyudmil Tsvetkov wrote: It is not at all clear to me where books were used and where not.
I'm sure opening books were not used...
In the early self-play games things like 1.a3, 1.a4, etc. were probably tried by AlphaZero...
eventually it learned that 1. e4 or 1. d4 had the highest success rates.
Books or no books, I think AlphaZero would still demolish SF8.
Just look at game 9: it was a decent French Defence by SF8, but it was dismantled with
amazing tactical and strategic shots by AlphaZero that seem to be beyond the reach of alpha-beta engines.

[pgn]
[Event "?"]
[Site "?"]
[Date "2017.12.06"]
[Round "9"]
[White "AlphaZero"]
[Black "Stockfish"]
[Result "1-0"]
[TimeControl "40/1260:300"]
[Termination "normal"]
[PlyCount "103"]
[WhiteType "human"]
[BlackType "human"]

1. d4 e6 2. e4 d5 3. Nc3 Nf6 4. e5 Nfd7 5. f4 c5 6. Nf3 cxd4 7. Nb5 Bb4+ 8.
Bd2 Bc5 9. b4 Be7 10. Nbxd4 Nc6 11. c3 a5 12. b5 Nxd4 13. cxd4 Nb6 14. a4
Nc4 15. Bd3 Nxd2 16. Kxd2 Bd7 17. Ke3 b6 18. g4 h5 19. Qg1 hxg4 20. Qxg4
Bf8 21. h4 Qe7 22. Rhc1 g6 23. Rc2 Kd8 24. Rac1 Qe8 25. Rc7 Rc8 26. Rxc8+
Bxc8 27. Rc6 Bb7 28. Rc2 Kd7 29. Ng5 Be7 30. Bxg6 Bxg5 31. Qxg5 fxg6 32. f5
Rg8 33. Qh6 Qf7 34. f6 Kd8 35. Kd2 Kd7 36. Rc1 Kd8 37. Qe3 Qf8 38. Qc3 Qb4
39. Qxb4 axb4 40. Rg1 b3 41. Kc3 Bc8 42. Kxb3 Bd7 43. Kb4 Be8 44. Ra1 Kc7
45. a5 Bd7 46. axb6+ Kxb6 47. Ra6+ Kb7 48. Kc5 Rd8 49. Ra2 Rc8+ 50. Kd6 Be8
51. Ke7 g5 52. hxg5 1-0
[/pgn]

Not sure if 18.g4!, 30.Bxg6! and others would be found by current engines.

[d]r2qk2r/3bbppp/1p2p3/pP1pP3/P2P1P2/3BKN2/6PP/R2Q3R w kq - 0 18
After Black's 17...b6, can any engine consider 18.g4! in this position?



[d]4q2r/1b1kbp2/1p2p1p1/pP1pP1N1/P2P1PQP/3BK3/2R5/8 w - - 6 30
After 29...Be7, can current engines consider 30.Bxg6! here?


It would be nice if we could feed some difficult EPD positions into AlphaZero,
to estimate its Elo strength.
I am considering g4 and Bxg6; g4 especially was my first choice in under a second.
People frequently don't believe that advanced long pawn chains are stronger than a piece, but see what happens here...

I guess the main fault is that they are testing at 1 minute per move. Long pawn chains have failed in SF testing at least 5 times or so, and still they are an extremely valid concept.

Books are important. As I have been correctly claiming for a very long time, the French is dangerous or even lost, but people won't listen.
The game was lost much earlier, already in the opening.
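A quick way to try such positions with a conventional UCI engine is a minimal python-chess sketch along the following lines; the engine path and the 60-second search limit are assumptions for illustration, not the match settings:

[code]
# Minimal sketch (assumed engine path and search time): ask a UCI engine
# whether it picks 18.g4 in the first diagrammed position, using python-chess.
import chess
import chess.engine

FEN = "r2qk2r/3bbppp/1p2p3/pP1pP3/P2P1P2/3BKN2/6PP/R2Q3R w kq - 0 18"

board = chess.Board(FEN)
engine = chess.engine.SimpleEngine.popen_uci("./stockfish")  # assumed local path
info = engine.analyse(board, chess.engine.Limit(time=60))
print("best move:", board.san(info["pv"][0]), "score:", info["score"])
engine.quit()
[/code]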
MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: MCTS-NN vs alpha-beta

Post by MikeGL »

Lyudmil Tsvetkov wrote:
MikeGL wrote:
kranium wrote:
Lyudmil Tsvetkov wrote: It is not at all clear to me where books were used and where not.
I'm sure opening books were not used...
In the early self-play games things like 1.a3, 1.a4, etc. were probably tried by AlphaZero...
eventually it learned that 1. e4 or 1. d4 had the highest success rates.
Books or no books, I think AlphaZero would still demolish SF8.
Just look at game 9: it was a decent French Defence by SF8, but it was dismantled with
amazing tactical and strategic shots by AlphaZero that seem to be beyond the reach of alpha-beta engines.

[pgn]
[Event "?"]
[Site "?"]
[Date "2017.12.06"]
[Round "9"]
[White "AlphaZero"]
[Black "Stockfish"]
[Result "1-0"]
[TimeControl "40/1260:300"]
[Termination "normal"]
[PlyCount "103"]
[WhiteType "human"]
[BlackType "human"]

1. d4 e6 2. e4 d5 3. Nc3 Nf6 4. e5 Nfd7 5. f4 c5 6. Nf3 cxd4 7. Nb5 Bb4+ 8.
Bd2 Bc5 9. b4 Be7 10. Nbxd4 Nc6 11. c3 a5 12. b5 Nxd4 13. cxd4 Nb6 14. a4
Nc4 15. Bd3 Nxd2 16. Kxd2 Bd7 17. Ke3 b6 18. g4 h5 19. Qg1 hxg4 20. Qxg4
Bf8 21. h4 Qe7 22. Rhc1 g6 23. Rc2 Kd8 24. Rac1 Qe8 25. Rc7 Rc8 26. Rxc8+
Bxc8 27. Rc6 Bb7 28. Rc2 Kd7 29. Ng5 Be7 30. Bxg6 Bxg5 31. Qxg5 fxg6 32. f5
Rg8 33. Qh6 Qf7 34. f6 Kd8 35. Kd2 Kd7 36. Rc1 Kd8 37. Qe3 Qf8 38. Qc3 Qb4
39. Qxb4 axb4 40. Rg1 b3 41. Kc3 Bc8 42. Kxb3 Bd7 43. Kb4 Be8 44. Ra1 Kc7
45. a5 Bd7 46. axb6+ Kxb6 47. Ra6+ Kb7 48. Kc5 Rd8 49. Ra2 Rc8+ 50. Kd6 Be8
51. Ke7 g5 52. hxg5 1-0
[/pgn]

Not sure if 18.g4!, 30.Bxg6! and others would be found by current engines.

[d]r2qk2r/3bbppp/1p2p3/pP1pP3/P2P1P2/3BKN2/6PP/R2Q3R w kq - 0 18
After Black's 17...b6, can any engine consider 18.g4! in this position?



[d]4q2r/1b1kbp2/1p2p1p1/pP1pP1N1/P2P1PQP/3BK3/2R5/8 w - - 6 30
After 29...Be7, can current engines consider 30.Bxg6! here?


It would be nice if we could feed some difficult EPD positions into AlphaZero,
to estimate its Elo strength.
I am considering g4 and Bxg6; g4 especially was my first choice in under a second.
People frequently don't believe that advanced long pawn chains are stronger than a piece, but see what happens here...

I guess the main fault is that they are testing at 1 minute per move. Long pawn chains have failed in SF testing at least 5 times or so, and still they are an extremely valid concept.

Books are important. As I have been correctly claiming for a very long time, the French is dangerous or even lost, but people won't listen.
The game was lost much earlier, already in the opening.
LOL, there go your ridiculous claims again. You haven't claimed the French is lost. Let's be
honest, please: you only claimed the French WINAWER is lost, nothing else.
You claimed all other variations of the French are playable, but not the WINAWER line.

The Winawer discussion was a thread on its own, I remember clearly.
Last edited by MikeGL on Thu Dec 07, 2017 1:51 pm, edited 1 time in total.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Much weaker than Stockfish

Post by Lyudmil Tsvetkov »

MikeGL wrote:
Lyudmil Tsvetkov wrote:
MikeGL wrote:
Lyudmil Tsvetkov wrote:
MikeGL wrote:
Lyudmil Tsvetkov wrote: The training matches are different from the 100-game match with Stockfish.
Yes, the plot in the diagram is from the training games, but 100 games per opening were played, 50 with each colour, and the score below the diagram is from AlphaZero's perspective.
12 openings with reversed colours don't square in any way with 100 played games, so did they actually play some openings more than others, or did they not flip colours?
12 openings x 100 = 1,200 games total.
Before we were talking about 300 and 100, now 1200 suddenly appears...
The 64/36 score certainly comes from 100 games, unless they assigned random points for a win.
And in that sample, I see Alpha playing just 1.d4 and 1.Nf3.
Read the series of posts properly. It is 100 games per opening; you clearly don't understand Table 2.

300 games, because you were talking about 1.e4 earlier, which appears in 6 diagrams.
How much is 50 x 6?
You were claiming that AlphaZero didn't play 1.e4; I told you it did! It played 1.e4 against SF8 300 times.

See the total summation below: 1,200 games for all 12 openings. Come on, man, do we really argue over even this very
basic stuff?
Do you have the PGNs for the training games, which, btw, are claimed to run into the thousands?
Note that all training games are self-play (no SF8 involved). The 1,200 are all match games against SF8.
No data is given in the PDF about the total number of self-play games used for learning, nor were the self-play PGNs published. Only the SF8 match games were published.
A 100-game match versus SF8 was played for each of the 12 common ECO openings.
I guess you are confused by the plotted graph being put beside the
diagram. The graph is self-play; the diagram is the SF8 match.
Try to read the caption of Table 2 properly, 37 times if need be.
I am unable to read the PDF at all; when I open it, I get a headache.
I usually don't read works that just reference other material; you either do something original, or better do nothing at all.

I guess too much info is simply tremendously unclear.
You still persist with your claim that the 100-game match was played with 12 different openings.
100 % 12 = 4, so there is a remainder, and this is fully absurd.
MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: Much weaker than Stockfish

Post by MikeGL »

Lyudmil Tsvetkov wrote:
MikeGL wrote:
Lyudmil Tsvetkov wrote:
MikeGL wrote:
Lyudmil Tsvetkov wrote:
MikeGL wrote:
Lyudmil Tsvetkov wrote: The training matches are different from the 100-game match with Stockfish.
Yes, the plot in the diagram is from the training games, but 100 games per opening were played, 50 with each colour, and the score below the diagram is from AlphaZero's perspective.
12 openings with reversed colours don't square in any way with 100 played games, so did they actually play some openings more than others, or did they not flip colours?
12 openings x 100 = 1,200 games total.
Before we were talking about 300 and 100, now 1200 suddenly appears...
The 64/36 score certainly comes from 100 games, unless they assigned random points for a win.
And in that sample, I see Alpha playing just 1.d4 and 1.Nf3.
Read the series of posts properly. It is 100 games per opening; you clearly don't understand Table 2.

300 games, because you were talking about 1.e4 earlier, which appears in 6 diagrams.
How much is 50 x 6?
You were claiming that AlphaZero didn't play 1.e4; I told you it did! It played 1.e4 against SF8 300 times.

See the total summation below: 1,200 games for all 12 openings. Come on, man, do we really argue over even this very
basic stuff?
Do you have the PGNs for the training games, which, btw, are claimed to run into the thousands?
Note that all training games are self-play (no SF8 involved). The 1,200 are all match games against SF8.
No data is given in the PDF about the total number of self-play games used for learning, nor were the self-play PGNs published. Only the SF8 match games were published.
A 100-game match versus SF8 was played for each of the 12 common ECO openings.
I guess you are confused by the plotted graph being put beside the
diagram. The graph is self-play; the diagram is the SF8 match.
Try to read the caption of Table 2 properly, 37 times if need be.
100 % 12 = 4, so there is a remainder, and this is fully absurd.
1,200 games it was; read the table correctly. You are insisting the total was only 100 games; yes, 100 for EACH opening, but there are 12 openings, so 12 times 100 is 1,200.
Are you serious? This is pre-school basic math.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Much weaker than Stockfish

Post by Lyudmil Tsvetkov »

KWRegan wrote:
Lyudmil Tsvetkov wrote:Why don't they disclose what their evaluation is: that will be a big step towards knowing the truth.
They can't. The evaluation is a sequence of numbers specifying myriad weights on umpteen-dozen layers of a neural network. This aspect (of the original AlphaGo) in contrast to Stockfish is addressed in my Feb. 2016 article https://rjlipton.wordpress.com/2016/02/07/magic-to-do/ That this is endemic to "deep learning" has energized a counter-push toward "Explainable AI."


What I wish to know better, incidentally, is the memory footprint of their trained network and how portable it is.
In what way does the fact that it uses exponentially more eval terms/neurons than SF make it impossible to release the code?

They are still tuning at the level of a 2850 single-core engine, so things will just get significantly more difficult in the future, when the quality of the terms will have a much higher impact.

Btw, if they did that in 4 hours, then some 400 hours from now, that is, in 2 weeks or so, they will have solved chess.
Do you really believe that?
Let's bet there will not be a new update in 2 weeks' time claiming they have reached the 4000 Elo level.

Not only that, but there will not be a new update in a month or 2 months' time, and probably not in half a year either.

That says it all about the 4-hour estimate...
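As an aside on KWRegan's question about the memory footprint of the trained network, a back-of-the-envelope Python sketch; the parameter count used here is a hypothetical placeholder, not the paper's figure:

[code]
# Back-of-the-envelope sketch: memory footprint of a trained network,
# assuming a HYPOTHETICAL parameter count and 32-bit float weights.
PARAMS = 50_000_000          # hypothetical placeholder; not the paper's figure
BYTES_PER_WEIGHT = 4         # float32

footprint_mb = PARAMS * BYTES_PER_WEIGHT / (1024 ** 2)
print(f"~{footprint_mb:.0f} MB")   # ~191 MB under these assumptions
[/code]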