Historic Milestone: AlphaZero


MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: MCTS-NN vs alpha-beta

Post by MikeGL »

pferd wrote:
MikeGL wrote: [d]4q2r/1b1kbp2/1p2p1p1/pP1pP1N1/P2P1PQP/3BK3/2R5/8 w - - 6 30
After 29...Be7, can current engines consider 30.Bxg6! here?


It would be nice if we could feed some difficult EPD positions into AlphaZero
to estimate its Elo strength.
A slightly modified version of Stockfish gets the second position right, but it needed more than 13 minutes (depth 47) with 6 threads on my machine to bring this move to the top.

The final output:

info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 nps 9884116 hashfull 999 tbhits 8278452 time 1560415 pv d3g6 e7g5 8 g2g6 f8h8 h4h5 d8e8 c1d2 c8d7 h5h6 h8h7 g6g8 e8f7 g8b8 h7h6 b8b6 h6h2 d2d3 h2h3 d3e2 d7e8 b6a6 h3h2 e2f3 h2h3 f3f2 h3h2 f2g3 h2d2 a
bestmove d3g6 ponder e7g5
I also tried the first position, but Stockfish did not come up with g4 after half an hour.
g2g6 f8h8 in your line above looks incorrect; it is an illegal move.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: MCTS-NN vs alpha-beta

Post by Lyudmil Tsvetkov »

pferd wrote:
MikeGL wrote: [d]4q2r/1b1kbp2/1p2p1p1/pP1pP1N1/P2P1PQP/3BK3/2R5/8 w - - 6 30
After 29...Be7, can current engines consider 30.Bxg6! here?


It would be nice if we could feed some difficult EPD positions into AlphaZero
to estimate its Elo strength.
A slightly modified version of Stockfish gets the second position right, but it needed more than 13 minutes (depth 47) with 6 threads on my machine to bring this move to the top.

The final output:

info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 nps 9884116 hashfull 999 tbhits 8278452 time 1560415 pv d3g6 e7g5 8 g2g6 f8h8 h4h5 d8e8 c1d2 c8d7 h5h6 h8h7 g6g8 e8f7 g8b8 h7h6 b8b6 h6h2 d2d3 h2h3 d3e2 d7e8 b6a6 h3h2 e2f3 h2h3 f3f2 h3h2 f2g3 h2d2 a
bestmove d3g6 ponder e7g5
I also tried the first position, but Stockfish did not come up with g4 after half an hour.
Because the middlegame is much more complex.
MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: Much weaker than Stockfish

Post by MikeGL »

Lyudmil Tsvetkov wrote:
KWRegan wrote:
Lyudmil Tsvetkov wrote: Why don't they disclose what their evaluation is? That would be a big step towards knowing the truth.
They can't. The evaluation is a sequence of numbers specifying myriad weights on umpteen-dozen layers of a neural network. This aspect (of the original AlphaGo), in contrast to Stockfish, is addressed in my Feb. 2016 article (https://rjlipton.wordpress.com/2016/02/07/magic-to-do/). That this is endemic to "deep learning" has energized a counter-push toward "Explainable AI."


What I wish to know better, incidentally, is the memory footprint of their trained network and how portable it is.
They are still tuning at the level of a 2850-rated single-core engine, so things will just get significantly more difficult in the future, when the quality of the terms will have a much higher impact.
As mentioned in the paper, the eval is non-linear, unlike current engines, which
use linear eval functions. They are not tuning the eval; the AI itself is tuning the eval
autonomously, without human input.
pferd
Posts: 134
Joined: Thu Jul 24, 2014 2:49 pm

Re: MCTS-NN vs alpha-beta

Post by pferd »

MikeGL wrote:
pferd wrote:
MikeGL wrote: [d]4q2r/1b1kbp2/1p2p1p1/pP1pP1N1/P2P1PQP/3BK3/2R5/8 w - - 6 30
After 29...Be7, can current engines consider 30.Bxg6! here?


It would be nice if we could feed some difficult EPD positions into AlphaZero
to estimate its Elo strength.
A slightly modified version of Stockfish gets the second position right, but it needed more than 13 minutes (depth 47) with 6 threads on my machine to bring this move to the top.

The final output:

info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 nps 9884116 hashfull 999 tbhits 8278452 time 1560415 pv d3g6 e7g5 8 g2g6 f8h8 h4h5 d8e8 c1d2 c8d7 h5h6 h8h7 g6g8 e8f7 g8b8 h7h6 b8b6 h6h2 d2d3 h2h3 d3e2 d7e8 b6a6 h3h2 e2f3 h2h3 f3f2 h3h2 f2g3 h2d2 a
bestmove d3g6 ponder e7g5
I also tried the first position, but Stockfish did not come up with g4 after half an hour.
g2g6 f8h8 in your line above looks incorrect; it is an illegal move.
Nice catch! Something went terribly wrong when copying the output from the console window.

This is the correct output:

info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 nps 9884116 hashfull 999 tbhits 8278452 time 1560415 pv d3g6 e7g5 g4g5 f7g6 f4f5 h8g8 g5h6 e8f7 f5f6 d7d8 e3d2 b7c8 h6e3 c8b7 d2c1 f7h7 e3a3 h7h6 c1d1 h6f8 a3c3 f8f7 d1c1 g8h8 c3a3 f7f8 a3f8 h8f8 c2g2 b7c8 g2g6 f8h8 h4h5 d8e8 c1d2 c8d7 h5h6 h8h7 g6g8 e8f7 g8b8 h7h6 b8b6 h6h2 d2d3 h2h3 d3e2 d7e8 b6a6 h3h2 e2f3 h2h3 f3f2 h3h2 f2g3 h2d2 a6a5 d2d4 a5a7 f7f8 b5b6 d4b4 b6b7 d5d4 g3f4 b4b2 f6f7 e8a4 a7a8 f8f7 b7b8q b2b8 a8b8 a4c6 b8b4 d4d3 f4e3 f7e7 e3d3
MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: MCTS-NN vs alpha-beta

Post by MikeGL »

pferd wrote:
MikeGL wrote:
pferd wrote:
MikeGL wrote: [d]4q2r/1b1kbp2/1p2p1p1/pP1pP1N1/P2P1PQP/3BK3/2R5/8 w - - 6 30
After 29...Be7, can current engines consider 30.Bxg6! here?


It would be nice if we could feed some difficult EPD positions into AlphaZero
to estimate its Elo strength.
A slightly modified version of Stockfish gets the second position right, but it needed more than 13 minutes (depth 47) with 6 threads on my machine to bring this move to the top.

The final output:

info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 nps 9884116 hashfull 999 tbhits 8278452 time 1560415 pv d3g6 e7g5 8 g2g6 f8h8 h4h5 d8e8 c1d2 c8d7 h5h6 h8h7 g6g8 e8f7 g8b8 h7h6 b8b6 h6h2 d2d3 h2h3 d3e2 d7e8 b6a6 h3h2 e2f3 h2h3 f3f2 h3h2 f2g3 h2d2 a
bestmove d3g6 ponder e7g5
I also tried the first position, but Stockfish did not come up with g4 after half an hour.
g2g6 f8h8 in your line above looks incorrect; it is an illegal move.
Nice catch! Something went terribly wrong when copying the output from the console window.

This is the correct output:

info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 nps 9884116 hashfull 999 tbhits 8278452 time 1560415 pv d3g6 e7g5 g4g5 f7g6 f4f5 h8g8 g5h6 e8f7 f5f6 d7d8 e3d2 b7c8 h6e3 c8b7 d2c1 f7h7 e3a3 h7h6 c1d1 h6f8 a3c3 f8f7 d1c1 g8h8 c3a3 f7f8 a3f8 h8f8 c2g2 b7c8 g2g6 f8h8 h4h5 d8e8 c1d2 c8d7 h5h6 h8h7 g6g8 e8f7 g8b8 h7h6 b8b6 h6h2 d2d3 h2h3 d3e2 d7e8 b6a6 h3h2 e2f3 h2h3 f3f2 h3h2 f2g3 h2d2 a6a5 d2d4 a5a7 f7f8 b5b6 d4b4 b6b7 d5d4 g3f4 b4b2 f6f7 e8a4 a7a8 f8f7 b7b8q b2b8 a8b8 a4c6 b8b4 d4d3 f4e3 f7e7 e3d3
Thanks for this line. Looks like your modded SF version correctly finds the winning line played by AlphaZero.
pferd
Posts: 134
Joined: Thu Jul 24, 2014 2:49 pm

Re: MCTS-NN vs alpha-beta

Post by pferd »

MikeGL wrote:
Thanks for this line. Looks like your modded SF version correctly finds the winning line played by AlphaZero.
There is nothing special about this version. It contains just 2 small changes:
1) I removed the lazy eval.
2) I applied pull request #1289

I will give Stockfish master a try and see how things work out...
duncan
Posts: 12038
Joined: Mon Jul 07, 2008 10:50 pm

Re: Historic Milestone: AlphaZero

Post by duncan »

Lyudmil Tsvetkov wrote:
duncan wrote:
maac wrote: https://arxiv.org/pdf/1712.01815.pdf

To say that I'm impressed is an understatement. Shocking.
Take note that SF was at 64 cores!
25 wins for White and only 3 wins for Black. Is such a big difference what you would expect?
When I claimed that the stronger the engines are, the more they score with White, everyone laughed.
At some point, there will be only 11111111111111111s.
But it is 'only' 100 Elo stronger.
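
As a rough check on that figure: under the usual logistic Elo model, a match score s corresponds to a rating difference of 400 * log10(s / (1 - s)). The sketch below assumes the 100-game result reported in the linked paper (28 wins, 72 draws and 0 losses for AlphaZero, the 28 wins being the 25 + 3 quoted above). The function name elo_diff is only illustrative.

```python
import math

def elo_diff(wins: int, draws: int, losses: int) -> float:
    """Rating difference implied by a match result under the logistic Elo model."""
    games = wins + draws + losses
    score = (wins + 0.5 * draws) / games  # draws count as half a point
    if score <= 0.0 or score >= 1.0:
        raise ValueError("Elo difference is unbounded for a 0% or 100% score")
    return 400.0 * math.log10(score / (1.0 - score))

# Assumed match result: 28 wins, 72 draws, 0 losses, i.e. a 64% score.
print(round(elo_diff(28, 72, 0)))  # prints 100
```

So a 64% score with no losses at all still works out to only about 100 Elo, because the estimate depends solely on the score percentage.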
pferd
Posts: 134
Joined: Thu Jul 24, 2014 2:49 pm

Re: MCTS-NN vs alpha-beta

Post by pferd »

pferd wrote: I will give stockfish-master a try and see how things work out...

info depth 49 seldepth 93 multipv 1 score cp 160 nodes 20247541053 nps 10265501 hashfull 999 tbhits 10770118 time 1972387 pv d3g6 e7g5 g4g5 f7g6 f4f5 h8g8 g5h6 e8f7 f5f6 d7d8 e3f3 g6g5 h4g5 g8g6 h6h4 g6g8 c2f2 b7c8 f3g4 c8d7 h4h5 g8g6 f2h2 f7g8 g4f4 d7e8 h5h7 e8f7 h7g8 g6g8 h2c2 g8h8 g5g6 f7g6 c2c6 g6f7 c6b6 d8c8 b6a6 c8b7 a6a5 h8h4 f4e3 h4h3 e3d2 f7g6 a5a6 h3d3 d2c2 d3d4 c2b3 d4d3 b3b4 d3d4 b4c5 d4c4 c5d6 d5d4 d6e6 d4d3 a6d6 c4c8 d6d7 b7b6
Master finds the key move too, but with a slightly different PV, which should also be winning.
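
For anyone who wants to sanity-check pasted engine output before posting it, here is a minimal sketch in plain Python (no chess library). It splits a UCI "info" line into fields and flags PV tokens that are not in coordinate notation. The names parse_uci_info and malformed_moves are made up for illustration, and the sample is a shortened copy of the truncated line from earlier in the thread.

```python
import re

# Coordinate notation such as d3g6, with an optional promotion piece (b7b8q).
MOVE_RE = re.compile(r"^[a-h][1-8][a-h][1-8][qrbn]?$")
INT_FIELDS = {"depth", "seldepth", "multipv", "nodes", "nps",
              "hashfull", "tbhits", "time"}

def parse_uci_info(line: str) -> dict:
    """Parse a UCI 'info' line into a dict; 'pv' becomes a list of move tokens."""
    tokens = line.split()
    if not tokens or tokens[0] != "info":
        raise ValueError("not a UCI info line")
    info = {}
    i = 1
    while i < len(tokens):
        key = tokens[i]
        if key == "pv":
            info["pv"] = tokens[i + 1:]  # everything after 'pv' is the line
            break
        if key == "score" and i + 2 < len(tokens):
            info["score"] = (tokens[i + 1], int(tokens[i + 2]))  # e.g. ('cp', 121)
            i += 3
        elif key in INT_FIELDS and i + 1 < len(tokens):
            info[key] = int(tokens[i + 1])
            i += 2
        else:
            i += 1  # skip anything this sketch does not handle
    return info

def malformed_moves(pv: list) -> list:
    """Return PV tokens that do not look like coordinate-notation moves."""
    return [tok for tok in pv if not MOVE_RE.match(tok)]

sample = ("info depth 49 seldepth 82 multipv 1 score cp 121 nodes 15423323813 "
          "nps 9884116 hashfull 999 tbhits 8278452 time 1560415 "
          "pv d3g6 e7g5 8 g2g6 f8h8")
info = parse_uci_info(sample)
print(info["depth"], info["score"], info["time"] / 60000.0)  # time is in ms
print(malformed_moves(info["pv"]))
```

This only checks the format of each token, so it catches the stray "8" (and would catch the trailing "a") in the truncated line, but it cannot tell that g2g6 is illegal in the position. That would need a real move generator.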
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: Historic Milestone: AlphaZero

Post by Milos »

Werewolf wrote: 1 GB hash allows them to say SF nps are really high while hiding that they’ve weakened its search.

I would like to see a rematch at a better time control, much more hash, all tablebases, and a tournament quality opening book and the latest asmfish.

I bet it’d be much closer then.
If you look at the results for the B40 Sicilian, the difference is only 38.5 Elo.
With the latest Brainfish using Cerebellum limited to, for example, 12 moves, 32 GB hash, 6-men Syzygy and a 60'+15'' TC, I'm pretty confident Alpha0 would not win; at best the result would be indecisive.
If I had access to the training games, I could construct a book that would give SF a +100 Elo advantage without much trouble.
Lion
Posts: 531
Joined: Fri Mar 31, 2006 1:26 pm
Location: Switzerland

Re: Historic Milestone: AlphaZero

Post by Lion »

While I understand why you would like to do that (so would I), I believe it will never happen.
This is surely not the original intent of the team that worked on AlphaZero.

They just proved that by letting an AI develop for a few hours, it could become more than a match for a regular program on which several thousand hours were spent...

Unfortunately, the proof is done and they will likely move on to the next thing. SF with an opening book, this or that... is most probably of little interest to them.

rgds