AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 12:56 pm

Milos wrote:
abulmo2 wrote:Alpha0 may have a good time management too. This is quite trivial to implement. Given the data used by Alpha0, (moves with probabilities of goodness) it should even be possible to provide a better time management for alpha0 than it is for Stockfish.
Speculation, no proof of that.
If it had been possible they wouldn't play 1min/move but some real TC. You think those ppl in Google are stupid and don't know what TC is the most common in chess? Or it is the case that they have intentionally chosen ridiculous TC because it favours Alpha0 the most?

Alpha0 can use an opening book too. And with the computational power available for it, probably a better one than the Cerebellum book.
Weights of NN are already behaving like a book, any additional book would just degrade the performance. You seem not to quite understand how Alpha0 operates.

And all that on much weaker hardware.
I disagree here. Alpha0 uses more efficient hardware, but not bigger one.
So you find it fair comparing Alpha0 on more efficient (specialized) hardware with SF on general purpose hardware.
SF's eval could easily be ported to FPGAs, there are already available solutions: you use 100s of Xilinx Ultrascale+ boards for eval and a single Haswell CPU for search, something like a DeepBlue. That kind of configuration could easily be 300-400 Elo stronger than the current SF on a 64 core machine.

Here I give +10.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 12:58 pm

pilgrimdan wrote:
Lyudmil Tsvetkov wrote:
Lion wrote:I agree with you.
Also, what the people who claim the HW was much faster don’t understand is that the thing learned by itself in a very short time!

What if we now give it 1 Year to further learn?

Side note, I looked at the games and they are really impressive!
I would not be surprised, if in a year's time it also starts speaking and thinking of itself.
I, Robot.
there may be a fine line
between self-learning
and self-consciousness ...

I guess it is more like very gross.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 1:00 pm

Uri Blass wrote:
Lyudmil Tsvetkov wrote:
Uri Blass wrote:
Lyudmil Tsvetkov wrote:
kranium wrote:
Lyudmil Tsvetkov wrote:
clumma wrote:
Lyudmil Tsvetkov wrote:Alpha had considerable hardware advantage
That comparison is not straightforward, but this claim does not seem to be true. SF had 64 threads. I'm not up on the latest scaling behavior of the engine but that has got to be near saturation.

-Carl
From what I gleaned from hardware comparisons, the advantage is 16/1.
Why would one want to run a similar very unfair match?
Only one thing comes to mind: that the company will want to advertise its colossal breakthrough with TPUs and artificial intelligence and then sell its products.

But then, the achievement is not there.
The fact that Google has created a chess playing entity that crushes SF is notable (and fascinating).

TPUs are not for sale, and (at the moment) are applied only to Google's deep learning and research projects,
except when Google donates them to research for free.

https://techcrunch.com/2017/05/17/the-t ... cientists/
What would be the score between SF on 64 cores and SF on 1024 cores out of 100 games?
You think the bigger-hardware SF would score less than 64 points?
I guess at least 80.

So what is so new?
They applied some big hardware, that is all.
The real strength of Alpha is 2850, so around spot 97 or so among engines.
97 is not such a bad achievement, after all.
I doubt if SF on 1024 cores is going to score even 50%
Maybe after some point more cores are counter productive for stockfish.

I also doubt if it is possible to get at least 80 points against stockfish with 64 cores at 1 minute per move.
Why not?
How much would SF 16-cores vs SF single core score, that is easily reproducible.
The experts claim the TPUs lack any SMP inefficiencies.
If you give the engines 10 minutes per move, so that they play almost perfect chess, then I guess you will get less than 80%; and 1 minute per move for 64 cores is probably stronger than 10 minutes per move for 1 core.

Actually, the hardware advantage + simulated opening book (very important, as SF lost most games already in the opening) was close to 50/1, so how will SF 50 cores fare against SF single core?

Certainly somewhere 95% or so.
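
To put the percentages traded in this exchange on a common scale, here is a minimal sketch that converts a match score into an Elo difference using the standard logistic Elo formula. The function names are made up for illustration, and the formula only says what rating gap a given score would imply; it says nothing about how Stockfish actually scales with cores.

```python
import math


def elo_diff_from_score(score: float) -> float:
    """Elo difference implied by a score fraction (0 < score < 1)."""
    if not 0.0 < score < 1.0:
        raise ValueError("score must be strictly between 0 and 1")
    return 400.0 * math.log10(score / (1.0 - score))


def expected_score(elo_diff: float) -> float:
    """Inverse mapping: expected score fraction for a given Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))


# The three figures mentioned in the thread: 64, 80 and 95 percent.
for pct in (0.64, 0.80, 0.95):
    print(f"{pct:.0%} score -> about {elo_diff_from_score(pct):+.0f} Elo")
```

Under this formula a 64% score corresponds to roughly 100 Elo, 80% to roughly 240 Elo, and 95% to a little over 500 Elo, so the guesses above differ by several hundred rating points.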

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 1:03 pm

chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.

Again, 80% of games were already decided in the early opening.
Due to the opening book Alpha unfairly used.
In terms of evaluation, chess is 1000 times more complex than Go, so we will simply never see big advances with this approach.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 1:10 pm

Eelco de Groot wrote:
chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.
I am just looking at the first game where Alpha Zero wins with Black, but it seems to me that it excels specifically in showing big holes in Stockfish's eval. That this happens in the endgame is not a big surprise.

For instance this position from that game is lost, but even Kaissa needs a long time to see that and it knows a little bit about the power of the bishop pair, but apparently is still blind and this is after going backwards from about move 40...

[D]5bk1/r5p1/7p/2p1N3/4PP2/1P1P2Pb/2P4P/6RK w - -
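
As a rough illustration of why plain material counting does not flag this position as lost, here is a small sketch that tallies material from the FEN above. The 1/3/3/5/9 piece values are the conventional textbook numbers, assumed here for illustration only; they are not what Stockfish, Kaissa HT, or AlphaZero actually use.

```python
from collections import Counter

# Conventional textbook values (an assumption for illustration only).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}


def material(fen: str):
    """Return (point totals, piece counts) per side from a FEN string."""
    placement = fen.split()[0]  # only the piece-placement field is needed
    counts = {"white": Counter(), "black": Counter()}
    for ch in placement:
        if ch.isalpha():  # digits are empty squares, '/' separates ranks
            side = "white" if ch.isupper() else "black"
            counts[side][ch.lower()] += 1
    totals = {
        side: sum(PIECE_VALUES[piece] * n for piece, n in c.items())
        for side, c in counts.items()
    }
    return totals, counts


FEN = "5bk1/r5p1/7p/2p1N3/4PP2/1P1P2Pb/2P4P/6RK w - -"
totals, counts = material(FEN)
print(totals)
print(dict(counts["white"]), dict(counts["black"]))
```

On these nominal values White is one point ahead (15 to 14: rook, knight and seven pawns against rook, bishop pair and three pawns), which fits the point that a material-oriented evaluation needs a very deep search before it sees the problem.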

Engine: Kaissa HT (512 MB)
by T. Romstad, M. Costalba, J. Kiiski, G. Linscott

23/34 0:01 -0.13 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Bg4 44.d4 c4
45.Rb4 Kd6 46.Rxc4 Rxb5 47.Rb4 Bd7
48.h4 (16.663.353) 11013

24/37 0:02 -0.08 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Bg4 44.d4 c4
45.Rb4 Kd6 46.Rxc4 Rxb5 47.Rb4 Bd7
48.h4 (23.026.784) 10991

25/47 0:02 -0.08 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 h5 43.Ra1 cxb4
44.Ra6 Be6 45.d4+ Kf6 46.cxb4 Ke7
47.Ra5 Rxb4 48.Rxg5 (28.983.490) 10966

26/43 0:03 -0.08 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 h5 43.Ra1 cxb4
44.Ra6 Be6 45.d4+ Kf6 46.cxb4 Ke7
47.Ra5 Rxb4 48.Rxg5 (35.577.502) 10866

27/42 0:03 -0.15-- 35.Rc1 g5 (36.923.800) 10847

27/42 0:03 -0.16 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Kd6 44.d4 Ra7
45.Kd3 Rf7 46.Ke3 Bg4 47.Kd3 Be6
48.d5 (37.708.832) 10807

28/34 0:04 -0.09++ 35.Rc1 (51.000.093) 10644

28/46 0:05 -0.19 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Ra7 44.d4 Kd6
45.Rb2 c4 46.e5+ Kd5 47.b6 Rb7
48.Rb5+ (53.393.487) 10653

29/48 0:06 -0.12++ 35.Rc1 (68.805.142) 10682

29/48 0:06 -0.23 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Ra7 44.d4 Kd6
45.Rb2 cxd4+ 46.Kxd4 Ra4+ 47.Rb4 Rxb4+
48.cxb4 (74.412.423) 10676

30/43 0:07 -0.16++ 35.Rc1 (82.421.949) 10673

30/46 0:08 -0.19 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Ra7 44.Rb2 Kd6
45.Rf2 Be6 46.b6 Rb7 47.Rf6 Rxb6
48.d4 (93.753.062) 10713

31/49 0:13 -0.12++ 35.Rc1 (139.848.293) 10743

31/49 0:23 -0.08 35.Rc1 Bd6 36.Nc4 Bc7 37.Kg1 Kf7
38.Kf2 Ra2 39.Ke3 Be6 40.e5 g6 41.d4 cxd4+
42.Kxd4 Ra8 43.Kc3 Rc8 44.Kd3 Rd8+
45.Kc3 Ra8 46.Kd4 Ke7 47.Kc3 Rc8
48.Kd3 (244.712.523) 10606
.
.
.
36/56 4:49 -1.11 35.Ng6 Bd6 36.Re1 Kf7 37.f5 Ra2
38.Rc1 c4 39.dxc4 Bc5 40.Nf4 Be3
41.Re1 Bxf4 42.gxf4 Rxc2 43.c5 Rxc5
44.Kg1 Bg4 45.Re3 Rb5 46.Rg3 Bd1
47.Rc3 Rb7 48.Kf2 (3.200.571.460) 11070
.
.
.
41/57 42:36 -1.62 35.Ng6 Bd6 36.Re1 Kf7 37.f5 Ra2
38.Rc1 c4 39.dxc4 h5 40.Nf4 Bg4
41.Kg2 Ba3 42.Rf1 Bc5 43.h3 Rxc2+
44.Kh1 Be2 45.Nxe2 Rxe2 46.Rd1 Rxe4
47.Rd7+ Kf6 48.Kg2 (28.571.692.284) 11175

42/53 43:11 -1.55++ 35.Ng6 (28.964.033.275) 11177

42/54 43:40 -1.57 35.Ng6 Bd6 36.Re1 Kf7 37.f5 Ra2
38.Rc1 c4 39.dxc4 h5 40.Nf4 Bg4
41.Kg2 Ba3 42.Rf1 Bc5 43.h3 Rxc2+
44.Kh1 Be2 45.Nxe2 Rxe2 46.Rd1 Rxe4
47.Rd7+ Kf6 48.Kg2 (29.291.863.573) 11179

43/58 44:39 -1.64-- 35.Ng6 Bd6 (29.972.203.243) 11186

43/64 58:05 -1.57++ 35.Nc4 (38.982.302.684) 11183

43/64 61:17 -1.45++ 35.Nc4 (41.056.108.916) 11164

43/67 65:26 -1.42 35.Nc4 g5 36.Rc1 Bg7 37.Ne5 Bxe5
38.fxe5 Kf7 39.Kg1 Ke6 40.Kf2 Kxe5
41.Ke3 Ra2 42.Kf3 Be6 43.Ke3 Rb2
44.h4 g4 45.b4 Rxb4 46.Ra1 Rb2
47.Kd2 Kd4 48.Ra8 (43.816.279.380) 11159

44/60 69:13 -1.50-- 35.Nc4 g5 (46.318.745.058) 11150

44/63 72:37 -1.57-- 35.Nc4 g5 (48.608.234.304) 11156

44/63 84:14 -1.69-- 35.Nc4 g5 (56.236.277.156) 11126

Lyudmil should appreciate that it is specifically playing Anti-Stockfish chess. Opponent modeling. If it would play against Lyudmil, in 4 hours it would not just beat him but show him where to improve his game. I think Chessbase would love to have a tool like this for sale.

I'm not sure the team is willing to pursue chess, I have not read much of the paper but I understood they are not interested in chess? After Deep Blue beat Kasparov it was no longer interesting to get stronger. And not much to learn from humans anymore...

Eelco, how can you buy into the SCAM too?
The hardware advantage was 50/1.
It plays 1850-elo chess on a single core.

Above diagram is already way way won for black; Stockfish blundered already in the opening with Nce5, this is already lost.

Alpha beating me? Gosh, I will shred it to pieces.

It understands absolutely nothing of closed positions, no such were encountered in the sample.

It is all about the hardware: 2 or 3 beautiful games, with the d4-e5-f6 chain outperforming a whole black minor piece, one great attack on the bare SF king and one more; all the rest is just exceeding computations.
Nothing special about its eval.

BeyondCritics · Post by **BeyondCritics** » Thu Dec 07, 2017 2:04 pm

cdani wrote:
It will be very interesting to know what the typical depth achieved by AlphaZero was. I bet it was much less than Stockfish's.

You lost that bet

AlphaZero uses MCTS https://en.wikipedia.org/wiki/Monte_Carlo_tree_search. From this source https://www.arxiv-vanity.com/papers/1712.01815v1/:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf.

...

At the end of the game, the terminal position is scored according to the rules of the game to compute the game outcome -1 for a loss, 0 for a draw, and +1 for a win.
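
For readers unfamiliar with MCTS, here is a minimal sketch of the two steps the quoted passage refers to: choosing which move to simulate next, and propagating a -1/0/+1 result back up the path. The PUCT-style selection formula, the `c_puct` constant, the `+ 1` under the square root and the priors below are assumptions for illustration, not values taken from the paper.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Node:
    prior: float              # move probability from the network (made up here)
    visits: int = 0           # how often this move was simulated
    value_sum: float = 0.0    # sum of backed-up outcomes, from the mover's view
    children: dict = field(default_factory=dict)

    def q(self) -> float:
        """Mean backed-up value; 0 for unvisited nodes."""
        return self.value_sum / self.visits if self.visits else 0.0


def select_child(node: Node, c_puct: float = 1.5):
    """Pick the move with high value, high prior and low visit count."""
    total = sum(child.visits for child in node.children.values())

    def score(child: Node) -> float:
        u = c_puct * child.prior * math.sqrt(total + 1) / (1 + child.visits)
        return child.q() + u

    return max(node.children.items(), key=lambda kv: score(kv[1]))


def backup(path, outcome: float) -> None:
    """Propagate +1 / 0 / -1 up the path, flipping sign at every ply.

    `outcome` is from the point of view of the player who made the move
    leading to the last node in `path`.
    """
    for node in reversed(path):
        node.visits += 1
        node.value_sum += outcome
        outcome = -outcome


root = Node(prior=1.0)
root.children = {"a": Node(prior=0.6), "b": Node(prior=0.4)}

first_move, first_node = select_child(root)   # nothing visited yet: prior decides
backup([root, first_node], -1.0)              # pretend that simulation was lost
second_move, _ = select_child(root)           # the search now tries the other move
print(first_move, second_move)
```

The point of the sketch is only that the search is guided by move probabilities and averaged outcomes rather than by a fixed depth, which is why comparing "depth" with an alpha-beta engine is not straightforward.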

kranium · Post by **kranium** » Thu Dec 07, 2017 2:26 pm

Lyudmil Tsvetkov wrote:
chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.
Again, 80% of games were already decided in the early opening.
Due to the opening book Alpha unfairly used.

What opening book?

Jhoravi · Post by **Jhoravi** » Thu Dec 07, 2017 2:44 pm

Do you all think that AlphaZero, starting from scratch, only knows the rules of chess and nothing more? What about piece values, like 1 for a pawn and 3 for a knight, etc.? I would be so amazed if it learned all the piece values from self play.

EvgeniyZh · Post by **EvgeniyZh** » Thu Dec 07, 2017 2:45 pm

Lyudmil Tsvetkov wrote:
Uri Blass wrote:
Lyudmil Tsvetkov wrote:
Uri Blass wrote:
Lyudmil Tsvetkov wrote:
kranium wrote:
Lyudmil Tsvetkov wrote:
clumma wrote:
Lyudmil Tsvetkov wrote:Alpha had considerable hardware advantage
That comparison is not straightforward, but this claim does not seem to be true. SF had 64 threads. I'm not up on the latest scaling behavior of the engine but that has got to be near saturation.

-Carl
From what I gleaned from hardware comparisons, the advantage is 16/1.
Why would one want to run a similar very unfair match?
Only one thing comes to mind: that the company will want to advertise its colossal breakthrough with TPUs and artificial intelligence and then sell its products.

But then, the achievement is not there.
The fact that Google has created a chess playing entity that crushes SF is notable (and fascinating).

TPUs are not for sale, and (at the moment) are applied only to Google's deep learning and research projects,
except when Google donates them to research for free.

https://techcrunch.com/2017/05/17/the-t ... cientists/
What would be the score between SF on 64 cores and SF on 1024 cores out of 100 games?
You think the bigger-hardware SF would score less than 64 points?
I guess at least 80.

So what is so new?
They applied some big hardware, that is all.
The real strength of Alpha is 2850, so around spot 97 or so among engines.
97 is not such a bad achievement, after all.
I doubt if SF on 1024 cores is going to score even 50%
Maybe after some point more cores are counter productive for stockfish.

I also doubt if it is possible to get at least 80 points against stockfish with 64 cores at 1 minute per move.
Why not?
How much would SF 16-cores vs SF single core score, that is easily reproducible.
The experts claim the TPUs lack any SMP inefficiencies.
If you give the engines 10 minutes per move, so that they play almost perfect chess, then I guess you will get less than 80%; and 1 minute per move for 64 cores is probably stronger than 10 minutes per move for 1 core.
Actually, the hardware advantage + simulated opening book (very important, as SF lost most games already in the opening) was close to 50/1, so how will SF 50 cores fare against SF single core?

Certainly somewhere 95% or so.

Are you aware that the dependency of TTD (time to depth) on the number of cores is sublinear? 64 cores versus 1 is not like 4096 versus 64. Moreover, even if it were, the dependency of playing strength on TTD is also sublinear and, certainly, practically bounded. This is actually demonstrated in the paper, page 7. Did you read it?
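
A rough way to see the sublinearity argument is Amdahl's law, used here purely as a stand-in model: it is not how Stockfish's parallel search is actually measured, and the 95% parallel fraction is a hypothetical number picked for illustration.

```python
def amdahl_speedup(cores: int, parallel_fraction: float = 0.95) -> float:
    """Idealised speedup on `cores` cores when only part of the work parallelises.

    parallel_fraction is an assumed value, not a measurement of any engine.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


# Same 64x increase in core count, applied at two different starting points.
gain_1_to_64 = amdahl_speedup(64) / amdahl_speedup(1)
gain_64_to_4096 = amdahl_speedup(4096) / amdahl_speedup(64)

print(f"1 -> 64 cores:    x{gain_1_to_64:.1f}")
print(f"64 -> 4096 cores: x{gain_64_to_4096:.1f}")
```

With these assumed numbers the first 64x in cores buys roughly a 15x speedup while the next 64x buys only about 1.3x, which is the shape of the argument above; the real curve for any given engine has to be measured.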

Robert Pope · Post by **Robert Pope** » Thu Dec 07, 2017 4:46 pm

Jhoravi wrote:Do you all think that AlphaZero, starting from scratch, only knows the rules of chess and nothing more? What about piece values, like 1 for a pawn and 3 for a knight, etc.? I would be so amazed if it learned all the piece values from self play.

Yes. That's the whole point of what makes this so incredible. The fact that they can take nothing more than the rules and definitions for game outcomes and create a superhuman player makes this something that can be applied far outside the chess domain. Except, it isn't even bothering to learn piece values. It is learning which moves are better in different positions without depending on crutches like that.

Also, learning something as simple as piece values from self play is no big deal and has been easily accomplished by other machine learning programs.
