AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Dirt · Post by **Dirt** » Thu Dec 07, 2017 8:07 am

Lyudmil Tsvetkov wrote:What better time management, when the match was played at fixed TC?

Right. Since they didn't want to bother putting time management in for AlphaZero, they may well have felt that allowing Stockfish to utilize its time management would give Stockfish an unfair advantage. This was a research project, not an all-out attempt to make a great engine.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 12:49 pm

jhellis3 wrote:I would say the result is much more dominating than the Elo difference would suggest. If one looks at the games, it becomes quite clear how efficient it is at exploiting holes in conventional programs' evaluation functions, especially toward the late midgame / early endgame.

Nope, it won most of its games in the early opening, due to the simulated opening book.
How many QIDs did it win only because it fianchettoed its bishop to g2, according to theory, while SF did not do so?
Of the games I browsed, 80% were already decided in the early opening.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 12:53 pm

lkaufman wrote:
EvgeniyZh wrote:
Milos wrote:
clumma wrote:
Milos wrote:4 hours my ass (pardon my french).
Far fewer transistors and joules were used training AlphaZero than have been used training Stockfish. You can soon rent those TPUs on Google's cloud, or apply for free access now, so stop complaining. Furthermore it's an experimental project in early days and performance is obviously not optimal, so all the 'but-but-but 30 Elo because they used SF 8 instead of SF 8.00194' sounds really dumb.

Days of alpha-beta engines have come to an abrupt end.

-Carl
Sorry, that is a pretty childish rant.
Google is obviously comparing apples and oranges and again doing a marketing stunt, and ppl are falling for it.
Days of Alpha0 on normal hardware are years away. But keep on dreaming, no one can take that from you.

P.S. Just as a small comparison: the open-source Leela Zero project, trying to replicate Alpha0 in Go, took 1 month to generate as many games as AG0 got in 3 hours, and that with a constant 1000 volunteers.
For chess it would take even more.
Training AlphaZero would take tons of time. Just like creating SF from 0. However, running it took 4 TPUs, which is comparable to what's available to (rich) consumers - you can get 6-8 NVIDIA V100s, which would get you similar performance.
To me this is the most informative post in the whole thread, assuming it is accurate (I know nothing about TPUs). The only reasonable comparison I can think of between the AlphaZero hardware and the Stockfish hardware is cost of equivalent machines. It doesn't matter to me how much hardware was used to reach the current level of strength for both engines, just whether the playing conditions were fair. You seem to be implying that comparable hardware to the 4 TPUs would cost no more (maybe much less?) than the sixty-four core machine used by SF. Is this correct? I'm asking to learn, not making a claim myself either way.

The other conditions were of course not "fair", but reasonable given that AlphaZero only trained for a few hours. I suppose if Stockfish used a good book, was allowed to use its time management as if the time limit were pure increment, and used the latest dev. version, the match would have been much closer, but probably (judging by the infinite win to loss ratio and the actual games) SF would have still lost. The games were amazing.

Bottom line, assuming the comparable cost claim is accurate: If Google wants to optimize the software for a few weeks and sell it, rent it, or give it away, we have a revolution in computer chess. But my guess is that they won't do this, in which case the revolution may be delayed a couple years or so.

Larry, what kind of revolution? This is a 30/1 hardware advantage.
Alpha is currently at 2850 level.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 12:56 pm

Milos wrote:
abulmo2 wrote:Alpha0 may have a good time management too. This is quite trivial to implement. Given the data used by Alpha0, (moves with probabilities of goodness) it should even be possible to provide a better time management for alpha0 than it is for Stockfish.
Speculation, no proof of that.
If it had been possible they wouldn't play 1min/move but some real TC. You think those ppl in Google are stupid and don't know what TC is the most common in chess? Or it is the case that they have intentionally chosen ridiculous TC because it favours Alpha0 the most?

Alpha0 can use an opening book too. And with the computational power available for it, probably a better one than the Cerebellum book.
Weights of NN are already behaving like a book, any additional book would just degrade the performance. You seem not to quite understand how Alpha0 operates.

And all that on much weaker hardware.
I disagree here. Alpha0 uses more efficient hardware, but not bigger one.
So you find it fair comparing Alpha0 on more efficient (specialized) hardware with SF on general purpose hardware?
SF's eval could easily be ported to FPGAs, there are already available solutions: you use 100s of Xilinx Ultrascale+ boards for eval and a single Haswell CPU for search, something like a DeepBlue. That kind of configuration could easily be 300-400 Elo stronger than the current SF on a 64 core machine.

Here I give +10.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 12:58 pm

pilgrimdan wrote:
Lyudmil Tsvetkov wrote:
Lion wrote:I agree with you.
Also, what the people who claim the HW was much faster don't understand is that the thing learned by itself in a very short time!

What if we now give it 1 Year to further learn?

Side note, I looked at the games and they are really impressive!
I would not be surprised, if in a year's time it also starts speaking and thinking of itself.
I, Robot.
there may be a fine line
between self-learning
and self-consciousness ...

I guess it is more like very gross.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 1:00 pm

Uri Blass wrote:
Lyudmil Tsvetkov wrote:
Uri Blass wrote:
Lyudmil Tsvetkov wrote:
kranium wrote:
Lyudmil Tsvetkov wrote:
clumma wrote:
Lyudmil Tsvetkov wrote:Alpha had considerable hardware advantage
That comparison is not straightforward, but this claim does not seem to be true. SF had 64 threads. I'm not up on the latest scaling behavior of the engine but that has got to be near saturation.

-Carl
From what I gleaned from hardware comparisons, the advantage is 16/1.
Why would one want to run a similar very unfair match?
Only one thing comes to mind: that the company will want to advertise its colossal breakthrough with TPUs and artificial intelligence and then sell its products.

But then, the achievement is not there.
The fact that Google has created a chess playing entity that crushes SF is notable (and fascinating).

TPUs are not for sale, and (at the moment) are applied only to Google's deep learning and research projects,
except when Google donates them to research for free.

https://techcrunch.com/2017/05/17/the-t ... cientists/
What would be the score between SF on 64 cores and SF on 1024 cores out of 100 games?
You think the bigger-hardware SF would score less than 64 points?
I guess at least 80.

So what is so new?
They applied some big hardware, that is all.
The real strength of Alpha is 2850, so around spot 97 or so among engines.
97 is not such a bad achievement, after all.
I doubt if SF on 1024 cores is going to score even 50%.
Maybe after some point more cores are counterproductive for Stockfish.

I also doubt if it is possible to get at least 80 points against stockfish with 64 cores at 1 minute per move.
Why not?
How much would SF 16-cores vs SF single core score, that is easily reproducible.
The experts claim the TPUs lack any SMP inefficiencies.
If you give the engines 10 minutes per move, so that they play almost perfect chess, then I guess that you will get less than 80%, and 1 minute per move for 64 cores is probably stronger than 10 minutes per move for 1 core.

Actually, the hardware advantage + simulated opening book (very important, as SF lost most games already in the opening) was close to 50/1, so how will SF on 50 cores fare against SF on a single core?

Certainly somewhere around 95% or so.
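
To put the percentages in this exchange on one scale, here is a minimal Python sketch using the standard logistic Elo formula. It assumes a match score can be treated as the expected score and does not model draws separately; the function names `elo_diff` and `expected_score` are made up for this illustration and do not come from any rating tool.

```python
import math

def elo_diff(score):
    """Elo difference implied by an expected score strictly between 0 and 1."""
    return 400.0 * math.log10(score / (1.0 - score))

def expected_score(diff):
    """Inverse mapping: expected score for a given Elo advantage."""
    return 1.0 / (1.0 + 10.0 ** (-diff / 400.0))

# Scores mentioned in the thread: 64 points out of 100, "at least 80", "95% or so".
for s in (0.64, 0.80, 0.95):
    print(f"{s:.0%} -> {elo_diff(s):+.0f} Elo")
```

Under this model a 64% score corresponds to roughly 100 Elo, 80% to roughly 240, and 95% to a little over 500, so the three figures quoted above describe very different gaps.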

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 1:03 pm

chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.

Again, 80% of games were already decided in the early opening.
Due to the opening book Alpha unfairly used.
In terms of evaluation, chess is 1000 times more complex than Go, so we will simply never see big advances with this approach.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Thu Dec 07, 2017 1:10 pm

Eelco de Groot wrote:
chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.
I am just looking at the first game where Alpha Zero wins with Black, but it seems to me that it excels specifically in showing big holes in Stockfish's eval. That this is in the endgame is not a big surprise.

For instance this position from that game is lost, but even Kaissa needs a long time to see that and it knows a little bit about the power of the bishop pair, but apparently is still blind and this is after going backwards from about move 40...

[D]5bk1/r5p1/7p/2p1N3/4PP2/1P1P2Pb/2P4P/6RK w - -
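
For readers without a board at hand, the FEN above can be unpacked with a short Python sketch that counts material per side. The piece values (1/3/3/5/9) are the usual textbook convention, assumed here for illustration and not taken from any engine's evaluation.

```python
from collections import Counter

FEN = "5bk1/r5p1/7p/2p1N3/4PP2/1P1P2Pb/2P4P/6RK w - -"
# Textbook piece values (an assumption for illustration only).
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9, "k": 0}

def material(fen):
    """Count pieces per side from the board field of a FEN string."""
    board = fen.split()[0]
    white, black = Counter(), Counter()
    for ch in board:
        if ch.isalpha():  # digits are empty squares, '/' separates ranks
            (white if ch.isupper() else black)[ch.lower()] += 1
    return white, black

def points(counts):
    """Sum the textbook point values for one side."""
    return sum(PIECE_VALUES[p] * n for p, n in counts.items())

white, black = material(FEN)
print("White:", dict(white), points(white))
print("Black:", dict(black), points(black))
```

By this naive count White (rook, knight, seven pawns) is nominally a point ahead of Black (rook, two bishops, three pawns), which is the kind of position where piece counting says little about the bishop pair.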

Engine: Kaissa HT (512 MB)
by T. Romstad, M. Costalba, J. Kiiski, G. Linscott

23/34 0:01 -0.13 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Bg4 44.d4 c4
45.Rb4 Kd6 46.Rxc4 Rxb5 47.Rb4 Bd7
48.h4 (16.663.353) 11013

24/37 0:02 -0.08 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Bg4 44.d4 c4
45.Rb4 Kd6 46.Rxc4 Rxb5 47.Rb4 Bd7
48.h4 (23.026.784) 10991

25/47 0:02 -0.08 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 h5 43.Ra1 cxb4
44.Ra6 Be6 45.d4+ Kf6 46.cxb4 Ke7
47.Ra5 Rxb4 48.Rxg5 (28.983.490) 10966

26/43 0:03 -0.08 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 h5 43.Ra1 cxb4
44.Ra6 Be6 45.d4+ Kf6 46.cxb4 Ke7
47.Ra5 Rxb4 48.Rxg5 (35.577.502) 10866

27/42 0:03 -0.15-- 35.Rc1 g5 (36.923.800) 10847

27/42 0:03 -0.16 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Kd6 44.d4 Ra7
45.Kd3 Rf7 46.Ke3 Bg4 47.Kd3 Be6
48.d5 (37.708.832) 10807

28/34 0:04 -0.09++ 35.Rc1 (51.000.093) 10644

28/46 0:05 -0.19 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Ra7 44.d4 Kd6
45.Rb2 c4 46.e5+ Kd5 47.b6 Rb7
48.Rb5+ (53.393.487) 10653

29/48 0:06 -0.12++ 35.Rc1 (68.805.142) 10682

29/48 0:06 -0.23 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Ra7 44.d4 Kd6
45.Rb2 cxd4+ 46.Kxd4 Ra4+ 47.Rb4 Rxb4+
48.cxb4 (74.412.423) 10676

30/43 0:07 -0.16++ 35.Rc1 (82.421.949) 10673

30/46 0:08 -0.19 35.Rc1 g5 36.Kg1 Bg7 37.Kf2 Bxe5
38.fxe5 Kf7 39.Ke3 Ke6 40.Rb1 Kxe5
41.b4 Rb7 42.c3 Ke6 43.b5 Ra7 44.Rb2 Kd6
45.Rf2 Be6 46.b6 Rb7 47.Rf6 Rxb6
48.d4 (93.753.062) 10713

31/49 0:13 -0.12++ 35.Rc1 (139.848.293) 10743

31/49 0:23 -0.08 35.Rc1 Bd6 36.Nc4 Bc7 37.Kg1 Kf7
38.Kf2 Ra2 39.Ke3 Be6 40.e5 g6 41.d4 cxd4+
42.Kxd4 Ra8 43.Kc3 Rc8 44.Kd3 Rd8+
45.Kc3 Ra8 46.Kd4 Ke7 47.Kc3 Rc8
48.Kd3 (244.712.523) 10606
.
.
.
36/56 4:49 -1.11 35.Ng6 Bd6 36.Re1 Kf7 37.f5 Ra2
38.Rc1 c4 39.dxc4 Bc5 40.Nf4 Be3
41.Re1 Bxf4 42.gxf4 Rxc2 43.c5 Rxc5
44.Kg1 Bg4 45.Re3 Rb5 46.Rg3 Bd1
47.Rc3 Rb7 48.Kf2 (3.200.571.460) 11070
.
.
.
41/57 42:36 -1.62 35.Ng6 Bd6 36.Re1 Kf7 37.f5 Ra2
38.Rc1 c4 39.dxc4 h5 40.Nf4 Bg4
41.Kg2 Ba3 42.Rf1 Bc5 43.h3 Rxc2+
44.Kh1 Be2 45.Nxe2 Rxe2 46.Rd1 Rxe4
47.Rd7+ Kf6 48.Kg2 (28.571.692.284) 11175

42/53 43:11 -1.55++ 35.Ng6 (28.964.033.275) 11177

42/54 43:40 -1.57 35.Ng6 Bd6 36.Re1 Kf7 37.f5 Ra2
38.Rc1 c4 39.dxc4 h5 40.Nf4 Bg4
41.Kg2 Ba3 42.Rf1 Bc5 43.h3 Rxc2+
44.Kh1 Be2 45.Nxe2 Rxe2 46.Rd1 Rxe4
47.Rd7+ Kf6 48.Kg2 (29.291.863.573) 11179

43/58 44:39 -1.64-- 35.Ng6 Bd6 (29.972.203.243) 11186

43/64 58:05 -1.57++ 35.Nc4 (38.982.302.684) 11183

43/64 61:17 -1.45++ 35.Nc4 (41.056.108.916) 11164

43/67 65:26 -1.42 35.Nc4 g5 36.Rc1 Bg7 37.Ne5 Bxe5
38.fxe5 Kf7 39.Kg1 Ke6 40.Kf2 Kxe5
41.Ke3 Ra2 42.Kf3 Be6 43.Ke3 Rb2
44.h4 g4 45.b4 Rxb4 46.Ra1 Rb2
47.Kd2 Kd4 48.Ra8 (43.816.279.380) 11159

44/60 69:13 -1.50-- 35.Nc4 g5 (46.318.745.058) 11150

44/63 72:37 -1.57-- 35.Nc4 g5 (48.608.234.304) 11156

44/63 84:14 -1.69-- 35.Nc4 g5 (56.236.277.156) 11126

Lyudmil should appreciate that it is specifically playing Anti-Stockfish chess. Opponent modeling. If it would play against Lyudmil, in 4 hours it would not just beat him but show him where to improve his game. I think Chessbase would love to have a tool like this for sale.

I'm not sure the team is willing to pursue chess, I have not read much of the paper but I understood they are not interested in chess? After Deep Blue beat Kasparov it was no longer interesting to get stronger. And not much to learn from humans anymore...

Eelco, how can you buy into the SCAM too?
The hardware advantage was 50/1.
It plays 1850-elo chess on a single core.

Above diagram is already way way won for black; Stockfish blundered already in the opening with Nce5, this is already lost.

Alpha beating me? Gosh, I will shred it to pieces.

It understands absolutely nothing of closed positions, no such were encountered in the sample.

It is all about the hardware: 2 or 3 beautiful games, with the d4-e5-f6 chain outperforming a whole black minor piece, one great attack on the bare SF king and one more; all the rest is just exceeding computations.
Nothing special about its eval.

BeyondCritics · Post by **BeyondCritics** » Thu Dec 07, 2017 2:04 pm

cdani wrote:
It will be very interesting to know the typical depth achieved by AlphaZero. I bet it is much less than Stockfish's.

You lost that bet

AlphaZero uses MCTS (https://en.wikipedia.org/wiki/Monte_Carlo_tree_search). From this source (https://www.arxiv-vanity.com/papers/1712.01815v1/):

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf.

...

At the end of the game, the terminal position is scored according to the rules of the game to compute the game outcome -1 for a loss, 0 for a draw, and +1 for a win.
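
The passage quoted above can be illustrated with a small Python sketch. This is not DeepMind's code: the game is a toy stand-in (take 1 to 3 stones, whoever takes the last stone wins), the priors are uniform, and unexpanded leaves get a value of 0 where AlphaZero would use its network's policy and value outputs. The selection rule is a PUCT-style formula of the kind the paper describes; `c_puct=1.5` and all function names are assumptions.

```python
import math

class Node:
    def __init__(self, prior):
        self.prior = prior       # move probability; uniform here, a network output in AlphaZero
        self.visits = 0
        self.value_sum = 0.0     # summed results, as seen by the player who moved into this node
        self.children = {}       # move -> Node

    def q(self):
        return self.value_sum / self.visits if self.visits else 0.0

def legal_moves(stones):
    # Toy game (assumption): take 1-3 stones, whoever takes the last stone wins.
    return [m for m in (1, 2, 3) if m <= stones]

def select_child(node, c_puct=1.5):
    # PUCT-style score: exploit high Q, explore moves with a high prior and few visits.
    scale = c_puct * math.sqrt(node.visits)
    return max(node.children.items(),
               key=lambda kv: kv[1].q() + scale * kv[1].prior / (1 + kv[1].visits))

def simulate(node, stones):
    """One root-to-leaf pass; returns the value for the player to move at `node`."""
    if stones == 0:
        value = -1.0             # terminal: the opponent took the last stone, a loss (-1)
    elif not node.children:
        moves = legal_moves(stones)
        for m in moves:
            node.children[m] = Node(prior=1.0 / len(moves))
        value = 0.0              # placeholder for a learned value estimate
    else:
        move, child = select_child(node)
        value = -simulate(child, stones - move)   # the opponent's gain is our loss
    node.visits += 1
    node.value_sum -= value      # stored from the view of the player who moved here
    return value

def search(stones, simulations):
    root = Node(prior=1.0)
    for _ in range(simulations):
        simulate(root, stones)
    visits = {m: c.visits for m, c in root.children.items()}
    return max(visits, key=visits.get), visits

best_move, visit_counts = search(stones=5, simulations=2000)
print(best_move, visit_counts)
```

As in the paper's description, the move is chosen by visit count at the root. With 5 stones the winning move is to take 1 (leaving a multiple of 4), so the visits should concentrate there as the number of simulations grows.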

kranium · Post by **kranium** » Thu Dec 07, 2017 2:26 pm

Lyudmil Tsvetkov wrote:
chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.
Again, 80% of games were already decided in the early opening.
Due to the opening book Alpha unfairly used.

What opening book?
