AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

jefk
Posts: 626
Joined: Sun Jul 25, 2010 10:07 pm
Location: the Netherlands
Full name: Jef Kaan

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by jefk »

I had a quick look, and it confirms my findings in the Infinity Chess tournament
earlier this year, where I could sometimes beat Stockfish as a 'centaur' (using Houdini 4
and Komodo 10 on two computers).
Also, the opening book used by Stockfish in this match was weak
if I compare it with my own knowledge as a correspondence player (accumulated over
some 25+ years of experience in computer chess); and I'm not commenting on Tsvetkov's ideas, btw.
In line with the analysis posted on the lichess site, AlphaZero didn't
make weak opening moves (although it sometimes could have done better), whereas
Stockfish played moves such as ...Bb7 in the Queen's Indian instead of the modern ...Ba6!
(although even then it's not easy, but still possible imho, to equalize).
But even so, most wins were obtained in the transition between middlegame and
endgame, traditionally a Komodo monopoly, I thought, until now.
Concluding: AlphaZero has lots of potential on stronger hardware.
On my computer I prefer Houdini 6 and my own opening book,
combined with some human judgment now and then, and the
results are still comparable with AlphaZero, I think.
So the bottom-line question is whether Hassabis will continue this chess project;
I'm interested to find out what perfect chess games will look like.

jef

PS: as for the question of whether chess is a draw, I suggest asking him
about the results of AlphaZero playing against itself after days/weeks
of training; he will of course confirm that chess is a draw. (And then
it doesn't matter whether you play 1.d4 or 1.e4, although for the time being
1.d4 and 1.Nf3 give the best practical chances of course, with 1.c4
being a tricky move if you don't know how to equalize for Black.)

www.bookbuilder.nl
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote:
Milos wrote: First you have to understand what a TPU is. There is enough material on that, published by none other than Google.
https://arxiv.org/abs/1704.04760
I do know enough about TPUs, and have even developed a similar system myself.
Sorry kiddo, that empty boasting won't help much; your previous post clearly suggests you have no clue what a TPU is and haven't read that seminal paper.
Milos wrote: Second-generation TPU performance is 180 TOPS.
TFLOPS, per pod (4 TPUs)
Milos wrote: It is int8 multiplication, not the single- or double-precision floating-point operations you are used to from common GPUs and NVIDIA in general.
NVIDIA has GPUs supporting half-precision FP and INT8.

So basically every word you are writing is misleading BS that has nothing to do with the truth, since you are either not ready to spend two minutes googling (pun intended) or are just jealous and trying to mislead others.
Single precision (or ints) is used for inference, (dp) floating point for training. Why, I leave you to figure out (that is pretty basic stuff, btw).
The 4 TPUs that were used for inference obviously run in single-precision FP or int mode; plus, it was never stated in the paper which TPUs they used, so they could also have been gen 1 (even though that is less probable).
You didn't demonstrate that anything I wrote was wrong; you basically confirmed it and proved that you were wrong.
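A minimal NumPy sketch of the uncontested part of the precision claim above: weights kept in floating point can be quantized to int8 for inference, with the matrix multiply done in integers and the result rescaled afterwards. The layer shape and per-tensor scaling here are my own illustration, not anything from the TPU paper or from AlphaZero; which precision the training itself uses is exactly what is being argued in this thread.

[code]
import numpy as np

rng = np.random.default_rng(0)
w_fp32 = rng.normal(size=(4, 4)).astype(np.float32)  # "trained" weights in fp32
x_fp32 = rng.normal(size=(4,)).astype(np.float32)    # an input activation

def quantize(a):
    """Symmetric per-tensor quantization to int8."""
    scale = np.abs(a).max() / 127.0
    return np.round(a / scale).astype(np.int8), scale

w_q, w_scale = quantize(w_fp32)
x_q, x_scale = quantize(x_fp32)

# int8 x int8 multiply, accumulated in int32, then rescaled back to float.
y_int32 = w_q.astype(np.int32) @ x_q.astype(np.int32)
y_dequant = y_int32 * (w_scale * x_scale)

y_ref = w_fp32 @ x_fp32                               # full-precision reference
print("max quantization error:", np.max(np.abs(y_dequant - y_ref)))
[/code]

The point is only that inference arithmetic can run at a lower precision than the weights were trained in; the small error printed at the end is the cost of the int8 rounding.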

It is clear from your writing that you are some kiddo (who probably just got through his first ML course, or even a Google internship, and is now overexcited and, like most of the young, full of himself), so I won't hold you in a discussion any more.
When you actually have some substance to write about, you can come back.
Last edited by Milos on Thu Dec 07, 2017 7:58 pm, edited 1 time in total.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Chess content and openings

Post by Lyudmil Tsvetkov »

In the 10-game sample, which is freely accessible, I see the following:

Games 1 and 2 feature this position:
[d]r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6

SF has traded bishop for knight and, on move 6, already has the worse position.

Game 3:

[d]rn1qkb1r/p2p1ppp/bp2pn2/2pP4/2P5/5NP1/PPQ1PP1P/RNB1KB1R b KQkq - 0 6

On move 6, SF is already much worse, if not outright lost.

Game 4:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is already considerably worse.

Games 5 and 6:

[d]rn1q1rk1/pbppbppp/1p2pn2/3P4/2P5/5NP1/PP2PPBP/RNBQ1RK1 b - - 0 7

On move 7, SF is much, much worse.

Game 9:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is considerably worse.

Game 10: repetition of games 5 and 6

So, actually, only games 7 and 8 featured a more balanced opening; all the rest were decided very early in the opening, with Alpha having trained its openings on human games. SF, on the other hand, does not rely on human opening knowledge, which is much superior.

So my assessment that 80% of the games were decided by the built-in opening knowledge is fully correct.

With that, my assessment is that the lack of an opening book was the biggest disadvantage to SF.


It was basically an opening book match.
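The [d]...[/d] strings above are plain FEN, so anyone can reproduce these positions and check the claims with an engine. A small sketch using the python-chess package (my choice of tool, not something used in the match) that loads the first two quoted positions:

[code]
import chess  # pip install python-chess

# FEN strings copied from the diagrams above (games 1-2 and game 3).
fens = [
    "r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6",
    "rn1qkb1r/p2p1ppp/bp2pn2/2pP4/2P5/5NP1/PPQ1PP1P/RNB1KB1R b KQkq - 0 6",
]

for fen in fens:
    board = chess.Board(fen)  # raises ValueError if the FEN is malformed
    print(board)              # ASCII diagram of the position
    print("side to move:", "White" if board.turn == chess.WHITE else "Black")
    print()
[/code]

From here the positions can be handed to Stockfish or any other UCI engine for an independent evaluation.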
Jesse Gersenson
Posts: 593
Joined: Sat Aug 20, 2011 9:43 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Jesse Gersenson »

clumma wrote:A truly stunning result. Matthew Lai is a coauthor!

https://arxiv.org/pdf/1712.01815.pdf

-Carl
Way to go, Matthew!
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote: 64 cores to 1 is not like 4096 to 64. Moreover, even if it were, the dependence of playing strength on TTD is also sublinear and, certainly, is practically bounded. It is actually demonstrated in the paper, page 7. Did you read it?
Figure 2 is completely bogus. I wish Google had actually cited the reference that shows SF's Elo performance increase of under 20 Elo when going from 10s to 1min/move.
So you are playing the expert and don't even understand the meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw, it's Elo, not ELO), you just invented that, did you not? :lol: :lol: :lol: :lol:
EvgeniyZh
Posts: 43
Joined: Fri Sep 19, 2014 4:54 pm
Location: Israel

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by EvgeniyZh »

Milos wrote: Sorry kiddo, that empty boasting won't help much; your previous post clearly suggests you have no clue what a TPU is and haven't read that seminal paper.
You are welcome to find my paper when you learn to google.
Milos wrote: (dp) floating point for training.
Why, I leave you to figure out (that is pretty basic stuff, btw).
So you actually think that double precision is used for training? :lol: :lol: :lol:
I leave you to figure out why dp isn't used in deep learning at all (this is known by anyone who has ever done deep learning, btw).
Milos wrote: You didn't demonstrate that anything I wrote was wrong; you basically confirmed it and proved that you were wrong.

You claimed the TPU is 180 TOPS, while it is 180 TFLOPS per pod of four TPUs, so your estimate is 4 times higher than it should be. You don't mention that the P100/P40 support half-precision FP, which provides twice as many FLOPS; that puts us at about 10 P100s to match 4 TPUs. You claim V100 Tensor Cores are shit, though I'm sure you have never used them. You claim the V100 costs $10,000, while it actually costs $8,000.
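Making the accounting in that paragraph explicit, here is the arithmetic as a tiny sketch. The TPU figure is the one claimed in this thread and the P100 figure is a rough published FP32 spec; neither is something I am independently vouching for.

[code]
# Throughput accounting, using the numbers claimed in this thread.
tflops_per_pod = 180.0                     # claimed: 180 TFLOPS per pod of four TPUs
tpus_per_pod = 4
tflops_per_tpu = tflops_per_pod / tpus_per_pod   # 45 TFLOPS per TPU

p100_fp32_tflops = 9.3                     # rough published FP32 figure for a P100
p100_fp16_tflops = 2 * p100_fp32_tflops    # "twice as many FLOPS" at half precision

tpus_used = 4                              # as reported for AlphaZero's inference
p100_needed = tpus_used * tflops_per_tpu / p100_fp16_tflops
print(f"~{p100_needed:.1f} P100s to match {tpus_used} TPUs")   # roughly 10
[/code]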
Milos wrote: It is clear from your writing that you are some kiddo (who probably just got through his first ML course, or even a Google internship, and is now overexcited and, like most of the young, full of himself), so I won't hold you in a discussion any more.
When you actually have some substance to write about, you can come back.
:lol: :lol: :lol: It's clear that you have no arguments and thus switch to ad hominem attacks. You've also demonstrated that you know a bit less than nothing about deep learning.
supersharp77
Posts: 1242
Joined: Sat Jul 05, 2014 7:54 am
Location: Southwest USA

Re: AlphaZero Supposedly Beat Stockfish 8

Post by supersharp77 »

Note: To whom it may concern... We refuse to accept the conclusions of this "so-called" defeat of Stockfish 8 by 'AlphaZero' (a program no one seems to have heard of before this so-called "result" was published). If Google wants the chess engine community to accept these "conclusions", it should follow the 'RULES'
of CHESS ENGINE TESTING and enter 'ALPHAZERO' in TCEC or some other
recognized chess engine tournament... and let the chips fall where they may... Until then, the "Google Team's" results are 100% meaningless!! :D :wink:
Last edited by supersharp77 on Thu Dec 07, 2017 8:36 pm, edited 1 time in total.
EvgeniyZh
Posts: 43
Joined: Fri Sep 19, 2014 4:54 pm
Location: Israel

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by EvgeniyZh »

Milos wrote:
EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote: 64 cores to 1 is not like 4096 to 64. Moreover, even if it were, the dependence of playing strength on TTD is also sublinear and, certainly, is practically bounded. It is actually demonstrated in the paper, page 7. Did you read it?
Figure 2 is completely bogus. I wish Google had actually cited the reference that shows SF's Elo performance increase of under 20 Elo when going from 10s to 1min/move.
So you are playing the expert and don't even understand the meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw, it's Elo, not ELO), you just invented that, did you not? :lol: :lol: :lol: :lol:
OK, so do a thought experiment. You may even actually perform it to get the point. Take two instances of Stockfish. Run a tournament between them; that yields a rating for both, which will be around 0 if you did everything right. Now increase the time control. Repeat. You'll still get zero, if you are still doing things right. Would that mean there was no improvement? Do you get the concept now?

P.S. You also didn't even read what's written on the figure itself, did you? :o
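To make the thought experiment concrete, here is a tiny sketch of the relative-Elo calculation, using the standard logistic Elo formula (my own illustration, not something from the paper): two equally strong instances score about 50% against each other at any time control, so their relative rating stays near zero even though both got stronger in absolute terms.

[code]
import math

def elo_diff(score):
    """Elo difference implied by a match score fraction (0 < score < 1)."""
    return -400.0 * math.log10(1.0 / score - 1.0)

# Self-play between two identical Stockfish instances: ~50% score at any
# time control, hence ~0 relative Elo each time, despite the absolute
# strength gain from the longer thinking time.
for tc, score in [("10s/move", 0.50), ("60s/move", 0.51)]:
    print(tc, f"relative Elo: {elo_diff(score):+.1f}")

# The absolute gain only shows up when both instances are measured against
# a fixed external reference, which is a separate experiment.
[/code]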
Last edited by EvgeniyZh on Thu Dec 07, 2017 8:37 pm, edited 1 time in total.
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote: 64 cores to 1 is not like 4096 to 64. Moreover, even if it were, the dependence of playing strength on TTD is also sublinear and, certainly, is practically bounded. It is actually demonstrated in the paper, page 7. Did you read it?
Figure 2 is completely bogus. I wish Google had actually cited the reference that shows SF's Elo performance increase of under 20 Elo when going from 10s to 1min/move.
So you are playing the expert and don't even understand the meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw, it's Elo, not ELO), you just invented that, did you not? :lol: :lol: :lol: :lol:
OK, so do a thought experiment. You may even actually perform it to get the point. Take two instances of Stockfish. Run a tournament between them; that yields a rating for both, which will be around 0 if you did everything right. Now increase the time control. Repeat. You'll still get zero, if you are still doing things right. Would that mean there was no improvement? Do you get the concept now?
The Y-scale in Fig. 2 means Elo without an added reference point (that is what that clumsy term Google used actually means). It doesn't mean the Elo difference between two programs, because if it were an Elo difference you'd need a single curve, not two of them (it's pretty obvious if you just use your head a bit).
Besides not understanding much, you really don't seem too bright.
Sorry, but I don't want to waste any more time with you; it's pointless.
lkaufman
Posts: 5960
Joined: Sun Jan 10, 2010 6:15 am
Location: Maryland USA

Re: Chess content and openings

Post by lkaufman »

Lyudmil Tsvetkov wrote: In the 10-game sample, which is freely accessible, I see the following:

Games 1 and 2 feature this position:
[d]r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6

SF has traded bishop for knight and, on move 6, already has the worse position.

Game 3:

[d]rn1qkb1r/p2p1ppp/bp2pn2/2pP4/2P5/5NP1/PPQ1PP1P/RNB1KB1R b KQkq - 0 6

On move 6, SF is already much worse, if not outright lost.

Game 4:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is already considerably worse.

Games 5 and 6:

[d]rn1q1rk1/pbppbppp/1p2pn2/3P4/2P5/5NP1/PP2PPBP/RNBQ1RK1 b - - 0 7

On move 7, SF is much, much worse.

Game 9:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is considerably worse.

Game 10: repetition of games 5 and 6

So, actually, only games 7 and 8 featured a more balanced opening; all the rest were decided very early in the opening, with Alpha having trained its openings on human games. SF, on the other hand, does not rely on human opening knowledge, which is much superior.

So my assessment that 80% of the games were decided by the built-in opening knowledge is fully correct.

With that, my assessment is that the lack of an opening book was the biggest disadvantage to SF.


It was basically an opening book match.
All of those positions are ones normally played by grandmasters and considered to offer White no more than his normal opening edge. Maybe the French Defense is a little better for White than, say, the Berlin. I'm pretty sure that AlphaZero would have held the draw playing the other side of them. The gambits played in the Queen's Indian are tricky, but not objectively much better for White.
Komodo rules!