AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

EvgeniyZh · Post by **EvgeniyZh** » Thu Dec 07, 2017 8:25 pm

Milos wrote: Sorry kiddo, that empty boasting won't help much, your previous post clearly suggests you have no clue what TPU is, and have not read that seminal paper.

You are welcom to find my paper when you learn google

Milos wrote: (dp) floating point for training.
Why, I leave you to figure it out (that is pretty basic stuff btw.).

So you actually think that double precision is used for training?

I leave you to figure out why dp isn't used in deep learning at all (this is known by anyone who ever done deep learning btw.

Milos wrote: You didn't demonstrate anything that I wrote was wrong, but just basically confirmed it and proved that you were wrong.

You claimed TPU is 180 TOPS, while it is 180 TFLOPS per pod of four TPU, thus your evaluation 4 times higher than it should be. You don't mention that P100/P40 support half precision FP, which provides twice as much FLOPS. Which puts us to 10 P100 to fit 4 TPUs. You claim V100 Tensor cores are shit, though I'm sure you have never used them. You claim V100 costs $10000, while it actualy costs $8000.

Milos wrote: It is clear from your writing you are some kiddo (probably got hold of his first ML course, or even Google intership and is now overexcited and as most of the youth full of himself), so I wouldn't hold you in a discussion any more.
When you have actually some substance to write about, then you can come back.

It's clear that you have no arguments and thus switch to ad hominem. You've also demonstrated that you know about deep learning a bit less than nothing.

supersharp77 · Post by **supersharp77** » Thu Dec 07, 2017 8:29 pm

Note:To whom it may concern...We refuse to accept the conclusions of this "so called" defeat of Stockfish 8 by 'Alpha Zero' (a program no one seems to have heard of before this so called "result was published") If Google wants the chess engine community to accept these "conclusions" follow the 'RULES'
of CHESS ENGINE TESTING..and Enter 'ALPHA ZERO" in TCEC or some other
Recognized chess engine tournament...and let the Chips fall where they mayl....until then....The "Google Teams" results are 100% meaningless!!

EvgeniyZh · Post by **EvgeniyZh** » Thu Dec 07, 2017 8:29 pm

Milos wrote:
EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote:64 cores to 1 is not like 4096 to 64. Moreover even if it were, the dependency of strength of play vs TTD is also sublinear, and, certainly, is practically bounded. It is actually demonstrated in paper, page 7. Did you read it?
Figure 2 is completely bogus. If wish Google actually cited that reference that shows SFs Elo performance increase when going from 10s to 1min/move of under 20Elo.
So your are playing expert and even don't understand meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw. it's Elo not ELO), you just invented that, did you not?

Ok, so make a thought experiment. You may even actually perform it to get the point. Take two instances of Stockfish. Run a tournament between them, that'd emerge rating for both, which would be around 0 if you made everything right. Now increase the time control. Repeat. You'll still get zero, if you are still doing things right. Would that mean that there wasn't improvement? Do you get the concept now?

P.S. you also didn't even read what's written on the figure itself, did you?)

Milos · Post by **Milos** » Thu Dec 07, 2017 8:37 pm

EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote:64 cores to 1 is not like 4096 to 64. Moreover even if it were, the dependency of strength of play vs TTD is also sublinear, and, certainly, is practically bounded. It is actually demonstrated in paper, page 7. Did you read it?
Figure 2 is completely bogus. If wish Google actually cited that reference that shows SFs Elo performance increase when going from 10s to 1min/move of under 20Elo.
So your are playing expert and even don't understand meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw. it's Elo not ELO), you just invented that, did you not?
Ok, so make a thought experiment. You may even actually perform it to get the point. Take two instances of Stockfish. Run a tournament between them, that'd emerge rating for both, which would be around 0 if you made everything right. Now increase the time control. Repeat. You'll still get zero, if you are still doing things right. Would that mean that there wasn't improvement? Do you get the concept now?

Y-scale in Fig.2 means Elo without added reference point (that is what that clumsy term google used actually means). It doesn't mean Elo difference between 2 programs, because if it was Elo difference you'd need a single curve, not 2 of them (it's pretty obvious if you just used your head a bit).
Beside not understanding much, you really don't seem too bright.
Sorry, but I don't want to waste any more time with you, it's pointless.

lkaufman · Post by **lkaufman** » Thu Dec 07, 2017 8:38 pm

Lyudmil Tsvetkov wrote:In the 10-games sample, which is freely accessible, I see the following:

Game 1 and 2 feature this position:
[d]r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6

SF has traded bishop for knight and on move 6, already has worse position.

Game 3:

[d]rn1qkb1r/p2p1ppp/bp2pn2/2pP4/2P5/5NP1/PPQ1PP1P/RNB1KB1R b KQkq - 0 6

On move 6, SF already is much worse, if not lost at all.

Game 4:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is already considerably worse.

Games 5 and 6:

[d]rn1q1rk1/pbppbppp/1p2pn2/3P4/2P5/5NP1/PP2PPBP/RNBQ1RK1 b - - 0 7

On move 7, SF is much much worse.

Game 9:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is considerably worse.

Game 10: repetition of games 5 and 6

So, actually, only games 7 and 8 featured more balanced opening, all the rest was decided very early into the opening, with Alpha having trained the opening on human games. SF, on the other hand, does not rely on human opening knowledge, which is much superior.

So that my assessment 80% of the games were decided by the in-built opening knowledge is fully correct.

With that, my assessment is that the lack of openin gbook was the biggest disadvantage to SF.

It was basically an openings book match.

All of those positions are ones normally played by Grandmasters and considered to offer White no more than his normal opening edge. Maybe the French defense is a little better for White than say the Berlin. I'm pretty sure that Alpha zero would have held the draw playing the other side of them. The gambits played in the Queen's Indian are tricky, but not objectively much better for White.

clumma · Post by **clumma** » Fri Dec 08, 2017 1:50 am

Milos wrote:Y-scale in Fig.2 means Elo without added reference point (that is what that clumsy term google used actually means). It doesn't mean Elo difference between 2 programs, because if it was Elo difference you'd need a single curve, not 2 of them (it's pretty obvious if you just used your head a bit).

The text clearly states the Elo baseline (relative Elo = 0) for Fig. 2 is the respective traditional program with 40ms thinking time.

Beside not understanding much, you really don't seem too bright. Sorry, but I don't want to waste any more time with you, it's pointless.

Actually it's obvious to readers of this thread that while Evgeniy stated simple facts, almost everything you've written is incorrect.

-Carl

Gusev · Post by **Gusev** » Fri Dec 08, 2017 2:07 am

So, basically, Stockfish 8 lost to Giraffe on steroids.

Jesse Gersenson wrote:
clumma wrote:A truly stunning result. Matthew Lai is a coauthor!

https://arxiv.org/pdf/1712.01815.pdf

-Carl
Wa to go Matthew!

Adam Hair · Post by **Adam Hair** » Fri Dec 08, 2017 2:40 am

EvgeniyZh wrote: You claimed TPU is 180 TOPS, while it is 180 TFLOPS per pod of four TPU, thus your evaluation 4 times higher than it should be.

From what I can find, Google seems to be calling a motherboard containing 4 ASICs a Cloud TPU, and a TPU pod is 64 Cloud TPUs.

Milos · Post by **Milos** » Fri Dec 08, 2017 2:48 am

clumma wrote:
Milos wrote:Y-scale in Fig.2 means Elo without added reference point (that is what that clumsy term google used actually means). It doesn't mean Elo difference between 2 programs, because if it was Elo difference you'd need a single curve, not 2 of them (it's pretty obvious if you just used your head a bit).
The text clearly states the Elo baseline (relative Elo = 0) for Fig. 2 is the respective traditional program with 40ms thinking time.

Your comment that has absolutely nothing to do with my actual claim (about obviously wrong scaling for SF in that figure) or Evgeniy's pretty absurd claim (that figure represents basically derivative of Elo performance) but it is instead stating totally irrelevant point regarding Elo baseline (Elo reference point = 0) tells basically that you didn't at all understand the argument, claims or anything that has been discussed but only felt the urge to jump in and demonstrate your "smartness" by "proving" me wrong.
Congratulations man, you got me.
Now you can go stare at the sun, count stars or do whatever "smart" thing you usually do.

Adam Hair · Post by **Adam Hair** » Fri Dec 08, 2017 2:59 am

Milos wrote:
EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote:64 cores to 1 is not like 4096 to 64. Moreover even if it were, the dependency of strength of play vs TTD is also sublinear, and, certainly, is practically bounded. It is actually demonstrated in paper, page 7. Did you read it?
Figure 2 is completely bogus. If wish Google actually cited that reference that shows SFs Elo performance increase when going from 10s to 1min/move of under 20Elo.
So your are playing expert and even don't understand meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw. it's Elo not ELO), you just invented that, did you not?
Ok, so make a thought experiment. You may even actually perform it to get the point. Take two instances of Stockfish. Run a tournament between them, that'd emerge rating for both, which would be around 0 if you made everything right. Now increase the time control. Repeat. You'll still get zero, if you are still doing things right. Would that mean that there wasn't improvement? Do you get the concept now?
Y-scale in Fig.2 means Elo without added reference point (that is what that clumsy term google used actually means). It doesn't mean Elo difference between 2 programs, because if it was Elo difference you'd need a single curve, not 2 of them (it's pretty obvious if you just used your head a bit).
Beside not understanding much, you really don't seem too bright.
Sorry, but I don't want to waste any more time with you, it's pointless.

If you are going to be exceedingly arrogant, at least make sure you do not make mistakes. The 2 curves represent the Elo difference of SF 8 and of AlphaZero from SF 8 at 40ms/move as the thinking time varies.

AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero Supposedly Beat Stockfish 8

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: Chess content and openings

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo