AlphaZero beats AlphaGo Zero, Stockfish, and Elmo


Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote: The info on TPUs is vague, but each is said to have ~45 TFLOPS (probably half precision). For example see here. That would mean that AlphaZero ran on a 180 TFLOPS system. It's believed the 1080 Ti is kinda cost-optimal for DL, and you'd need 16-18 of them to match that performance (you may round up to 20). That's not what you'd put at home, but many DL researchers have that amount of resources. I'd roughly estimate it at around $60k for the whole thing, give or take. With the next generation of GPUs you can probably fit the whole thing in one node.
lkaufman wrote: The other conditions were of course not "fair", but reasonable given that AlphaZero only trained for a few hours. I suppose if Stockfish used a good book, was allowed to use its time management as if the time limit were pure increment, and used the latest dev. version, the match would have been much closer, but probably (judging by the infinite win to loss ratio and the actual games) SF would have still lost. The games were amazing.

Bottom line, assuming the comparable cost claim is accurate: If Google wants to optimize the software for a few weeks and sell it, rent it, or give it away, we have a revolution in computer chess. But my guess is that they won't do this, in which case the revolution may be delayed a couple years or so.
First you have to understand what TPU is. There is enough material on that, published by no one else but Google.
https://arxiv.org/abs/1704.04760
Second it is not 45 TFLOPs but 92 TOPS and that is first generation TPU. They don't say explicitly in the paper which generation TPU they used for inference (they say it just for training) but logic kind of tells us second generation is more probable.
Second generation TPUs performance is 180 TOPS.
It is int8 multiplication not single or double floating point precision operations you are used to from common GPUs and NVIDIA in general and it is certainly not tensor FLOPS (stupid marketing term by NVIDIA that has zero meaning in reality).
The V100 has 15 TFLOPS single precision; that is the most you can get if you use single-precision floating point as a replacement for integer multiplication. So you would need 6 V100s for one first-generation TPU, and 12 for a second-generation one.
Alpha0 used 4 TPUs for running games, so at best 24 V100s, at worst 48 V100s.
A V100 will at best cost $10k, so that's $250k to half a million bucks just to run Alpha0, and you think there are chess enthusiasts who could afford it???
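
The arithmetic in this estimate can be checked with a short sketch. All figures are the ones quoted in the post (92 and 180 TOPS, 15 TFLOPS, 4 TPUs, $10k per card) and are not independently verified here; treating one FP32 FLOP as a stand-in for one int8 op is the post's own assumption, and the function names are made up for illustration.

```python
# Figures as quoted in the post above; none are independently verified here.
TPU_GEN1_TOPS = 92         # first-generation TPU, int8 tera-ops per second
TPU_GEN2_TOPS = 180        # second-generation figure used in the post
V100_FP32_TFLOPS = 15      # V100 single-precision peak used in the post
TPUS_FOR_PLAY = 4          # TPUs AlphaZero used to run games
V100_PRICE_USD = 10_000    # the post's best-case price per card


def v100s_per_tpu(tpu_tops: float) -> int:
    """V100 cards per TPU, counting 1 FP32 FLOP as 1 int8 op (the post's assumption)."""
    return round(tpu_tops / V100_FP32_TFLOPS)


def total_cards_and_cost(tpu_tops: float) -> tuple[int, int]:
    """Cards and dollars needed to stand in for the 4 TPUs used during play."""
    cards = v100s_per_tpu(tpu_tops) * TPUS_FOR_PLAY
    return cards, cards * V100_PRICE_USD


for label, tops in (("gen 1", TPU_GEN1_TOPS), ("gen 2", TPU_GEN2_TOPS)):
    cards, cost = total_cards_and_cost(tops)
    print(f"{label}: {cards} V100s, about ${cost:,}")
```

Under these assumptions the two cases come to 24 and 48 cards, i.e. $240k and $480k, which the post rounds to $250k and half a million.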

And give me a break with the theoretical GP102 performance (1080 Ti). I work with them for ML and that is pure BS, so much BS that NVIDIA never actually published the figure; what people quote instead is computed as num_cores x frequency x 2, which is totally detached from reality.
In reality, if you run int multiplications on it you'd see the performance is not even 1 TOPS (for int multiplication).
You think NVIDIA is so stupid as to sell a V100 for >$10k offering almost the same performance as a 1080 Ti that costs $600???
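
For reference, the num_cores x frequency x 2 formula mentioned above works out as follows. This is only a sketch of the theoretical-peak calculation: the core count and boost clock are the commonly listed GTX 1080 Ti specs (an assumption here, not taken from the thread), and the factor of 2 counts a fused multiply-add as two operations. It says nothing about sustained or integer throughput, which is the point being argued.

```python
def peak_tflops(num_cores: int, clock_hz: float, ops_per_cycle: int = 2) -> float:
    """Theoretical peak in TFLOPS: every core retires one FMA (2 ops) every cycle."""
    return num_cores * clock_hz * ops_per_cycle / 1e12


# Commonly listed GTX 1080 Ti specs (assumed, not from the thread).
GTX_1080_TI_CORES = 3584
GTX_1080_TI_BOOST_HZ = 1.582e9

print(f"1080 Ti paper peak: {peak_tflops(GTX_1080_TI_CORES, GTX_1080_TI_BOOST_HZ):.1f} TFLOPS")
```

The result is a paper figure of roughly 11 TFLOPS FP32, which assumes every core is busy with an FMA on every cycle.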
IanO
Posts: 496
Joined: Wed Mar 08, 2006 9:45 pm
Location: Portland, OR

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by IanO »

The computer shogi community is expressing similar concerns about the selection of opponent (Elmo), engine configuration, and match methodology:
Some concerns on the matching conditions between AlphaZero and Shogi engine

December 6, 2017

After the publication of the paper (D. Silver et al., "Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm", arXiv:1712.01815), a few concerns were raised by the community of computer shogi programmers about the match conditions between AlphaZero and the shogi engine “elmo”. Here I summarize the points with some explanations. (Information will be updated if errors are found.)

1. The resignation threshold seems too narrow. In recent shogi software, the evaluation tends to give larger values than in chess programs. Many people feel that -900 centipawns is too small for shogi programs. I guess that an acceptable value would be -3000 to -5000. In official matches such as the World Computer Shogi Competition (http://www2.computer-shogi.org/index_e.html), they do not set a resignation point and wait until the program resigns. After 256 plies, the game is judged a "draw" even if the evaluation is one-sided.

2. It is strange to set "EnteringKingRule" to "NoEnteringKing". In recent matches between shogi programs, Entering King occurs frequently and its treatment is critical to the match results. When both kings enter the other's territory, YaneuraOu counts the number of pieces and declares a win if it has enough pieces. It is not clear whether AlphaZero has this functionality. I guess that it would be preferable to set it to the default, "CSARule27".

3. Hash size may be too low and tricky. In YaneuraOu 2017 Early, there are two settings for hash size. One is "Hash", which is set to 16 MB by default, and the other is "USI Hash", whose default value is 1024 MB. In YaneuraOu, the latter value is not used and the former one is what matters. If "Hash" is kept at the default value, I observe that the program becomes very weak. Under the match conditions (35 MNodes per move), even 1 GB may be too low. It would be more appropriate to set it to a bigger value.
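
As a rough illustration of points 2 and 3, the settings discussed would be sent to the engine as USI "setoption" commands. This is a minimal sketch: the option names "EnteringKingRule" and "Hash" and the value "CSARule27" are taken from the text above, while the 4096 MB hash size is a purely hypothetical value chosen for illustration (the article only says 1 GB may be too low), and the helper name is invented.

```python
def usi_setoption(name: str, value: object) -> str:
    """Format one USI 'setoption' command line."""
    return f"setoption name {name} value {value}"


# Option names come from the article; the hash size is a hypothetical example value.
suggested_options = {
    "EnteringKingRule": "CSARule27",  # the default named in point 2
    "Hash": 4096,                     # MB, hypothetical; the article gives no figure
}

commands = [usi_setoption(name, value) for name, value in suggested_options.items()]
for line in commands:
    print(line)
```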

Finally I would like to mention that 2017 is a dog year for shogi engines and we have plenty of programs which are much stronger than elmo. For instance, the winning program "Heisei shogi gassen ponpoko" ("ponpoko" for short) in the Shogi Denno Tournament (http://denou.jp/tournament2017/) outrates elmo by R150. This program is available at https://github.com/nodchip/hakubishin-/releases as "tanuki-sdt5-2017-11-16". It is also known that Apery_sdt5 has an even stronger evaluation file (available at https://t.co/S7q7XlW4dG), R200 stronger than elmo. Currently the strongest evaluation file is "aperypaq", an improvement of Apery_sdt5 (available at http://qhapaq.hatenablog.com/entry/2017/11/28/195426), R250 stronger than elmo. These should be combined with YaneuraOu. I hope that the authors will test these programs before declaring that AlphaZero beats currently available shogi programs.
Source: http://www.uuunuuun.com/single-post/201 ... ogi-engine
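
To put the R150 / R200 / R250 gaps quoted above in perspective, here is a small sketch of the standard logistic Elo expected-score formula, E = 1 / (1 + 10^(-D/400)). It assumes the shogi "R" ratings follow the same 400-point logistic scale as chess Elo, which the article does not state.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score (win = 1, draw = 0.5) for the side rated rating_diff points higher."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# Rating gaps over elmo quoted in the article.
for name, diff in (("ponpoko", 150), ("Apery_sdt5", 200), ("aperypaq", 250)):
    print(f"{name}: +{diff} -> expected score {expected_score(diff):.3f} against elmo")
```

Under that assumption the three gaps correspond to expected scores of roughly 70%, 76% and 81% against elmo.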
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote: 64 cores to 1 is not like 4096 to 64. Moreover, even if it were, the dependency of strength of play vs TTD is also sublinear, and, certainly, is practically bounded. It is actually demonstrated in the paper, page 7. Did you read it?
Figure 2 is completely bogus. I wish Google had actually cited the reference that shows SF's Elo gain when going from 10s to 1min/move is under 20 Elo.
EvgeniyZh
Posts: 43
Joined: Fri Sep 19, 2014 4:54 pm
Location: Israel

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by EvgeniyZh »

Milos wrote: First you have to understand what TPU is. There is enough material on that, published by no one else but Google.
https://arxiv.org/abs/1704.04760
I do know enough about TPUs, and even developed a similar system myself.
Milos wrote: Second it is not 45 TFLOPs but 92 TOPS and that is first generation TPU.
Exactly: while the first gen works with INT8, the second gen works with floating point and gives 180 TFLOPS per 4-TPU pod: https://www.blog.google/topics/google-c ... -learning/
Milos wrote: Second generation TPUs performance is 180 TOPS.
TFLOPS, per pod (4 TPUs)
Milos wrote:It is int8 multiplication not single or double floating point precision operations you are used to from common GPUs and NVIDIA in general
NVIDIA has GPUs supporting half precision FP and INT8.


So basically every word you are writing is misleading BS which has nothing to do with the truth, since you are either not ready to spend two minutes googling (pun intended) or are just jealous and trying to mislead others.
EvgeniyZh
Posts: 43
Joined: Fri Sep 19, 2014 4:54 pm
Location: Israel

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by EvgeniyZh »

Milos wrote:
EvgeniyZh wrote: 64 cores to 1 is not like 4096 to 64. Moreover, even if it were, the dependency of strength of play vs TTD is also sublinear, and, certainly, is practically bounded. It is actually demonstrated in the paper, page 7. Did you read it?
Figure 2 is completely bogus. I wish Google had actually cited the reference that shows SF's Elo gain when going from 10s to 1min/move is under 20 Elo.
So you are playing the expert and don't even understand the meaning of relative ELO?
jefk
Posts: 626
Joined: Sun Jul 25, 2010 10:07 pm
Location: the Netherlands
Full name: Jef Kaan

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by jefk »

had a quick look, and it confirms my findings in the Infinity Chess Tourn
earlier this year, where i could beat stockfish sometimes as 'centaur' (using Houdini 4
and Komodo10 on two comps).
Also, the opening book used by Stockfish in this match was weak
if i compare it with my own knowledge as correspondence player (accumulated with
some 25+yrs of experience in computerchess); and not commenting on Tsvetkov's ideas btw.
In line with the analysis as posted on the lichess site, AlphaZero didn't
make weak opening moves (although it sometimes could have done better), whereas
Stockfish made such moves as ...Bb7 in the Queen's Indian instead of the modern ...Ba6!
(although even then it's not easy (but still possible imho) to equalize.)
But even then, most wins were obtained in the transition between middle game and
endgame, a traditional Komodo monopoly i thought, until now.
Concluding, AlphaZero has lots of potential on stronger comps;
on my comp i prefer Houdini 6 and my own opening book,
combined with some human judgment sometimes, and then the
results are still comparable with AlphaZero i think.
So the bottom-line question is whether Hassabis will continue this chess project,
being interested to find out what perfect chess games will look like.

jef

PS as for the question whether chess is a draw, ask him, i suggest,
about the results of AlphaZero playing against itself after days/weeks
of training; he will of course confirm that chess is a draw. (and then
it doesn't matter if you play d4 or e4, although for the time being
d4 and Nf3 give the best practical chances of course; with 1.c4
being a tricky move, if you don't know how to equalize for Black)

www.bookbuilder.nl
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote:
Milos wrote: First you have to understand what TPU is. There is enough material on that, published by no one else but Google.
https://arxiv.org/abs/1704.04760
I do know enough about TPUs, and even developed a similar system myself.
Sorry kiddo, that empty boasting won't help much. Your previous post clearly suggests you have no clue what a TPU is, and have not read that seminal paper.
Milos wrote: Second generation TPUs performance is 180 TOPS.
TFLOPS, per pod (4 TPUs)
Milos wrote:It is int8 multiplication not single or double floating point precision operations you are used to from common GPUs and NVIDIA in general
NVIDIA has GPUs supporting half precision FP and INT8.

So basically every word you are writing is misleading BS which has nothing to do with the truth, since you are either not ready to spend two minutes googling (pun intended) or are just jealous and trying to mislead others.
Single precision (or ints) is used for inference, (dp) floating point for training. Why, I leave you to figure out (that is pretty basic stuff btw.).
The 4 TPUs that were used for inference were obviously used in single-precision FP or int mode, plus it was never stated in the paper which TPUs they used, so they could also have been gen 1 (even though that is less probable).
You didn't demonstrate anything that I wrote was wrong, but just basically confirmed it and proved that you were wrong.

It is clear from your writing that you are some kiddo (probably got hold of his first ML course, or even a Google internship, and is now overexcited and, like most of the youth, full of himself), so I wouldn't hold you in a discussion any more.
When you have actually some substance to write about, then you can come back.
Last edited by Milos on Thu Dec 07, 2017 7:58 pm, edited 1 time in total.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Chess content and openings

Post by Lyudmil Tsvetkov »

In the 10-games sample, which is freely accessible, I see the following:

Game 1 and 2 feature this position:
[d]r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6

SF has traded bishop for knight and, on move 6, already has a worse position.
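
The bishop-for-knight trade can be read directly off the FEN above. A minimal sketch that counts pieces from the placement field only (uppercase letters are White, lowercase are Black); the helper name is made up for illustration:

```python
from collections import Counter


def piece_counts(fen: str) -> Counter:
    """Count piece letters in the placement field (the first field) of a FEN string."""
    placement = fen.split()[0]
    return Counter(ch for ch in placement if ch.isalpha())


FEN_GAMES_1_2 = "r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6"
counts = piece_counts(FEN_GAMES_1_2)

print("White: bishops", counts["B"], "knights", counts["N"])
print("Black: bishops", counts["b"], "knights", counts["n"])
```

White is left with one bishop and two knights against Black's two bishops and one knight, with eight pawns each. Whether that makes the position worse is the poster's judgment, not something the count shows.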

Game 3:

[d]rn1qkb1r/p2p1ppp/bp2pn2/2pP4/2P5/5NP1/PPQ1PP1P/RNB1KB1R b KQkq - 0 6

On move 6, SF is already much worse, if not outright lost.

Game 4:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is already considerably worse.

Games 5 and 6:

[d]rn1q1rk1/pbppbppp/1p2pn2/3P4/2P5/5NP1/PP2PPBP/RNBQ1RK1 b - - 0 7

On move 7, SF is much much worse.

Game 9:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is considerably worse.

Game 10: repetition of games 5 and 6

So, actually, only games 7 and 8 featured a more balanced opening; all the rest were decided very early in the opening, with Alpha having trained the opening on human games. SF, on the other hand, does not rely on human opening knowledge, which is much superior.

So my assessment that 80% of the games were decided by the in-built opening knowledge is fully correct.

With that, my assessment is that the lack of an opening book was the biggest disadvantage to SF.


It was basically an openings book match.
Jesse Gersenson
Posts: 593
Joined: Sat Aug 20, 2011 9:43 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Jesse Gersenson »

clumma wrote:A truly stunning result. Matthew Lai is a coauthor!

https://arxiv.org/pdf/1712.01815.pdf

-Carl
Way to go, Matthew!
Milos
Posts: 4190
Joined: Wed Nov 25, 2009 1:47 am

Re: AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Post by Milos »

EvgeniyZh wrote:
Milos wrote:
EvgeniyZh wrote: 64 cores to 1 is not like 4096 to 64. Moreover, even if it were, the dependency of strength of play vs TTD is also sublinear, and, certainly, is practically bounded. It is actually demonstrated in the paper, page 7. Did you read it?
Figure 2 is completely bogus. I wish Google had actually cited the reference that shows SF's Elo gain when going from 10s to 1min/move is under 20 Elo.
So you are playing the expert and don't even understand the meaning of relative ELO?
Lol, I missed this pearl.
"Relative Elo" (btw. it's Elo not ELO), you just invented that, did you not? :lol: :lol: :lol: :lol: