AlphaZero beats AlphaGo Zero, Stockfish, and Elmo

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Fri Dec 08, 2017 6:22 pm

tmokonen wrote:
Lyudmil Tsvetkov wrote:Eelco, how can you buy into the SCAM too?
The hardware advantage was 50/1.
It plays 1850-elo chess on a single core.

Above diagram is already way way won for black; Stockfish blundered already in the opening with Nce5, this is already lost.

Alpha beating me? Gosh, I will shred it to pieces.
It understands absolutely nothing of closed positions, no such were encountered in the sample.

It is all about the hardware, 2 or 3 beautiful games, with the d4-e5-f6 chain outperforming a whole black minor piece, one great attack on the bare SF king and one more, all the rest is just exceeding computations.
Nothing special about its eval.
Just another crappy, misinformed, false bravado post from a guy who quit his job to pursue the quixotic dream of finding the perfect old-school end point evaluation for an alpha beta searcher. He can't accept the fact that his years of painstaking effort have been rendered moot by a project that was just a "meh, let's spend a few hours and see what happens" lark by the team that already conquered Go, a much more complex game than chess.

http://davidsmerdon.com/?p=1970

Go is much simpler than chess, Go's evaluation patterns are exponentially fewer (1000/1) than those in chess.
So you are really very bad at basic knowledge and etiquette, lad.

Alpha is 1850 currently, and will stay like that, a weak engine running on tremendous hardware.

My project will still conquer the world.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Fri Dec 08, 2017 6:24 pm

kranium wrote:
Lyudmil Tsvetkov wrote:
chessmobile wrote:Seems to play a mean game of chess. The endgame is where it excels. Many games looked equal to the naked eye but Alpha went on to win. If this thing follows the Go project then expect in a few months a monster that will beat its current version quite easily.
Again, 80% of games were already decided in the early opening.
Due to the opening book Alpha unfairly used.
What opening book?

Don't you see that it plays Bg2 and the Advanced French?
Was not its opening algorithm trained on myriads of top human games?

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Fri Dec 08, 2017 6:27 pm

Jhoravi wrote:Do you all think that AlphaZero from scratch only knows the rules of chess and nothing more? How about piece values like 1 for a pawn and 3 for a knight etc? I would be so amazed if it learned all the piece values from self play.

Of course it knows about piece values and psqt for each piece and pawns, they just don't state it.
It started from a much higher knowledge base, if you ask me.

One insight I got from browsing the games is that, indeed, they rely a lot on psqt.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Fri Dec 08, 2017 6:32 pm

BeyondCritics wrote:
cdani wrote:
It will be very interesting to know the typical depth reached by AlphaZero. I bet it is much less than Stockfish's.
You lost that bet
AlphaZero uses MCTS https://en.wikipedia.org/wiki/Monte_Carlo_tree_search. From this source https://www.arxiv-vanity.com/papers/1712.01815v1/:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf.
...
At the end of the game, the terminal position is scored according to the rules of the game to compute the game outcome -1 for a loss, 0 for a draw, and +1 for a win.

This is done for training, not for play, so it is not actually a search, but a tuning method.
Why does everyone confuse tuning with search?
They are still doing alpha-beta.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Fri Dec 08, 2017 6:41 pm

lkaufman wrote:
Lyudmil Tsvetkov wrote:In the 10-games sample, which is freely accessible, I see the following:

Game 1 and 2 feature this position:
[d]r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6

SF has traded bishop for knight and on move 6, already has worse position.

Game 3:

[d]rn1qkb1r/p2p1ppp/bp2pn2/2pP4/2P5/5NP1/PPQ1PP1P/RNB1KB1R b KQkq - 0 6

On move 6, SF is already much worse, if not outright lost.

Game 4:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is already considerably worse.

Games 5 and 6:

[d]rn1q1rk1/pbppbppp/1p2pn2/3P4/2P5/5NP1/PP2PPBP/RNBQ1RK1 b - - 0 7

On move 7, SF is much much worse.

Game 9:

[d]rnbqkb1r/pppn1ppp/4p3/3pP3/3P1P2/2N5/PPP3PP/R1BQKBNR b KQkq f3 0 5

On move 5, SF is considerably worse.

Game 10: repetition of games 5 and 6
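
The bishop-for-knight trade claimed for the position from games 1 and 2 can be checked mechanically from the FEN string itself. A minimal Python sketch, assuming only the standard FEN piece-placement field; `count_pieces` is a hypothetical helper written for this illustration, not part of any engine:

```python
from collections import Counter

def count_pieces(fen: str) -> Counter:
    """Count pieces in the piece-placement field (first field) of a FEN string."""
    placement = fen.split()[0]
    # Letters are pieces (uppercase = White, lowercase = Black);
    # digits are runs of empty squares and '/' separates ranks.
    return Counter(ch for ch in placement if ch.isalpha())

# Position quoted above for games 1 and 2.
fen_games_1_2 = "r1bqk2r/ppp2ppp/2p2n2/2b1p3/4P3/3P1N2/PPP2PPP/RNBQK2R w KQkq - 0 6"
pieces = count_pieces(fen_games_1_2)
print("White bishops:", pieces["B"], "knights:", pieces["N"])
print("Black bishops:", pieces["b"], "knights:", pieces["n"])
```

White has one bishop and two knights against Black's two bishops and one knight, with eight pawns each. That confirms the material imbalance only; it says nothing about whether the position is actually worse for White.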

So, actually, only games 7 and 8 featured a more balanced opening; all the rest were decided very early in the opening, with Alpha having trained its openings on human games. SF, on the other hand, does not rely on human opening knowledge, which is much superior.

So my assessment that 80% of the games were decided by the built-in opening knowledge is fully correct.

With that, my assessment is that the lack of an opening book was the biggest disadvantage for SF.

It was basically an opening-book match.
All of those positions are ones normally played by Grandmasters and considered to offer White no more than his normal opening edge. Maybe the French defense is a little better for White than say the Berlin. I'm pretty sure that Alpha zero would have held the draw playing the other side of them. The gambits played in the Queen's Indian are tricky, but not objectively much better for White.

That is just the human perception.
I have followed all those positions very very deep with Stockfish and your Komodo.
I am not certain Alpha would have held, probably not.

I don't know how you can call a position, where one side fianchettoes its king side bishop, and the other not, equal; certainly the side with the fianchetto has the advantage, especially if it is white.
Similarly for the French, the advantage of white is huge, in almost all setups involving an advanced e5 pawn, and you should know that perfectly, even only because Komodo has a very favourable score with SF in this opening, even in TCEC.
Ceding the pair of bishops in the Ruy Lopez, when the light-square bishop on b5 has not even been threatened by an a6 pawn, to gain tempo, is suspect, to say the least. Fischer sometimes employed the Exchange, but always after an a7-a6 kick.

Uri Blass · Post by **Uri Blass** » Fri Dec 08, 2017 6:46 pm

Lyudmil Tsvetkov wrote:
tmokonen wrote:
Lyudmil Tsvetkov wrote:Eelco, how can you buy into the SCAM too?
The hardware advantage was 50/1.
It plays 1850-elo chess on a single core.

Above diagram is already way way won for black; Stockfish blundered already in the opening with Nce5, this is already lost.

Alpha beating me? Gosh, I will shred it to pieces.
It understands absolutely nothing of closed positions, no such were encountered in the sample.

It is all about the hardware, 2 or 3 beautiful games, with the d4-e5-f6 chain outperforming a whole black minor piece, one great attack on the bare SF king and one more, all the rest is just exceeding computations.
Nothing special about its eval.
Just another crappy, misinformed, false bravado post from a guy who quit his job to pursue the quixotic dream of finding the perfect old-school end point evaluation for an alpha beta searcher. He can't accept the fact that his years of painstaking effort have been rendered moot by a project that was just a "meh, let's spend a few hours and see what happens" lark by the team that already conquered Go, a much more complex game than chess.
http://davidsmerdon.com/?p=1970

Go is much simpler than chess, Go's evaluation patterns are exponentially fewer (1000/1) than those in chess.
So you are really very bad at basic knowledge and etiquette, lad.

Alpha is 1850 currently, and will stay like that, a weak engine running on tremendous hardware.

My project will still conquer the world.

An 1850 engine cannot beat Stockfish regardless of hardware.

Even if you assume 100 Elo per doubling of speed, you need to be about 1000 times faster (roughly 10 doublings) just to reach 2850, which is still clearly weaker than the top programs, and Alpha certainly did not have a 1000:1 hardware advantage.
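
The arithmetic behind that estimate can be written out as a short Python sketch. The 100 Elo per doubling figure is the assumption stated in the post, treated here as constant across all doublings, which is a simplification; `elo_gain` is a hypothetical helper name:

```python
import math

def elo_gain(speedup: float, elo_per_doubling: float = 100.0) -> float:
    """Elo gained from a speedup, assuming a fixed gain per doubling of speed."""
    return elo_per_doubling * math.log2(speedup)

base_rating = 1850               # rating claimed for single-core play
gain = elo_gain(1000)            # 1000x is just under 2**10 = 1024
print(round(base_rating + gain)) # prints 2847
```

Under these assumptions a 1000:1 speed advantage lifts an 1850 player to roughly 2850, and a 50:1 advantage (`elo_gain(50)`) would be worth well under 600 Elo.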

CheckersGuy · Post by **CheckersGuy** » Fri Dec 08, 2017 7:03 pm

Stop feeding the troll

IanO · Post by **IanO** » Fri Dec 08, 2017 7:37 pm

Lyudmil Tsvetkov wrote:
BeyondCritics wrote:
cdani wrote:
It will be very interesting to know the typical depth reached by AlphaZero. I bet it is much less than Stockfish's.
You lost that bet
AlphaZero uses MCTS https://en.wikipedia.org/wiki/Monte_Carlo_tree_search. From this source https://www.arxiv-vanity.com/papers/1712.01815v1/:

Instead of an alpha-beta search with domain-specific enhancements, AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm. Each search consists of a series of simulated games of self-play that traverse a tree from root to leaf.
...
At the end of the game, the terminal position is scored according to the rules of the game to compute the game outcome -1 for a loss, 0 for a draw, and +1 for a win.
This is done for training, not for play, so it is not actually a search, but a tuning method.
Why does everyone confuse tuning with search?
They are still doing alpha-beta.

Wrong. Unless stated otherwise, the player is identical to the trainer. That is one of the reasons this is such an interesting advance, both a new eval and tuning method and a previously discarded search method.
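
To make concrete what "MCTS as the search used during play" means, here is a minimal Python sketch of a generic Monte-Carlo tree search on a toy game. It is an illustration under stated assumptions, not AlphaZero's implementation: the game is a hypothetical Nim variant (take 1 to 3 stones, whoever takes the last stone wins), selection uses plain UCB1, and leaves are evaluated by random rollouts, whereas AlphaZero guides the search with a neural network instead of rollouts. Outcomes are scored -1 for a loss and +1 for a win, as in the quoted passage (this toy game has no draws).

```python
import math
import random

class Node:
    def __init__(self, stones, parent=None, move=None):
        self.stones = stones    # stones left for the player to move
        self.parent = parent
        self.move = move        # move that led from the parent to this node
        self.children = []
        self.visits = 0
        self.value = 0.0        # summed outcomes, seen by the player who moved INTO this node

    def untried_moves(self):
        tried = {c.move for c in self.children}
        return [m for m in (1, 2, 3) if m <= self.stones and m not in tried]

def ucb1(child, parent_visits, c=1.4):
    # Mean outcome plus an exploration bonus for rarely visited children.
    return child.value / child.visits + c * math.sqrt(math.log(parent_visits) / child.visits)

def rollout(stones, rng):
    """Random playout. Returns +1 if the player to move at the start wins, else -1."""
    player = 0  # 0 = the player to move at the start of the rollout
    while stones > 0:
        stones -= rng.randint(1, min(3, stones))
        if stones == 0:
            return 1 if player == 0 else -1
        player ^= 1
    return -1   # no stones at the start: the player to move has already lost

def mcts(root_stones, iterations, seed=0):
    rng = random.Random(seed)
    root = Node(root_stones)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend through fully expanded nodes.
        while node.stones > 0 and not node.untried_moves():
            node = max(node.children, key=lambda ch: ucb1(ch, node.visits))
        # 2. Expansion: add one untried child, if any.
        moves = node.untried_moves()
        if moves:
            m = rng.choice(moves)
            child = Node(node.stones - m, parent=node, move=m)
            node.children.append(child)
            node = child
        # 3. Simulation: outcome for the player to move at `node`.
        result = rollout(node.stones, rng)
        # 4. Backup: flip the sign at every level, since players alternate.
        while node is not None:
            result = -result
            node.visits += 1
            node.value += result
            node = node.parent
    # The move played is the most visited child of the root.
    return max(root.children, key=lambda ch: ch.visits).move

best_move = mcts(root_stones=5, iterations=2000)
print("move chosen from 5 stones:", best_move)
```

The same four-step loop both generates training games and chooses the move at play time, which is the point being made above: the tree search is the player, not only a tuning procedure. From 5 stones the winning move is to take one stone, leaving the opponent a multiple of four, and the search should settle on it given enough iterations.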

tmokonen · Post by **tmokonen** » Fri Dec 08, 2017 8:20 pm

I had posts removed for saying the same thing. Seems that the mods don't like honesty.

Ras · Post by **Ras** » Fri Dec 08, 2017 8:45 pm

Lyudmil Tsvetkov wrote:Would not SF on the very same hardware, if adapted, be still 400 Elo stronger?

No, it wouldn't, for the same reason that Stockfish doesn't harness the power of available GPUs. These TPUs are quite a different design than CPUs. TPUs are a bit like GPUs modified in hardware to be more efficient for neural networks instead of graphics.

The big thing here is that they have a truckload of simple modules that can perform identical operations on different data at the same time. Perfect for neural networks - and for graphics, which is why graphics cards have long been used for neural network purposes.

By contrast, conventional CPUs are good at performing different and complex operations on input data at a high rate. That is good for if/then/else-branching, which chess engines essentially do.

Software that was designed for CPUs cannot take advantage of TPUs. The TPU hardware would be useless for Stockfish. The other way round is also not that promising: trying to run a neural network on a CPU doesn't yield performant results.
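
The contrast between the two workloads can be sketched in a few lines of Python. This is an illustration only, assuming a toy dense layer and a toy game tree given as nested lists; it is not how Stockfish or a TPU is actually implemented, and both function names are hypothetical.

```python
import math

def dense_layer(weights, inputs):
    # Every output repeats the same multiply-accumulate on different data,
    # with no data-dependent branching: this is what parallel hardware likes.
    return [max(0.0, sum(w * x for w, x in zip(row, inputs))) for row in weights]

def alphabeta(node, alpha, beta, maximizing, visited):
    # Leaves are numbers, inner nodes are lists of children.
    if not isinstance(node, list):
        visited.append(node)
        return node
    best = -math.inf if maximizing else math.inf
    for child in node:
        score = alphabeta(child, alpha, beta, not maximizing, visited)
        if maximizing:
            best = max(best, score)
            alpha = max(alpha, best)
        else:
            best = min(best, score)
            beta = min(beta, best)
        if alpha >= beta:
            break  # cutoff: what gets computed next depends on earlier results
    return best

layer_out = dense_layer([[1.0, 2.0], [-1.0, 0.0]], [3.0, 4.0])
visited = []
root_score = alphabeta([[3, 5], [2, 9]], -math.inf, math.inf, True, visited)
print(layer_out, root_score, visited)
```

In the layer, all the work is known up front and each output can be computed independently. In the search, the leaf `9` is never examined because the earlier result `2` already refutes that branch, so the control flow depends on the data as it goes.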
