Historic Milestone: AlphaZero

Dan Cooper · Post by **Dan Cooper** » Wed Dec 06, 2017 3:34 pm

Jouni wrote: Sounds like an April Fool's joke. But it isn't! I just read that they are not interested in chess? But I read somewhere: so basically they played a 400-core engine vs a 60-core engine. Hmm..

Here come the excuses.

MikeGL · Post by **MikeGL** » Wed Dec 06, 2017 3:36 pm

Lyudmil Tsvetkov wrote:Amidst all the hullabaloo, no one is noticing the first moves both engines have been playing.

Alpha chooses only 1.d4 and 1.Nf3, while Stockfish goes for 1.e4

Judging from this, I can say that Alpha is much weaker than SF in terms of software, and the only reason for the win is the very big hardware advantage.

I observed nothing special in the way Alpha plays in these 10 games; it goes for the same open positions as most engines, it just outcalculates Stockfish consistently.
This could be due only to substantial hardware advantage.

Or maybe, since AlphaZero doesn't use alpha-beta search, its Monte-Carlo Tree Search (MCTS) algorithm could really be superior. But your point is valid, both should be tested on equal hardware.

See that game where SF8 was forced to trap its Q at h8; it looks like a zugzwang puzzle that is beyond SF8's prowess.
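
For readers unfamiliar with how MCTS picks which move to explore next, here is a minimal sketch of the textbook UCT (UCB1) selection rule. It is an illustration only: the exact selection formula and constants AlphaZero uses are not given in this thread, and the function names and the exploration constant `c=1.4` are assumptions made for the example.

```python
import math

def uct_score(total_value, visits, parent_visits, c=1.4):
    """UCB1 score for one child; c is an assumed exploration constant."""
    if visits == 0:
        # Unvisited children are always tried first.
        return math.inf
    exploit = total_value / visits  # average result seen so far
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore

def select_child(children):
    """children: list of (move, total_value, visits); returns the move to explore."""
    parent_visits = sum(visits for _, _, visits in children)
    best = max(children, key=lambda ch: uct_score(ch[1], ch[2], parent_visits))
    return best[0]
```

Unlike alpha-beta, nothing is pruned outright here: a move with few visits keeps a large exploration bonus, so the search comes back to it until its average result is trusted.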

Milos · Post by **Milos** » Wed Dec 06, 2017 3:50 pm

smatovic wrote:
Since none of us has access to a TPU, it's only fair to count in terms of what is available (for example, if we had an Alpha0 x64 binary and wanted to run it at home).
1TPU ~ 30xE5-2699v3 (18 cores machine).
4TPUs ~ 2000 Haswell cores
Apples and Bananas,

Stockfish is not able to make use of these TPUs,
and AlphaZero probably depends heavily on floating point operations (maybe half precision) to query the neural network.

So the question might be whether a stripped-down x86-64 version of AlphaZero, with only a few hundred or thousand nps, is still able to beat Stockfish.....dunno.

--
Srdja

No, the original paper is comparing apples and bananas. SF is running on general purpose hardware. TPUs are not commercially available, so running Alpha0 on TPUs gives it a huge unfair advantage.
It would be like running SF on special hardware where search happens on a conventional CPU and all evaluation is handled by hundreds if not thousands of FPGAs, something like DeepBlue. Then we could say that the comparison is fair.
Even in this setup, if it were the most recent version of Brainfish (so with an opening book) and a normal TC like 40/40 rather than 1 move/min, Alpha0 would probably lose.
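
The core-equivalence estimate quoted above can be checked with a few lines of arithmetic. This is only a sketch that multiplies out the figures from the quote (30 E5-2699v3 CPUs per TPU, 18 cores each, 4 TPUs); the constant and function names are made up for the example, and the 30x ratio itself is the poster's estimate, not a measured value.

```python
# Figures taken from the quoted estimate; the 30x ratio is an assumption.
CPUS_PER_TPU = 30   # "1TPU ~ 30xE5-2699v3"
CORES_PER_CPU = 18  # E5-2699v3 is an 18-core part
TPUS = 4            # TPUs used by Alpha0 in the match

def equivalent_cores(tpus=TPUS, cpus_per_tpu=CPUS_PER_TPU, cores_per_cpu=CORES_PER_CPU):
    """Haswell cores implied by the quoted TPU-to-CPU ratio."""
    return tpus * cpus_per_tpu * cores_per_cpu

print(equivalent_cores())  # 2160
```

That is where the "~ 2000 Haswell cores" figure comes from, rounded down.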

tralala · Post by **tralala** » Wed Dec 06, 2017 3:54 pm

Wow! I was wondering about an AlphaZero-Chess version and here it is.
It looks like their MCTS-NN combo is really powerful for a lot of games.

As others have mentioned the test conditions are far from ideal for CPU-Chess-Programs:

- the fixed 1 minute/move probably takes away 20-30 Elo of time-management benefit
- 1GB of Hash for 64 Threads is (very) suboptimal!
- 64 Threads could be detrimental to playing strength if not tuned carefully
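
To put the hash complaint in numbers, a quick sketch of how much transposition table each search thread effectively gets. The helper name is made up for the example, and it simply divides the shared table evenly, which is a simplification since the threads share one table.

```python
def hash_per_thread_mb(hash_gb, threads):
    """Shared hash size divided evenly across threads, in MB."""
    return hash_gb * 1024 / threads

print(hash_per_thread_mb(1, 64))  # 16.0
```

So 1GB across 64 threads works out to 16 MB per thread at one minute per move.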

Other things are a bit unclear:

- which hardware was used for Stockfish (optimized binary?)
- which opening book was used

but still - impressive!

I would love to see an open challenge for the Stockfish team (or Houdini perhaps?) where they play with a classical time control and the CPU team can prepare their hardware, book choice and settings independently.

jdart · Post by **jdart** » Wed Dec 06, 2017 4:00 pm

There is a chart indicating that the relative advantage of AlphaZero actually goes up with thinking time, but it does need to be extended to long time controls.

Note also that, although Stockfish's hardware was not optimal, AlphaZero was playing on "4 TPUs," whatever that is, but I don't think that is big hardware.

So, not the best test, but still, neural network trained programs and programs trained with reinforcement learning have not so far been near Stockfish level.

The other thing notable here though is that training was an extremely compute-intensive task.

--Jon

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Dec 06, 2017 4:05 pm

Steve Maughan wrote:I'm gobsmacked.

There's a bit of me that's skeptical, but since they've cracked Go I guess it's legitimate.

When will the UCI version be available? I need to play with it.

It is much weaker than SF.
And Go is much, much simpler than chess as far as evaluation is concerned. Maybe around 1000 times simpler. Go was perfect for tuning just a handful of terms; chess is quite a different story.

Chess will not be that easy to crack.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Dec 06, 2017 4:08 pm

jdart wrote: There is a chart indicating that the relative advantage of AlphaZero actually goes up with thinking time, but it does need to be extended to long time controls.

Note also that, although Stockfish's hardware was not optimal, AlphaZero was playing on "4 TPUs," whatever that is, but I don't think that is big hardware.

So, not the best test, but still, neural network trained programs and programs trained with reinforcement learning have not so far been near Stockfish level.

The other thing notable here though is that training was an extremely compute-intensive task.

--Jon

So no one actually knows how many standard cores 1 or 4 TPUs are equivalent to?

jdart · Post by **jdart** » Wed Dec 06, 2017 4:16 pm

I don't think the exact equivalency is known, but apparently (from the thread in the Programming section), each TPU is a hefty custom processing unit. So AlphaZero probably did not have a hardware handicap here.

--Jon

MikeGL · Post by **MikeGL** » Wed Dec 06, 2017 4:16 pm

Lyudmil Tsvetkov wrote:
Alpha chooses only 1.d4 and 1.Nf3, while Stockfish goes for 1.e4
Judging from this, I can say that Alpha is much weaker than SF in terms of software, and the only reason for the win is the very big hardware advantage.

I think Table 2 [ECO openings] in the PDF would answer your argument.
All 12 common openings in Table 2 were played by AlphaZero against SF8, 100 times each, with only a total of 4 losses (out of 300 games) as white starting with 1.e4 (for AlphaZero), as shown in that table.

Milos · Post by **Milos** » Wed Dec 06, 2017 4:24 pm

MikeGL wrote:
Lyudmil Tsvetkov wrote:
Alpha chooses only 1.d4 and 1.Nf3, while Stockfish goes for 1.e4
Judging from this, I can say that Alpha is much weaker than SF in terms of software, and the only reason for the win is the very big hardware advantage.
I think Table 2 [ECO openings] in the PDF would answer your argument.
All 12 common openings in Table 2 were played by AlphaZero against SF8, 100 times each, with only a total of 4 losses (out of 60 games) as white starting with 1.e4 (for AlphaZero), as shown in that table.

3 in Sicilian and one in Reti, that is pretty indicative. Also by far the worst percentage of Alpha0 vs SF in those openings.
