Historic Milestone: AlphaZero

Milos · Post by **Milos** » Wed Dec 06, 2017 3:50 pm

smatovic wrote:
Since none of us has access to a TPU, it's only fair to count in terms of what is available (for example, if we had an Alpha0 x64 binary compiled and wanted to run it at home).
1 TPU ~ 30x E5-2699v3 (18-core machines).
4 TPUs ~ 2000 Haswell cores
Apples and Bananas,

Stockfish is not able to make use of these TPUs,
and AlphaZero probably depends heavily on floating point operations (maybe half precision) to query the neural network.

So the question might be whether a stripped-down x86-64 version of AlphaZero, with only some hundreds or thousands of nps, would still be able to beat Stockfish.....dunno.

--
Srdja

No, the original paper is comparing apples and bananas. SF is running on general-purpose hardware. TPUs are not commercially available, so running Alpha0 on TPUs gives it a huge unfair advantage.
It would be like running SF on special hardware where the search happens on a conventional CPU and all evaluation is handled by hundreds if not thousands of FPGAs, something like DeepBlue. Then we could say that the comparison is fair.
Even in this setup, if it were the most recent version of Brainfish (so with an opening book) and a normal TC like 40/40 instead of 1 move/min, Alpha0 would probably lose.

tralala · Post by **tralala** » Wed Dec 06, 2017 3:54 pm

Wow! I was wondering about an AlphaZero-Chess version and here it is.
It looks like their MCTS-NN combo is really powerful for a lot of games.

As others have mentioned the test conditions are far from ideal for CPU-Chess-Programs:

- the fixed 1 minute/move probably takes away 20-30 Elo of time-management benefit
- 1GB of Hash for 64 Threads is (very) suboptimal!
- 64 Threads could be detrimental to playing strength if not tuned carefully

Other things are a bit unclear:

- which hardware was used for Stockfish (optimized binary?)
- which opening book was used

but still - impressive!

I would love to see an open challenge for the Stockfish team (or Houdini perhaps?) where they play with a classical time control and the CPU team can prepare their hardware, book choice and settings independently.

jdart · Post by **jdart** » Wed Dec 06, 2017 4:00 pm

There is a chart indicating that the relative advantage of AlphaZero actually goes up with thinking time, but it does need to be extended to long time controls.

Note also that, although Stockfish hardware was not optimal, AlphaZero was playing on "4 TPUs," whatever that is, but I don't think that is big hardware.

So, not the best test, but still, neural network trained programs and programs trained with reinforcement learning have not so far been near Stockfish level.

The other thing notable here though is that training was an extremely compute-intensive task.

--Jon

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Dec 06, 2017 4:05 pm

Steve Maughan wrote:I'm gobsmacked.

There's a bit of me that's skeptical, but since they've cracked Go I guess it's legitimate.

When will the UCI version be available? I need to play with it.

It is much weaker than SF.
And Go is much, much simpler than chess as far as evaluation is concerned. Maybe around 1000 times simpler. Go was perfect for tuning just a handful of terms; chess is quite a different story.

Chess will not be that easy to crack.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Dec 06, 2017 4:08 pm

jdart wrote:There is a chart indicating that the relative advantage of AlphaZero actually goes up with thinking time, but it does need to be extended to long time controls.

Note also that, although Stockfish hardware was not optimal, AlphaZero was playing on "4 TPUs," whatever that is, but I don't think that is big hardware.

So, not the best test, but still, neural network trained programs and programs trained with reinforcement learning have not so far been near Stockfish level.

The other thing notable here though is that training was an extremely compute-intensive task.

--Jon

So no one actually knows how many standard cores 1 or 4 TPUs are equivalent to?

jdart · Post by **jdart** » Wed Dec 06, 2017 4:16 pm

I don't think the exact equivalency is known, but apparently (from the thread in the Programming section), each TPU is a hefty custom processing unit. So AlphaZero probably did not have a hardware handicap here.

--Jon

MikeGL · Post by **MikeGL** » Wed Dec 06, 2017 4:16 pm

Lyudmil Tsvetkov wrote:
Alpha chooses only 1.d4 and 1.Nf3, while Stockfish goes for 1.e4
Judging from this, I can say that Alpha is much weaker than SF in terms of software, and the only reason for the win is the very big hardware advantage.

I think Table 2 [ECO openings] in the PDF would answer your argument.
All 12 common openings in Table 2 were played by AlphaZero against SF8, 100 times each, with only a total of 4 losses (out of 300 games) for AlphaZero as white starting with 1.e4, as shown in that table.

Milos · Post by **Milos** » Wed Dec 06, 2017 4:24 pm

MikeGL wrote:
Lyudmil Tsvetkov wrote:
Alpha chooses only 1.d4 and 1.Nf3, while Stockfish goes for 1.e4
Judging from this, I can say that Alpha is much weaker than SF in terms of software, and the only reason for the win is the very big hardware advantage.
I think Table 2 [ECO openings] in the PDF would answer your argument.
All 12 common openings in Table 2 were played by AlphaZero against SF8, 100 times each, with only a total of 4 losses (out of 60 games) for AlphaZero as white starting with 1.e4, as shown in that table.

3 in Sicilian and one in Reti, that is pretty indicative. Also by far the worst percentage of Alpha0 vs SF in those openings.

Lyudmil Tsvetkov · Post by **Lyudmil Tsvetkov** » Wed Dec 06, 2017 4:28 pm

I just read somewhere that the Tesla K80 GPU has 12 cores.
Also that the TPU is 20 times faster than the K80, according to Google.

Stockfish used 64 cores, which is roughly 12x5, i.e. about 5 K80s.
Alpha used 4 TPUs, so Alpha actually had 20x4/5=16 times more hardware.

Is that true?

If this is so, what comparisons and matches are we talking about?
By these figures, current Stockfish development is at least 300 Elo stronger than Alpha.

Isn't it a bit shameless to assert you have achieved something, when in actual fact it is all hardware?

Stockfish on tremendous hardware with suitable SMP algorithms will certainly thrash SF on 64 cores 90/10 or so, so why all this hype?
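
For reference, the back-of-envelope estimate in the post above can be written out as a short sketch. All inputs (12 cores per K80, a TPU being 20 times faster than a K80) are the poster's unverified figures, not measured values, and the function name and parameters are made up for illustration.

```python
# Figures as quoted in the post; none of them are verified.
CORES_PER_K80 = 12        # poster's claim ("read somewhere")
TPU_VS_K80_SPEEDUP = 20   # poster's recollection of Google's figure
STOCKFISH_CORES = 64
ALPHAZERO_TPUS = 4


def hardware_ratio(sf_cores, tpus, cores_per_k80, tpu_speedup, round_k80s=True):
    """Return AlphaZero hardware as a multiple of Stockfish hardware,
    both expressed in hypothetical 'K80 equivalents'."""
    sf_k80s = sf_cores / cores_per_k80
    if round_k80s:
        # The post rounds 64 / 12 down to 5 K80s.
        sf_k80s = round(sf_k80s)
    return tpus * tpu_speedup / sf_k80s


ratio = hardware_ratio(STOCKFISH_CORES, ALPHAZERO_TPUS,
                       CORES_PER_K80, TPU_VS_K80_SPEEDUP)
print(ratio)  # 16.0, matching the 20x4/5 figure in the post
```

Without the rounding step (`round_k80s=False`) the same inputs give roughly 15 rather than 16, so the result is only as good as the assumed equivalences.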

MikeGL · Post by **MikeGL** » Wed Dec 06, 2017 4:29 pm

Milos wrote:
MikeGL wrote:
Lyudmil Tsvetkov wrote:
Alpha chooses only 1.d4 and 1.Nf3, while Stockfish goes for 1.e4
Judging from this, I can say that Alpha is much weaker than SF in terms of software, and the only reason for the win is the very big hardware advantage.
I think Table 2 [ECO openings] in the PDF would answer your argument.
All 12 common openings in Table 2 were played by AlphaZero against SF8, 100 times each, with only a total of 4 losses (out of 60 games) for AlphaZero as white starting with 1.e4, as shown in that table.
3 in Sicilian and one in Reti, that is pretty indicative. Also by far the worst percentage of Alpha0 vs SF in those openings.

edit: Not 60 games. 4 losses out of 300 games

50 games as white (and 50 as black) x 6 types of 1.e4 on that Table, if I understood the table correctly.
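
The corrected game count can be checked with a few lines of arithmetic. This is only a sketch of the poster's reading of Table 2 (6 openings starting with 1.e4, 50 games as white each, 4 losses in total); the variable names are illustrative.

```python
# Poster's reading of Table 2; not re-checked against the paper.
E4_OPENINGS = 6                   # openings in the table starting with 1.e4
GAMES_AS_WHITE_PER_OPENING = 50   # 100 games per opening, half as white
LOSSES = 4                        # total losses cited in the post

games_as_white = E4_OPENINGS * GAMES_AS_WHITE_PER_OPENING
loss_rate = LOSSES / games_as_white

print(games_as_white)       # 300
print(f"{loss_rate:.2%}")   # 1.33%
```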
