Alpha Zero question

Leo
Posts: 1107
Joined: Fri Sep 16, 2016 6:55 pm
Location: USA/Minnesota
Full name: Leo Anger

Re: Alpha Zero question

Post by Leo »

So maybe the big thing about Alpha Zero was its learning algorithm and its huge hardware.
syzygy
Posts: 5807
Joined: Tue Feb 28, 2012 11:56 pm

Re: Alpha Zero question

Post by syzygy »

supersharp77 wrote: Wed Feb 15, 2023 7:43 pm
Leo wrote: Tue Feb 14, 2023 8:17 pm I am a little out of the loop these days with the latest chess engine developments. How would Alpha Zero match up against the current Stockfish TCEC champion? Who invented neural networks for chess? I forgot. It really was a revolution.
Well, you will have to double back to the original 2018 match links/debates... there was quite a lot of back and forth and push and pull about that "so-called match", as no one was actually able to reproduce the results!
There was certainly a lot of back and forth, and there were many people complaining that nobody would be able to reproduce the results based on DeepMind's papers.

But while those people were complaining, others were busy reproducing the results based on DeepMind's papers.
What I recall (just like it was yesterday) was that Google (Alpha Zero's team) did not allow Stockfish an opening book, and there was no "Stockfish team" present during the so-called "match", just a base Stockfish 8. The "match" was 100 games with 8 wins by Alpha Zero (98 draws?), but most were unpublished... and only the wins were published?! Alpha Zero at this present time (5 years later) remains a mystery. LC0 by most accounts may well be stronger than Alpha Zero currently (although opinions will vary), so until someone gets a working copy of Alpha Zero, we will never know. The original arguments were CPU vs TPU (now GPU), power supply issues, and also time control and opening book issues... "Who knows for sure!" :)
Yes, from a pure "which engine is the strongest" point of view, much was still unclear. But DeepMind had shown that their approach, which not only used NNs but amazingly did NOT use anything resembling alpha-beta, could compete with the very best engines. That it was anywhere near SF's strength was a miracle already.
M ANSARI
Posts: 3734
Joined: Thu Mar 16, 2006 7:10 pm

Re: Alpha Zero question

Post by M ANSARI »

Leo wrote: Tue Feb 14, 2023 8:17 pm I am a little out of the loop these days with the latest chess engine developments. How would Alpha Zero match up against the current Stockfish TCEC champion? Who invented neural networks for chess? I forgot. It really was a revolution.
I think the question should be how SF today, without NNUE, would do against the Alpha Zero that beat SF8. That would be more interesting, as it could be argued that SF with NNUE is sort of SF with Alpha Zero components. Personally I think that SF with NNUE is just an amazing combination, and most likely a new LC Zero with some SF components would be equally amazing. If you look at some of the Alpha Zero games, there are some obvious glaring flaws that could best be described as "bugs" in software that was still in beta mode. The same flaws carried over to LC Zero ... but somehow I think LC 0 has ironed those things out. My guess would be that a new Alpha Zero would most likely be able to learn from the gains of LC 0 and play much stronger. Add to that the dramatic increase in AI hardware power and you would get a pretty impressive chess playing entity!
Werewolf
Posts: 2058
Joined: Thu Sep 18, 2008 10:24 pm

Re: Alpha Zero question

Post by Werewolf »

Leo wrote: Mon Feb 20, 2023 5:17 pm So maybe the big thing about Alpha Zero was its learning algorithm and its huge hardware.
IIRC the training hardware was big, achieved in 4 hours, but the playing hardware was only fairly big.
jkominek
Posts: 98
Joined: Tue Sep 04, 2018 5:33 am
Full name: John Kominek

Re: Alpha Zero question

Post by jkominek »

Werewolf wrote: Tue Feb 21, 2023 2:58 pm
Leo wrote: Mon Feb 20, 2023 5:17 pm So maybe the big thing about Alpha Zero was its learning algorithm and its huge hardware.
IIRC the training hardware was big, achieved in 4 hours, but the playing hardware was only fairly big.
Decided to look it up. From their 2017 paper that sent shockwaves through the computer chess community, "Mastering chess and shogi by self-play with a general reinforcement learning algorithm":
Training proceeded for 700,000 steps (mini-batches of size 4,096) starting from randomly
initialised parameters, using 5,000 first-generation TPUs to generate self-play
games and 64 second-generation TPUs to train the neural networks.

re. Evaluation: AlphaZero and the previous AlphaGo Zero used a single machine with 4 TPUs.
A year later, in December 2018, after responding to the referees with more thorough testing, they put the title words through the jumbler and published "A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play". The reported training specs changed on one point.
We trained separate instances of AlphaZero for chess, shogi and Go. Training proceeded for
700,000 steps (in mini-batches of 4,096 training positions) starting from randomly initialized
parameters. During training only, 5,000 first-generation tensor processing units (TPUs)
were used to generate self-play games, and 16 second-generation TPUs were used to train the
neural networks. Training lasted for approximately 9 hours in chess, 12 hours in shogi and 13
days in Go

re. Evaluation: AlphaZero and AlphaGo Zero used a single machine with four first-generation TPUs and 44 CPU cores.

Endnote 24. A first generation TPU is roughly similar in inference speed to a Titan V GPU, although
the architectures are not directly comparable.
It could be that the first claim of 64 second-generation TPUs for model training was a misstatement, or resources could have been reduced during the follow-up work. The learning progress curve for chess (Figure 1) is identical in both papers, but different for shogi and Go.

Using the comparison to a Titan V, the 4 TPUs used for game playing are roughly equivalent to two Nvidia A100s, which is the GPU compute currently available in the TCEC competition.
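
For a sense of scale, the training figures quoted above can be turned into rough totals. This is only back-of-the-envelope arithmetic on the numbers in the papers (700,000 steps, mini-batches of 4,096, approximately 9 hours for chess). The function names are made up for illustration, and the result counts positions sampled for training, which is not necessarily the number of distinct positions seen, since the quoted text does not say how often positions repeat across mini-batches.

```python
# Back-of-the-envelope arithmetic on the training figures quoted above.
# All constants come from the quoted paper excerpts; the helper names
# are hypothetical and exist only for this sketch.

STEPS = 700_000            # training steps reported in both papers
BATCH_SIZE = 4_096         # positions per mini-batch
CHESS_TRAINING_HOURS = 9   # "approximately 9 hours in chess" (2018 paper)


def positions_sampled(steps: int, batch_size: int) -> int:
    """Total positions fed to the trainer (not necessarily distinct)."""
    return steps * batch_size


def steps_per_second(steps: int, hours: float) -> float:
    """Average training steps per second over the whole run."""
    return steps / (hours * 3600)


total = positions_sampled(STEPS, BATCH_SIZE)
rate = steps_per_second(STEPS, CHESS_TRAINING_HOURS)

print(f"{total:,} positions sampled")                  # 2,867,200,000
print(f"{rate:.1f} steps/s")                           # 21.6
print(f"{rate * BATCH_SIZE:,.0f} positions/s")         # 88,494
```

So the chess run works out to roughly 2.9 billion sampled positions at an average of about 88 thousand positions per second, taking the "approximately 9 hours" at face value.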