AlphaGo Zero And AlphaZero, RomiChess done better

Discussion of anything and everything relating to chess playing software and machines.

Moderators: hgm, Rebel, chrisw

Michael Sherwin
Posts: 3196
Joined: Fri May 26, 2006 3:00 am
Location: WY, USA
Full name: Michael Sherwin

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by Michael Sherwin »

Michael Sherwin wrote: That led me to the conclusion that A0 pretrained for the match against SF, or at a minimum loaded and learned against SF games. Some posts above seem to verify that observation. I did not read any white papers on A0. I only read some reports by journalists.
zenpawn wrote: My understanding is its training was only via self-play starting from a blank slate, i.e., knowing only the rules.
Michael Sherwin wrote: A quote from one of Milos's posts.

"When starting from each human opening, AlphaZero convincingly defeated Stockfish, suggesting that it has indeed mastered a wide spectrum of chess play."

This is evidence of pre-match training against SF. How many human opening positions were trained against? Here is more of the quote.

"Finally, we analysed the chess knowledge discovered by AlphaZero. Table 2 analyses the most common human openings (those played more than 100,000 times in an online database of human chess games"

So we not only have pre-match training against SF, but they used the most commonly played human positions to conduct that training.

So my original observation, based on my experience with reinforcement learning, that they must've used a human database and pre-training against SF appears to be quite accurate.
zenpawn wrote: I took those to be games played after the self-play training, or at least not used to learn. The thing is called Zero for the very reason that it doesn't start with a database of games.
Then the first quote is a poorly constructed sentence, as it clearly says that A0 defeated SF from EACH human opening. The second quote defines what is meant by each human opening: every position that occurred at least 100,000 times in an online database of human games. So my question is, were all those human opening positions covered in the 100-game match? Not even close! So unless the first quote is just poor sentence construction, A0 played training matches from all opening positions that appeared in an online database 100,000 times or more.

But if you understand that sentence to mean something different then just go with that! But for me it does not change what the sentence actually says. And that is not my fault. They should clarify the issue.
zenpawn
Posts: 349
Joined: Sat Aug 06, 2016 8:31 pm
Location: United States

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by zenpawn »

Michael Sherwin wrote: But if you understand that sentence to mean something different then just go with that! But for me it does not change what the sentence actually says. And that is not my fault. They should clarify the issue.
Agreed, perhaps some room for interpretation. We'll have to wait and see how the final paper looks after review.
Tobber
Posts: 379
Joined: Fri Sep 28, 2012 5:53 pm
Location: Sweden

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by Tobber »

Michael Sherwin wrote:
Then the first quote is a poorly constructed sentence, as it clearly says that A0 defeated SF from EACH human opening. The second quote defines what is meant by each human opening: every position that occurred at least 100,000 times in an online database of human games. So my question is, were all those human opening positions covered in the 100-game match? Not even close! So unless the first quote is just poor sentence construction, A0 played training matches from all opening positions that appeared in an online database 100,000 times or more.

But if you understand that sentence to mean something different then just go with that! But for me it does not change what the sentence actually says. And that is not my fault. They should clarify the issue.
How is it possible you can't read the published paper? It clearly says that A0 defeated Stockfish in 100 games under certain conditions, i.e. 1 minute per move and so on. They also matched A0 against Stockfish with 100 games on each of the so-called human openings. The time per move for these games is unknown but likely much shorter. It's obvious from Table 2 that it's 100 games per opening. Why this should be considered pre-training against Stockfish is certainly not obvious, especially if it took place after the 100-game main match. In the paper it's mentioned after the main match, but I guess it's more interesting with some conspiracy theory.

/John
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by corres »

Michael Sherwin wrote:
I did not read any white papers on A0. I only read some reports by journalist. All I was trying to do was demystify somewhat the phenomenon that is A0.

Nobody else reads white papers either, so you are doing well!
Thanks for it.
Tobber
Posts: 379
Joined: Fri Sep 28, 2012 5:53 pm
Location: Sweden

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by Tobber »

corres wrote:
Michael Sherwin wrote:
I did not read any white papers on A0. I only read some reports by journalist. All I was trying to do was demystify somewhat the phenomenon that is A0.
Nobody else reads white papers either, so you are doing well!
Thanks for it.
Sorry, I didn't mean anything against you; my opinion is that Mr. Sherwin is talking BS.

/John
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by corres »

Tobber wrote:
....I guess it's more interesting with some conspiracy theory.
/John
Conspiracy PRACTICE always was, is, and will be.
Particularly if a huge amount of money depends on it.
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by corres »

If you have exact knowledge about AlphaZero, more than a journalist has, please share it with us.
Mr. Sherwin has a view based on his learning and his practice, and I thank him for sharing it with us.
Tobber
Posts: 379
Joined: Fri Sep 28, 2012 5:53 pm
Location: Sweden

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by Tobber »

corres wrote: If you have exact knowledge about AlphaZero, more than a journalist has, please share it with us.
Mr. Sherwin has a view based on his learning and his practice, and I thank him for sharing it with us.
I can read the published paper, why don't you do the same?

/John
corres
Posts: 3657
Joined: Wed Nov 18, 2015 11:41 am
Location: hungary

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by corres »

It is a pity, but "white papers" are not given to the public, even a scientific public. Know-how, patents, trade secrets, details of how the system works, etc. are not the subject of any public papers.
zenpawn
Posts: 349
Joined: Sat Aug 06, 2016 8:31 pm
Location: United States

Re: AlphaGo Zero And AlphaZero, RomiChess done better

Post by zenpawn »

From the paper, "Starting from random play, and given no domain knowledge except the game rules,...". And: "The AlphaZero algorithm is a more generic version of the AlphaGo Zero algorithm that was first introduced in the context of Go. It replaces the handcrafted knowledge and domain-specific augmentations used in traditional game-playing programs with deep neural networks and a tabula rasa reinforcement learning algorithm."