I forgot to mention that using a 'CEGT' book will override Romi's book until the CEGT book is done; then Romi will start using her book and learning after that.
Michael Sherwin wrote:
Hi Werner,
Werner wrote:
Hi Mike,
so if I include all CEGT games into the folder, merge them, and repeat a match played earlier, will I get a much better result?
- original games were played without learning
- now learning is switched on and the match repeated
I will try it, but I do not think the engine uses this learn file.
Settings here
learn_on
book_off
quit
I also do not know how to use this command from the help.txt inside Romi:
douselearn ?
best wishes
Werner
To have the learning fully enabled, type learn_on, then book_on, then quit. Learning is part of the book structure. If learning was on all this time then the learn.dat file should be quite large by now. Just loading pgn files will give Romi a better result, but for the best result Romi needs many games to gain direct experience with the lines and start selecting the lines that are better for Romi.
AlphaGo Zero And AlphaZero, RomiChess done better
Moderators: hgm, Rebel, chrisw
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: AlphaGo Zero And AlphaZero, RomiChess done better
[quote="Michael Sherwin"]
I am most skeptical about interview-style reporting. Reporters will often take a few thin facts and weave a whole story around them, with lots of inaccuracies and often outright made-up garbage.
I am less skeptical about papers written directly by the authors. Still, I'm a bit skeptical because they do not always tell all.
The reason I started this thread is that I looked at the games, and the moves by Alpha0 had the same learned feel to them that Romi's moves have when Romi would win against a superior opponent.
[/quote]
This is always the situation when the press and money meet.
Based on the similarity between the moves of Romi and the moves of AlphaZero,
one might think that AlphaZero produced a kind of "learning file" that was then used by a normal chess engine - maybe a derivative of Stockfish...
Or is this a very absurd supposition?
-
- Posts: 4185
- Joined: Tue Mar 14, 2006 11:34 am
- Location: Ethiopia
Re: AlphaGo Zero And AlphaZero, RomiChess done better
I don't know how this thread evolved into a serious one, because I thought it was meant as a tongue-in-cheek comparison.
Are you really equating Romi's learning (as far as I know it is just standard book learning) with what AlphaZero is doing?
To be precise, AlphaZero uses a unique search (MCTS) which is very selective; it orders moves with a deep NN and evaluates positions also with a deep NN. This deep NN is a generic network that can be used in any kind of position, unlike book-learning schemes that avoid particular moves. So I don't see how this unique approach is the same as Romi's.
If you are talking about learning in general, yes, it has been used in chess before AlphaZero, the most prominent and general example being TD-lambda.
Daniel
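For readers who have not met TD-lambda: below is a minimal, tabular sketch of the TD(λ) value update with accumulating eligibility traces. All function names, parameter values, and the toy trajectory are invented for illustration; this is not code from any engine mentioned in the thread.

```python
# Tabular TD(lambda) value update with eligibility traces (illustrative only).
def td_lambda_update(values, traces, trajectory, rewards,
                     alpha=0.1, gamma=1.0, lam=0.7):
    """Update state values in place after observing a trajectory of states
    and the reward received on each transition."""
    for t in range(len(trajectory) - 1):
        s, s_next = trajectory[t], trajectory[t + 1]
        # TD error: reward plus discounted next value, minus current value.
        delta = rewards[t] + gamma * values.get(s_next, 0.0) - values.get(s, 0.0)
        traces[s] = traces.get(s, 0.0) + 1.0          # accumulate trace for s
        for state in list(traces):
            # Every recently visited state shares in the error, scaled by
            # its trace, which then decays by gamma * lambda.
            values[state] = values.get(state, 0.0) + alpha * delta * traces[state]
            traces[state] *= gamma * lam
    return values

# Toy run: two transitions, reward 1.0 arriving only on the second.
values = td_lambda_update({}, {}, ["a", "b", "c"], [0.0, 1.0])
```

The point of the traces is that the late reward is credited not only to "b" (the state just before it) but also, with decayed weight, to "a" earlier in the line, which is what makes TD(λ) relevant to learning whole opening lines from game results.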
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
I am not addressing the Alpha0 playing algorithm. I understand that it is massively parallel MCTS. That alone makes it far different from Stockfish. I'm not skeptical enough about the reporting to believe that someone is lying about the underlying algorithm.
corres wrote:
This is always the situation when the press and money meet.
Michael Sherwin wrote:
I am most skeptical about interview-style reporting. Reporters will often take a few thin facts and weave a whole story around them, with lots of inaccuracies and often outright made-up garbage.
I am less skeptical about papers written directly by the authors. Still, I'm a bit skeptical because they do not always tell all.
The reason I started this thread is that I looked at the games, and the moves by Alpha0 had the same learned feel to them that Romi's moves have when Romi would win against a superior opponent.
Based on the similarity between the moves of Romi and the moves of AlphaZero,
one might think that AlphaZero produced a kind of "learning file" that was then used by a normal chess engine - maybe a derivative of Stockfish...
Or is this a very absurd supposition?
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
This is an old argument about Romi's learning being "standard book learning". It is not. The reinforcement learning in RomiChess is stored in a tree structure that doubles as an opening book. That much is true. However, the subtree for the current position is loaded into the hash table before each search, along with the learned rewards and penalties earned from previous results. These learned values affect which move the search decides is best. That has nothing to do with an opening book!
Daniel Shawul wrote:
I don't know how this thread evolved into a serious one, because I thought it was meant as a tongue-in-cheek comparison.
Are you really equating Romi's learning (as far as I know it is just standard book learning) with what AlphaZero is doing?
To be precise, AlphaZero uses a unique search (MCTS) which is very selective; it orders moves with a deep NN and evaluates positions also with a deep NN. This deep NN is a generic network that can be used in any kind of position, unlike book-learning schemes that avoid particular moves. So I don't see how this unique approach is the same as Romi's.
If you are talking about learning in general, yes, it has been used in chess before AlphaZero, the most prominent and general example being TD-lambda.
Daniel
I don't know if tongue in cheek is the correct terminology for what I intended, but I was not seriously claiming that Alpha0 was very similar to RomiChess at all. I was just making the point that, after looking at A0's moves, given my experience, the learned moves had the same look and feel that Romi's learning produced. That led me to the conclusion that A0 pretrained for the match against SF, or at a minimum loaded and learned against SF games. Some posts above seem to verify that observation. I did not read any white papers on A0. I only read some reports by journalists. All I was trying to do was demystify somewhat the phenomenon that is A0.
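As a rough, hypothetical sketch of the scheme Sherwin describes (not RomiChess source code, and the reward numbers are invented): game results propagate rewards and penalties back along the moves actually played, and before each search the stored values under the current line are handed to the engine as score adjustments, the way Romi preloads its learned subtree into the hash table.

```python
# Illustrative sketch of result-driven move learning (not RomiChess code).
from collections import defaultdict

class LearnTree:
    def __init__(self):
        # Map a line (tuple of moves from the root) to an accumulated score.
        self.reward = defaultdict(int)

    def record_game(self, moves, result, win=4, loss=-4):
        """Propagate a reward or penalty along every move of a finished game.
        Draws are omitted here for brevity."""
        bonus = win if result == "1-0" else loss
        for ply in range(1, len(moves) + 1):
            # Credit White's moves and debit Black's when White wins,
            # and vice versa when Black wins.
            sign = 1 if ply % 2 == 1 else -1
            self.reward[tuple(moves[:ply])] += sign * bonus

    def subtree_bias(self, played):
        """Score adjustments for the moves that follow the current line,
        as a search might preload them into its hash table."""
        prefix = tuple(played)
        return {path[-1]: score for path, score in self.reward.items()
                if path[:-1] == prefix}

tree = LearnTree()
tree.record_game(["e4", "e5", "Qh5", "Nc6", "Bc4", "g6"], "1-0")
bias = tree.subtree_bias(["e4"])
```

The key difference from plain book learning is visible in `subtree_bias`: the learned scores are not used to veto book moves, they bias the search's own evaluation of the successor positions, so the effect persists beyond the book.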
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
And I would like to add that if SF had the same type of reinforcement learning that Romi has, then a trained SF could be 400 (maybe 1000) Elo or more higher than in its untrained state. That is, after a million games had been played by SF. It would take a cooperative effort, and a way to merge learn files, if it were to be done in a reasonable amount of time. SF would have to include WB protocol, or UCI would have to add a result command if it has not done that already in the last decade. And of course I suspect that the SF team could improve on my base algorithm without difficulty.
Michael Sherwin wrote:
This is an old argument about Romi's learning being "standard book learning". It is not. The reinforcement learning in RomiChess is stored in a tree structure that doubles as an opening book. That much is true. However, the subtree for the current position is loaded into the hash table before each search, along with the learned rewards and penalties earned from previous results. These learned values affect which move the search decides is best. That has nothing to do with an opening book!
Daniel Shawul wrote:
I don't know how this thread evolved into a serious one, because I thought it was meant as a tongue-in-cheek comparison.
Are you really equating Romi's learning (as far as I know it is just standard book learning) with what AlphaZero is doing?
To be precise, AlphaZero uses a unique search (MCTS) which is very selective; it orders moves with a deep NN and evaluates positions also with a deep NN. This deep NN is a generic network that can be used in any kind of position, unlike book-learning schemes that avoid particular moves. So I don't see how this unique approach is the same as Romi's.
If you are talking about learning in general, yes, it has been used in chess before AlphaZero, the most prominent and general example being TD-lambda.
Daniel
I don't know if tongue in cheek is the correct terminology for what I intended, but I was not seriously claiming that Alpha0 was very similar to RomiChess at all. I was just making the point that, after looking at A0's moves, given my experience, the learned moves had the same look and feel that Romi's learning produced. That led me to the conclusion that A0 pretrained for the match against SF, or at a minimum loaded and learned against SF games. Some posts above seem to verify that observation. I did not read any white papers on A0. I only read some reports by journalists. All I was trying to do was demystify somewhat the phenomenon that is A0.
However, it does not have to be SF. It could be anyone. So I am really baffled that nobody has done that in the last 11+ years!
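The cooperative "merge learn files" idea above could, in the simplest case, amount to summing the accumulated scores per line across contributors. A hypothetical sketch follows; the in-memory table format (move path to score) is invented for illustration, not Romi's actual file layout.

```python
# Hypothetical merge of two learn tables, each mapping a move path (a tuple
# of moves from the start position) to an accumulated reward/penalty score.
def merge_learn(a, b):
    """Combine two learn tables by summing scores for shared lines and
    keeping unique lines from each side."""
    merged = dict(a)
    for path, score in b.items():
        merged[path] = merged.get(path, 0) + score
    return merged

# Two contributors' tables: one shared line, two unique lines.
file_a = {("e4", "e5"): 12, ("d4",): -3}
file_b = {("e4", "e5"): -5, ("c4",): 7}
merged = merge_learn(file_a, file_b)
```

Summing works because the learned values are additive tallies of results; a real merge tool would also have to reconcile whatever per-node metadata the engine's on-disk format carries.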
-
- Posts: 349
- Joined: Sat Aug 06, 2016 8:31 pm
- Location: United States
Re: AlphaGo Zero And AlphaZero, RomiChess done better
My understanding is that its training was only via self-play, starting from a blank slate, i.e., knowing only the rules.
Michael Sherwin wrote:
That led me to the conclusion that A0 pretrained for the match against SF, or at a minimum loaded and learned against SF games. Some posts above seem to verify that observation. I did not read any white papers on A0. I only read some reports by journalists.
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
A quote from one of Milos' posts.
zenpawn wrote:
My understanding is that its training was only via self-play, starting from a blank slate, i.e., knowing only the rules.
Michael Sherwin wrote:
That led me to the conclusion that A0 pretrained for the match against SF, or at a minimum loaded and learned against SF games. Some posts above seem to verify that observation. I did not read any white papers on A0. I only read some reports by journalists.
"When starting from each human opening,
AlphaZero convincingly defeated Stockfish, suggesting that it has indeed mastered a wide spectrum of chess play."
This is evidence of pre-match training against SF. How many human opening positions were trained against? Here is more of the quote.
"Finally, we analysed the chess knowledge discovered by AlphaZero. Table 2 analyses the
most common human openings (those played more than 100,000 times in an online database of human chess games)."
So we not only have pre-match training against SF, but they also used the most common human-played positions to conduct that training.
So my original observation, based on my experience with reinforcement learning, that they must have used a human database and pre-match training against SF appears to be quite accurate.
-
- Posts: 349
- Joined: Sat Aug 06, 2016 8:31 pm
- Location: United States
Re: AlphaGo Zero And AlphaZero, RomiChess done better
I took those to be games played after the self-play training, or at least not used for learning. The thing is called Zero for the very reason that it doesn't start with a database of games.
Michael Sherwin wrote:
A quote from one of Milos' posts.
zenpawn wrote:
My understanding is that its training was only via self-play, starting from a blank slate, i.e., knowing only the rules.
Michael Sherwin wrote:
That led me to the conclusion that A0 pretrained for the match against SF, or at a minimum loaded and learned against SF games. Some posts above seem to verify that observation. I did not read any white papers on A0. I only read some reports by journalists.
"When starting from each human opening,
AlphaZero convincingly defeated Stockfish, suggesting that it has indeed mastered a wide spectrum of chess play."
This is evidence of pre-match training against SF. How many human opening positions were trained against? Here is more of the quote.
"Finally, we analysed the chess knowledge discovered by AlphaZero. Table 2 analyses the
most common human openings (those played more than 100,000 times in an online database of human chess games)."
So we not only have pre-match training against SF, but they also used the most common human-played positions to conduct that training.
So my original observation, based on my experience with reinforcement learning, that they must have used a human database and pre-match training against SF appears to be quite accurate.
-
- Posts: 3657
- Joined: Wed Nov 18, 2015 11:41 am
- Location: hungary
Re: AlphaGo Zero And AlphaZero, RomiChess done better
[quote="Michael Sherwin"]
I am not addressing the Alpha0 playing algorithm. I understand that it is massively parallel MCTS. That alone makes it far different from Stockfish. I'm not skeptical enough about the reporting to believe that someone is lying about the underlying algorithm.
[/quote]
Sorry, but no one spoke about lying.
As you also stated, the AlphaZero team gave the public very little information about the details.
From the texts we know that massively parallel MCTS was used during the learning process. But for playing against Stockfish, it is very doubtful that MCTS was used, I think.