So if Romi would have trained against Crafty, 100 games in every position that occurred in a human database 10,000 times or more, how do you think Romi would have done against Crafty in a follow-up match if Crafty used its tournament book?

Rodolfo Leoni wrote: But against Crafty that specific position was.... the start position!

Michael Sherwin wrote: Hi Rodolfo! Yes, I remember those experiments. Starting from a new learn file, Romi was able to win 100-game matches against both Rybka and Crafty when starting from a specific position. Thanks for reminding me!

Rodolfo Leoni wrote: Hi Mike,

Michael Sherwin wrote: In January of 2006, IIRC (not exactly sure), I released RomiChess ver P2a. The new version had learning. It had two types of learning: monkey see, monkey do, and learning adapted from Pavlov's dog experiments. I did not know it at the time, but the second type of learning is called reinforcement learning. I found out only very recently that reinforcement learning was invented for robotics control in 1957, the year that I was born. Strange. Anyway, as far as I know I reinvented it and was the first to put reinforcement learning into a chess program. The reason I'm apparently patting myself on the back is rather to let people know that I recognise certain aspects of this AlphaZero phenomenon. For example, using Glaurung 2.x as a test opponent, Romi played 20 matches against Glaurung using the ten Nunn positions. On pass one Romi scored 5% against Glaurung. On the 20th pass Romi scored 95%. That is how powerful the learning is! The moves that Romi learned to beat Glaurung were very distinctive looking. They are learned moves, so they are not determined by a natural chess-playing evaluation but rather by an evaluation tweaked by learned rewards and penalties. Looking at the games between AlphaZero and Stockfish, I see the same kind of learned moves.
In RomiChess one can start with a new learn.dat file, put millionbase.pgn in the same directory as Romi, type merge millionbase.pgn, and Romi will learn from all those games. Most of what has been written about AlphaZero is made-up reporting. That is what reporters do: they take one or two known facts and make up a many-page article that is mostly bunk. The AlphaZero team has released very little actual information. They disclosed that it uses reinforcement learning and that a database of games was loaded in. Beyond that, not much is known. But looking at the games against Stockfish, it looks as though AlphaZero either trained against Stockfish before the recorded match or was fed a pgn of Stockfish games. Stockfish does have some randomness in its move selection, so it can't be totally dominated the way Romi dominated Glaurung, which had no randomness. So basically, take an engine about as strong as Stockfish, give it reinforcement learning, and the result is exactly as expected!
It's always a pleasure to see you.
Don't forget the matches Romi-Rybka on a theme variation and Romi-Crafty on full standard games... Romi won all of them in 100-game matches, with an empty learning file.
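The reward/penalty learning described in the post above can be sketched in a few lines: after each game, every position on the played line gets a bonus or penalty, which then biases the search evaluation in later games. This is a toy illustration of the idea, not RomiChess's actual code; the reward values, position keys, and function names are all assumptions.

```python
# Toy sketch of Pavlov-style reinforcement as described above: reward or
# penalise every position along a played line, then let the search see the
# accumulated bias. All names and constants here are illustrative
# assumptions, not RomiChess's real implementation.

WIN_REWARD = 4      # assumed centipawn-style bonus
LOSS_PENALTY = -4

learn_file = {}     # position key -> accumulated learning bonus

def update_line(position_keys, result):
    """After a game, adjust every position on the played line."""
    reward = WIN_REWARD if result == "win" else LOSS_PENALTY
    for key in position_keys:
        learn_file[key] = learn_file.get(key, 0) + reward

def adjusted_eval(static_eval, key):
    """The search uses the static eval plus the learned bias."""
    return static_eval + learn_file.get(key, 0)

# Example: a win along three positions, then a loss along two.
update_line(["p1", "p2", "p3"], "win")
update_line(["p1", "p4"], "loss")
print(learn_file["p1"])   # rewards cancel: 0
print(learn_file["p2"])   # 4
```

Merging a PGN database ("monkey see, monkey do") would amount to running `update_line` over every game in the file, so previously successful lines start out with a positive bias.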
AlphaGo Zero And AlphaZero, RomiChess done better
Moderators: hgm, Rebel, chrisw
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
-
- Posts: 545
- Joined: Tue Jun 06, 2017 4:49 pm
- Location: Italy
Re: AlphaGo Zero And AlphaZero, RomiChess done better
Easy: +100 =0 -0.

Michael Sherwin wrote: So if Romi would have trained against Crafty, 100 games in every position that occurred in a human database 10,000 times or more, how do you think Romi would have done against Crafty in a follow-up match if Crafty used its tournament book? ...
We shouldn't forget those were different times for computer chess. On single CPUs (deterministic chess) it's easier to find an opponent's weaknesses. With multicore engines it becomes a bit harder, because engines often change their PVs. So I guess Romi would win, but it would suffer some losses.
About AlphaZ, I think that's a hardware revolution, and engine strength (or learning) has nothing to do with the result. It's a different way to build software, a different pattern of evaluation, and a kind of learning which is much more similar to KnightCap's than to any other. With a difference: in those days, KnightCap's learning could never work.
A match AlphaZ vs. Stockfish 9 (when released) would have been far more interesting, if you gave SF9 some learning features. Romi style or Critter style, it doesn't matter. We'd have learning vs. learning in a match between engines of similar level. Or maybe SF9 would have been strong enough to win the match anyway...
We'll never know, because that was mere marketing, so they needed to win... That doesn't mean the product is bad. It's probably great, but if you want to sell a great (and expensive) product you need to do a lot of advertising about an unbelievable performance. So you spend a lot of money because you want to earn a lot more.
Just two, max three cents.
F.S.I. Chess Teacher
-
- Posts: 493
- Joined: Wed Mar 15, 2006 6:13 am
- Location: Curitiba - PR - BRAZIL
Re: AlphaGo Zero And AlphaZero, RomiChess done better
This is not a novelty.
R.J. Fischer did that a long time ago.
He learned and memorized all of Spassky's games. The result everybody knows.
A. Ponti
AMD Ryzen 1800x, Windows 10.
FIDE current ratings: standard 1913, rapid 1931
-
- Posts: 5
- Joined: Tue Jun 20, 2017 12:23 pm
Re: AlphaGo Zero And AlphaZero, RomiChess done better
If you really went from 5% to 95%, that's obviously overfitting.
Anyone can achieve 100% against a deterministic adversary
-
- Posts: 545
- Joined: Tue Jun 06, 2017 4:49 pm
- Location: Italy
Re: AlphaGo Zero And AlphaZero, RomiChess done better
That's why Critter (2010) and Stockfish PA GTB (2014) have been the last engines with a structured learning system. The non-deterministic behavior of multi-CPU engines made position learning almost useless. Almost. I recently posted the result of a match SF PA GTB vs. Asmfish (and I presume there was a huge ELO difference, more than 200 IMO). In a 100-game match, SF PA GTB won 52-48 from a theme position, and I wrote that a one-million-game learning file would suffice to win a match from the standard start position too. A deep and strongly "pruned" opening book would be needed in that case. But there is an inconvenience: if you change the match conditions (e.g. learning was performed at 1 sec/move and the match is at 10 secs/move, or the opponent changes its opening book), then all that position learning becomes useless. That's why the RomiChess learning system is more effective for matches: in fact, it's not classic position learning. I'd define it a "book deep learning".

Akababa wrote: Anyone can achieve 100% against a deterministic adversary
As I said, it's not a matter of software. AlphaZ's learning couldn't compete with Romi's learning on traditional hardware. We can discuss the engines' strength difference, though. If AlphaZ (with an empty learning file) is as strong as SF8, Romi couldn't compete with it because of the ELO difference. But if somebody gave SF8 the Romi learning system and matched it against a rewrite of AlphaZ for Windows (trying to keep its learning system), then the result could be quite embarrassing for the Google team...
This is NOT to criticize AlphaZ and their work. It's really a great result, and there'll be a new frontier of computer chess science. A pity that frontier will be for few people until hardware prices become reasonable.
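The classic position learning contrasted above (the Critter / SF PA GTB style) is essentially a hash-keyed table of previously searched scores, which is only trustworthy under conditions at least as strong as the ones that produced it. A minimal sketch, assuming a simple score-plus-depth entry; the structure and field names are illustrative assumptions, not either engine's real format:

```python
# Minimal sketch of classic position learning: a hash-keyed table of
# previously searched scores. An entry is only reused when its recorded
# depth is at least the depth we would search now, which is why a file
# built at 1 sec/move says little at 10 secs/move, as noted above.
# The structure and field names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class LearnEntry:
    score: int   # centipawns from the side to move
    depth: int   # search depth that produced the score

table = {}

def store(pos_hash, score, depth):
    """Keep only the deepest result seen for each position."""
    old = table.get(pos_hash)
    if old is None or depth >= old.depth:
        table[pos_hash] = LearnEntry(score, depth)

def probe(pos_hash, needed_depth):
    """Return a learned score only if it came from a deep-enough search."""
    entry = table.get(pos_hash)
    if entry is not None and entry.depth >= needed_depth:
        return entry.score
    return None

store(0xABCD, 35, 18)
print(probe(0xABCD, 12))   # 35: the stored search was deep enough
print(probe(0xABCD, 24))   # None: match conditions exceed the learning run
```

The second probe illustrates the inconvenience described above: once the match searches deeper than the learning run did, the stored entries stop being usable.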
F.S.I. Chess Teacher
-
- Posts: 2283
- Joined: Sat Jun 02, 2012 2:13 am
Re: AlphaGo Zero And AlphaZero, RomiChess done better
I suggested introducing learning functionality into Komodo several times over the last few years, and all I got was a "we'll consider it" type of answer. I specifically gave Critter and SF_PA_GTB as examples of how it could be done, and that was freeware. They're even on good terms with Jesse and R. Vida, so they could reach out for help on that one, if necessary.

Rodolfo Leoni wrote: That's why Critter (2010) and Stockfish PA GTB (2014) have been the last engines with a structured learning system. ... That's why the RomiChess learning system is more effective for matches: in fact, it's not classic position learning. I'd define it a "book deep learning".
It looks like learning has been sorely neglected by both the programming and testing communities for many years (since it would distort the other ratings). However, one could test two instances of the same program, one with and one without learning. One could even remove the learning games from the rating list afterwards, so as to avoid distortions.
It probably could be done without tying up too many resources, since the engines that have learning features are quite few: Critter, Baron, Phalanx, RomiChess, of course, and maybe a few others.
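The proposed test (same engine with and without learning, with the learning-phase games excluded from the rating sample) could be organised roughly like this. Everything here is a hypothetical stand-in: the simulated game function, the warm-up length, and the improvement rate are placeholders, not real engine data.

```python
# Hypothetical harness for the test proposed above: play N games between a
# learning and a non-learning instance of the same engine, but rate only
# the games played after a warm-up phase, so the construction of the
# learning file does not distort the sample. All names are placeholders.

import random

random.seed(1)

def play_game(learn_bonus):
    """Stand-in for a real engine game; learning shifts the win odds."""
    return 1 if random.random() < 0.5 + learn_bonus else 0

def run_match(total_games, warmup_games):
    rated = []
    bonus = 0.0
    for game in range(total_games):
        result = play_game(bonus)
        bonus = min(0.3, bonus + 0.005)   # learner slowly improves
        if game >= warmup_games:          # drop warm-up games from rating
            rated.append(result)
    return sum(rated), len(rated)

wins, games = run_match(200, 100)
print(f"rated sample: {wins}/{games} for the learning instance")
```

The point of the warm-up cutoff is exactly the one made above: the games that build the learning file are discarded, and only the steady-state strength difference enters the rating.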
Thanks for bringing it up again, Rodolfo! Learning in chess is always relevant.
Regards,
CL
-
- Posts: 3196
- Joined: Fri May 26, 2006 3:00 am
- Location: WY, USA
- Full name: Michael Sherwin
Re: AlphaGo Zero And AlphaZero, RomiChess done better
Hi Rodolfo, I have never once openly disagreed with anything that you have said, so please do not get upset with me, but I disagree with one point. Romi's learning is a bit more than position learning. When Romi learns, nodes higher in the tree are affected the most and change sooner. However, as more results come in, the moves at the root get better defined. So, for example, Romi will choose between 1.e4 and 1.d4, whichever gives Romi a better result. That is true from any node in the tree. That is a permanent gain. It may not help win matches against god engines, but it will help Romi gain several classes in strength against her contemporaries, as demonstrated in Leo's class tournaments, where Romi gained two classes and was about to gain a third. And that was based on just 100 to 200 played games!

Rodolfo Leoni wrote: That's why Critter (2010) and Stockfish PA GTB (2014) have been the last engines with a structured learning system. ... That's why the RomiChess learning system is more effective for matches: in fact, it's not classic position learning. I'd define it a "book deep learning".
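The effect described above (nodes deep in the tree reacting quickly, root moves settling only as results accumulate) falls out naturally if every prefix of a played line accumulates the game result. A toy model, with the keys and update rule as assumptions rather than Romi's actual scheme:

```python
# Toy model of the effect described above: every node on a played line
# accumulates the game result, so a node visited only once flips
# immediately, while the root move's average only drifts as many results
# come in. Keys and the update rule are illustrative assumptions.

stats = {}   # move-path key -> (wins, games)

def backpropagate(path, won):
    """Credit the result to every prefix of the played move sequence."""
    for i in range(1, len(path) + 1):
        key = tuple(path[:i])
        w, g = stats.get(key, (0, 0))
        stats[key] = (w + (1 if won else 0), g + 1)

def score(key):
    w, g = stats[key]
    return w / g

# Two games starting 1.e4, one win and one loss, diverging at move two.
backpropagate(["e4", "e5"], won=True)
backpropagate(["e4", "c5"], won=False)
print(score(("e4",)))         # root move: 0.5, needs more games to settle
print(score(("e4", "e5")))    # deeper node: 1.0 after a single game
```

At the root, choosing between 1.e4 and 1.d4 by the better learned score is then just an argmax over the stored move statistics.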
If you are on a sidewalk and the covid goes beep beep
Just step aside or you might have a bit of heat
Covid covid runs through the town all day
Can the people ever change their ways
Sherwin the covid's after you
Sherwin if it catches you you're through
-
- Posts: 545
- Joined: Tue Jun 06, 2017 4:49 pm
- Location: Italy
Re: AlphaGo Zero And AlphaZero, RomiChess done better
Hi Mike, that's not a disagreement at all. That's what I tried to say.

Michael Sherwin wrote: Hi Rodolfo, I have never once openly disagreed with anything that you have said, so please do not get upset with me, but I disagree with one point. Romi's learning is a bit more than position learning. ...

Rodolfo Leoni wrote:
............................ That's why Romichess learning system is more effective for matches: in fact, it's not a classic position learning. I'd define it a "book deep learning".
..................................................
F.S.I. Chess Teacher
-
- Posts: 545
- Joined: Tue Jun 06, 2017 4:49 pm
- Location: Italy
Re: AlphaGo Zero And AlphaZero, RomiChess done better
I think several things are going to happen. Google hardware (if cheap enough) will force some kind of learning encoding on every programmer who wants to stay on top. If there's to be a new StockfishZ, KomodoZ, or HoudiniZ, then it can't be avoided. But until the new hardware is available at reasonable cost, I guess nothing will change. I see no logic in introducing such a system while still working with the "old" traditional hardware. Programmers would need to rewrite it (and the whole engine, IMO) to adapt it to the new hardware.

carldaman wrote: I suggested introducing learning functionality into Komodo several times over the last few years, and all I got was a "we'll consider it" type of answer. ...
I think it'll be a matter of YEARS. And there's another possible scenario too: this hardware, maybe, will never be available.
One thing is 100% sure: when Google hardware becomes available, correspondence chess will die. What's the point in playing games if the learning feature is advanced to the point where it only plays perfect games?
F.S.I. Chess Teacher
-
- Posts: 5566
- Joined: Tue Feb 28, 2012 11:56 pm
Re: AlphaGo Zero And AlphaZero, RomiChess done better
There is no reason to think that AlphaZero plays only perfect games. It might not even know about the wrong corner in KBNK. (Does that come up often enough in 44 million games for AlphaZero to figure it out? Maybe not, maybe it does, but even if it does there will be other patterns it won't have seen often enough.)

Rodolfo Leoni wrote: One thing is 100% sure: when Google hardware becomes available, correspondence chess will die. What's the point in playing games if the learning feature is advanced to the point where it only plays perfect games?
Show it one of those positions that are easy for humans, that fool all top engines and that never come up in real games, and I guarantee you AlphaZero will not have a clue either.