Re: Reinforcement Learning (RL) in real time paradigm

Posted: Tue Jan 15, 2019 7:50 pm
by Michael Sherwin
Guenther wrote:
Tue Jan 15, 2019 4:02 pm
Rein Halbersma wrote:
Tue Jan 15, 2019 3:52 pm

A similar idea has been proposed before, I think: https://papers.nips.cc/paper/3722-boots ... search.pdf
It must be over 10 years since I last saw Joel's name mentioned anywhere. Thanks for the paper.
https://www.chessprogramming.org/Bodo
I didn't even know he created 'Meep', because that was shortly before a long hiatus on my side.
https://www.chessprogramming.org/Meep

Here is one of the shorter exchanges between Michael and Bob in the WB forum about Romi's learning:
http://www.open-aurec.com/wbforum/viewt ... f=4&t=4835
Thanks Guenther, that WB forum thread says it pretty well! :D

Re: Reinforcement Learning (RL) in real time paradigm

Posted: Thu Jan 17, 2019 6:05 pm
by PK
Michael, I will be glad if you prove me wrong. But even then I would look for ways to reduce game length. You want to feed the transposition table with additional information, trusting that it will help to shape the final search. The idea looks good, as long as this information has a chance of being accessed. If you search 20 plies ahead, then entries from ply 40 will be of no use.

Re: Reinforcement Learning (RL) in real time paradigm

Posted: Thu Jan 17, 2019 11:43 pm
by Michael Sherwin
PK wrote:
Thu Jan 17, 2019 6:05 pm
Michael, I will be glad if you prove me wrong. But even then I would look for ways to reduce game length. You want to feed the transposition table with additional information, trusting that it will help to shape the final search. The idea looks good, as long as this information has a chance of being accessed. If you search 20 plies ahead, then entries from ply 40 will be of no use.
There is also backpropagation from ply 40 down toward the root, and that is why RL gets fantastic results. If it gets done, it will have to be by someone else, for personal health reasons. :( Thanks for your input! :D
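
To make the backpropagation point concrete, here is a minimal sketch in Python. It is not taken from any engine discussed in this thread. The names (backpropagate, reachable, rate, decay), the string position keys, and the update rule are all assumptions chosen for illustration. It only shows how a result seen at ply 40 can leave a trace in entries that a 20-ply search from the root can still reach.

```python
from typing import Dict, List


def backpropagate(table: Dict[str, float], line: List[str], result: float,
                  rate: float = 0.5, decay: float = 0.95) -> None:
    """Nudge the stored value of every position on the played line toward
    the final result, starting at the last position and walking to the root.

    Assumption: values are kept from the first player's point of view, and
    `result` is +1.0 (win), 0.0 (draw) or -1.0 (loss) for that player.
    """
    weight = 1.0
    for key in reversed(line):
        old = table.get(key, 0.0)
        table[key] = old + rate * weight * (result - old)
        weight *= decay  # positions farther from the end get a smaller update


def reachable(line: List[str], horizon: int) -> List[str]:
    """Positions on the line that a search of `horizon` plies from the root can reach."""
    return line[:horizon + 1]


# One hypothetical game line, plies 0..40, stored in a dict that stands in
# for the transposition table.
line = [f"pos{ply}" for ply in range(41)]
table: Dict[str, float] = {}
backpropagate(table, line, result=1.0)

# The entry for ply 40 itself is out of reach for a 20-ply search,
# but the entries near the root now carry part of its result.
for key in reachable(line, horizon=20)[:3]:
    print(key, round(table[key], 3))
```

In this sketch the update shrinks with distance from the end of the game, so the root entries move only slightly after one game and would need repeated games to accumulate a useful signal.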