Reinforcement Learning (RL) in real time paradigm

Post by **Michael Sherwin** » Tue Jan 15, 2019 8:50 pm

Guenther wrote: ↑Tue Jan 15, 2019 5:02 pm
Rein Halbersma wrote: ↑Tue Jan 15, 2019 4:52 pm
A similar idea has been proposed before I think: https://papers.nips.cc/paper/3722-boots ... search.pdf
It must be over 10 years since I have seen Joel's name mentioned somewhere; thanks for the paper.
https://www.chessprogramming.org/Bodo
I didn't even know he created 'Meep', because that was shortly before a long hiatus on my side.
https://www.chessprogramming.org/Meep

Here is one of the shorter exchanges between Michael and Bob in the WB forum about Romi's learning.
http://www.open-aurec.com/wbforum/viewt ... f=4&t=4835

Thanks Guenther, that WB forum thread says it pretty well!

Post by **PK** » Thu Jan 17, 2019 7:05 pm

Michael, I will be glad if you prove me wrong. But even then, I would look for ways to reduce game length. You want to feed the transposition table with additional information, trusting that it will help to shape the final search. The idea looks good, as long as this information has a chance to be accessed. If you search 20 plies ahead, then there will be no use for entries from ply 40.

Post by **Michael Sherwin** » Fri Jan 18, 2019 12:43 am

PK wrote: ↑Thu Jan 17, 2019 7:05 pm Michael, I will be glad if you prove me wrong. But even then, I would look for ways to reduce game length. You want to feed the transposition table with additional information, trusting that it will help to shape the final search. The idea looks good, as long as this information has a chance to be accessed. If you search 20 plies ahead, then there will be no use for entries from ply 40.

There is also backpropagation from ply 40 down toward the root, and that is why RL gets fantastic results. If it gets done, it will have to be done by someone else, for personal health reasons.

Thanks for your input!
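
For readers following the exchange, here is a rough sketch of the backpropagation point, not the actual RomiChess code. It assumes positions are plain string keys, the "transposition table" is a dict of learned score adjustments kept from White's point of view, and the names `backup_result`, `reachable_entries`, `rate` and `decay` are all hypothetical. It shows how a result known at ply 40 could leave a trace in entries that a 20-ply search from the root can still reach.

```python
# Hypothetical sketch: the table maps a position key to a learned adjustment.
# Scores are kept from White's point of view (an assumption for simplicity).

def backup_result(table, game_positions, result, rate=0.1, decay=0.95):
    """Propagate a final game result from the last position back toward the root.

    game_positions: position keys in the order played, root first.
    result: +1.0 for a White win, 0.0 for a draw, -1.0 for a White loss.
    rate, decay: illustrative constants, not taken from any real engine.
    """
    bonus = result * rate
    for key in reversed(game_positions):
        # Positions closer to the end of the game get the larger adjustment.
        table[key] = table.get(key, 0.0) + bonus
        bonus *= decay
    return table


def reachable_entries(table, game_positions, search_depth):
    """Return the learned entries on the game line within search_depth plies of the root."""
    horizon = game_positions[: search_depth + 1]
    return {key: table[key] for key in horizon if key in table}


# A 40-ply game (41 positions, ply 0 to ply 40) won by White.
positions = [f"pos{ply}" for ply in range(41)]
table = backup_result({}, positions, result=1.0)

# A 20-ply search never probes the ply-40 entry itself, but it can probe
# the entries at ply 0 to ply 20, which already carry the backed-up signal.
within_reach = reachable_entries(table, positions, search_depth=20)
print("pos40" in within_reach, within_reach["pos20"] > 0.0)
```

The decay factor is a design choice in this sketch: it makes the adjustment fade the further a position is from the final result, so positions near the root are nudged only slightly by any single game.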
