Well...

jdart wrote: Tue Jan 14, 2020 2:48 pm
I didn't get very far with this but I was using a fairly long time control. Shorter TC would give less of a draw rate and faster learning but also I think more randomness in the results.

Sounds like a weak move selection algorithm, or not using all the knowledge you have available, or too high an expectation of the learning rate, or a combination of all three.
I'm curious what time controls you were using. I think this plays a huge role in the learning rate per unit of time spent.
Did you start with an empty book and have the program build it from scratch? If you did, then starting with very short time controls is in order, for two reasons. The first is that you need the program to play as many games as possible to fill the book with leaf nodes. This takes a while and is best done without a GUI involved. A GUI will slow things down by an order of magnitude.

SF with a 1 ply search is well over 1200 Elo stronger than a random player. How much more is unknown, since in my tests the random player never managed even a draw. This is both fast enough and strong enough to fill the book with non-random leaf nodes pretty quickly. How quickly depends on how deep a book you want AND the rules for adding new nodes.

With a time control this short you “could” add all legal moves at less than a given depth from the root. That is a very loose add policy, though; once a depth of 2 to 4 plies has been reached, adding only something like the top 3 to 5 moves will bloat the book less. After you have a reasonably sized book you can start ramping up the search depth of your test games and change the rules for adding nodes to something a little less loose, like the top 2 moves, so you always have at least one alternative move at each node. By the time you reach a search depth of between 15 and 23 plies you should have a reasonable book that you can build on with more “normal” time controls.
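To make the add rules above a bit more concrete, here is a rough sketch in Python. Everything named here (AddPolicy, FILL, REFINE, moves_to_add) is hypothetical, and the specific numbers are just picked from the ranges I mentioned; it assumes your engine already hands you the legal moves ranked best-first.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AddPolicy:
    """Hypothetical container for one stage of the book-building schedule."""
    search_depth: int     # plies searched per move in the test games
    all_moves_below: int  # add every legal move at book plies below this
    top_n: int            # at deeper book plies, add only the N best moves


# Illustrative values only, taken from the ranges in the post.
FILL = AddPolicy(search_depth=1, all_moves_below=3, top_n=4)
REFINE = AddPolicy(search_depth=15, all_moves_below=0, top_n=2)


def moves_to_add(ranked_moves: Sequence[str], book_ply: int,
                 policy: AddPolicy) -> List[str]:
    """Pick which moves become new book nodes at a position.

    ranked_moves is assumed to be ordered best-first by the search.
    book_ply is the distance of the position from the book root.
    """
    if book_ply < policy.all_moves_below:
        return list(ranked_moves)            # loose: take everything
    return list(ranked_moves[:policy.top_n])  # tighter: best N only


moves = ["e4", "d4", "Nf3", "c4", "g3", "b3"]
print(moves_to_add(moves, 0, FILL))    # near the root: all moves
print(moves_to_add(moves, 6, FILL))    # deeper: top 4 only
print(moves_to_add(moves, 6, REFINE))  # later stage: top 2 only
```

Keeping top_n at 2 or more in the later stage is what guarantees an alternative move at every node.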
jdart wrote:
The problem is, I think even in the best case, you need a very large number of games to get convergence. In a lot of cases one move scores 60% and another 55%. That is significant but getting to the point where you see that difference is going to take time. You could also optimize for scores out of the search, which I think a lot of people have done, but IMO that is less reliable. For example, something like the Ruy Lopez Marshall Gambit where Black is a pawn down may give you minus scores, but most of the endgames are drawn.

While it's true that using search scores is somewhat less reliable, it's also orders of magnitude faster, at least to start with. If you're building from scratch or just adding new nodes, this should be the preferred method until you have enough experience to use a better one. So all your leaf nodes should start out being selected by search only. Only after some number of games have been played from that leaf node should you switch to some other method.
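To put a rough number on the 60% versus 55% example, here is a back-of-the-envelope calculation in Python using the standard two-sample normal approximation. The function name games_per_move and the 5% significance / 80% power defaults are my own assumptions, not anything from the quoted post. It treats each game as win/loss, which overstates the variance when there are draws, so the result is on the pessimistic side.

```python
import math
from statistics import NormalDist


def games_per_move(p1: float, p2: float,
                   alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate games needed for EACH move to tell scores p1 and p2 apart.

    Uses the normal approximation for comparing two proportions.
    alpha is the two-sided significance level, power the chance of
    detecting a real difference of this size.
    """
    if p1 == p2:
        raise ValueError("scores must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Win/loss variance is an upper bound; draws only make it smaller.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


print(games_per_move(0.60, 0.55))  # on the order of 1,500 games per move
print(games_per_move(0.60, 0.40))  # far fewer for a large gap
```

That is why search-only selection makes sense for fresh leaf nodes: a node with a handful of games behind it simply has no usable statistics yet.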
Trying to build a learning book from scratch (or even modifying a standard book) using only long time control games is way beyond most people's available computing power. I.e., there is no reason to waste large amounts of computing time on long time control games for sub-par or losing moves when most of them can be proved losing at much shorter time controls. Only after most of these moves have been ferreted out at shorter time controls should you move to longer ones. Otherwise you waste huge amounts of computer time.
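One possible way to do that weeding out, sketched in Python. This is only an illustration: the names clearly_losing and survivors, the 30 game minimum, the 40% cutoff and the two standard error margin are all assumptions of mine that you would tune for your own setup. The idea is to drop a move only when even an optimistic estimate of its short time control score is bad, and to pass everything else on.

```python
import math
from typing import Dict, List, Tuple


def clearly_losing(points: float, games: int, min_games: int = 30,
                   cutoff: float = 0.40, z: float = 2.0) -> bool:
    """True if the move's score is below cutoff even after adding a margin.

    points is the total score (win = 1, draw = 0.5), games the game count.
    Moves with fewer than min_games are never condemned.
    """
    if games < min_games:
        return False
    score = points / games
    # Conservative (win/loss) standard error, scaled by z.
    margin = z * math.sqrt(score * (1 - score) / games)
    return score + margin < cutoff


def survivors(results: Dict[str, Tuple[float, int]]) -> List[str]:
    """Moves worth spending longer time control games on."""
    return [move for move, (points, games) in results.items()
            if not clearly_losing(points, games)]


# Hypothetical short time control results: move -> (points, games).
short_tc = {"move_a": (21.0, 40), "move_b": (6.0, 40), "move_c": (1.0, 5)}
print(survivors(short_tc))  # move_b is dropped, the other two stay
```

Note that move_c stays despite its poor score because five games are too few to judge it; it should get more short games before it gets any long ones.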
Does that make any sense to you?