Anybody tried Logistello's book learning for chess?

Daniel Shawul · Post by **Daniel Shawul** » Thu Jan 02, 2014 1:30 am

This is the SoftMax strategy for multi-armed bandit problems that chooses book moves according to Gibbs distribution. There are many other "probability-matching" variants for selection between the book move and new alternatives. But these algorithms become "fixed" because the heuristic eval() does not change no matter how many times you play it. Once the alternative book move is added, one should keep track of a WDL score for it, whose average you sort of use instead of eval, to guide selection. So the UCB formula is a better one. After the book moves are added , we should forget about the heuristic evals() and keep track of WDL instead.

Also how important is the original book move compared to new ones added via heuristic search/evaluation? I would guess the regret of not playing the original book move should be very high, and switching to other alternatives just after one loss maybe a bad idea.

Edit: Infact the idea of book learning seems to be exactly the same as the procedure for growing and expanding a UCT tree in MCTS. The only difference is that the book is on disk. So we use UCB to select based on the number of WDL at each node, and expand along most promising lines using, prune tree if we have a limit on book size etc...

mvk · Post by **mvk** » Thu Jan 02, 2014 6:11 pm

mvk wrote:
Rémi Coulom wrote:Thanks for your explanations. I am planning something similar. Did you measure the strength improvement you get from your book compared to other books?
Yes I did. I pitted this system against a Crafty, with its own book learning on, and the relative gain was in the 50-80 elo range by the time I removed Crafty from my testing pool.

Now that I've looked again at my data in more detail, I underestimated the learning effect above. Here is a plot with the actual results after 80k games with slow time control. It ran for a long time, about one year. These games were run on a 6 core processor (AMD 1100T), each core playing 1 game at a time, no pondering, 256MB hash, TC 20 min + 20 sec increment. If the program supported it, its book learning was ON (Crafty). The programs were given a wide varied polyglot book. Whenever a non-learning engine got too many bad results, for example if games started to repeat, I swapped its book. This was needed for Stockfish, but not for the others.

[ One disclaimer: The displayed "Elo per game" column doesn't mean too much, because learning was happening in parallel with the other games, and in parallel with dropout expansion and online games. ]

It would be nice to see similar data of the other approaches.

mvk · Post by **mvk** » Sat Jan 11, 2014 1:35 am

I wanted to share this visualisation, just for sharing's sake.

It is a plot of my book's mainlines. That is, all moves where the path error for one of the sides equals 0.0. Just a tiny fraction of the repertoire. The whole book contains 8M position. The server repertoire subset of that is 828k nodes. The tournament repertoire is 241k nodes. The mainlines as plotted here consists of 113k nodes. The lower half represents Black's main lines, the upper half are White's. The principle variation of the book is where these two halves overlap: a straight horizontal line in the middle, a quarter pawn above the horizontal axis where the path error for both sides is 0.0. In fact, in this snapshot there are two principle variations, diverging from move 7. With each iteration of the book update the principle variations change a bit, and as you can see they are typically a bit shorter than the deviating lines.

[pgn]
[Event "bookie"]
[Site "bookie.marcelk.net"]
[White "Rookie"]
[Black "Rookie"]
[Round "1"]
[Date "2014.01.09"]
[Result "*"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O b5 6. Bb3 Bb7 7. a4 Bc5 8. Nxe5
Nxe5 9. d4 Bxd4 10. Qxd4 d6 11. f3 c5 12. Qf2 c4 13. Ba2 *

[Event "bookie"]
[Site "bookie.marcelk.net"]
[White "Rookie"]
[Black "Rookie"]
[Round "1"]
[Date "2014.01.09"]
[Result "*"]

1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O b5 6. Bb3 Bb7 7. d4 Nxd4 8.
Nxd4 exd4 9. e5 Ne4 10. c3 Nc5 11. Qxd4 Nxb3 12. axb3 Qe7 13. c4 *
[/pgn]

Each node represents a deep exclusion search. White moves are in green, Black moves in red.

I think it is beautiful and ugly at the same time. A glimpse of the fabric of chess. Enjoy,

Anybody tried Logistello's book learning for chess?

Re: Anybody tried Logistello's book learning for chess?

Re: Anybody tried Logistello's book learning for chess?

Re: Anybody tried Logistello's book learning for chess?