Page 2 of 2

Re: Opening book read speed

Posted: Mon Aug 13, 2018 10:38 am
by xr_a_y
All very good advices ! thanks a lot.

Re: Opening book read speed

Posted: Mon Aug 13, 2018 5:01 pm
by Ras
I'm doing a similar approach, though with a much smaller book (22,000 moves, 12,000 positions). The source format is line based, then I put a self-written opening book compiler in-between, and I get a binary format sorted by ascending position CRC-32. On top of that, there are also 256 cache indices to avoid scanning through half the book on average.

I'm using a primary CRC-32, the ethernet CRC, and a secondary CRC-8 so that I effectively have a CRC-40. The number of positions has to be much smaller than the root of 2^40, i.e. 1 million, which is easily fulfilled. Otherwise, you get birthday paradoxon collisions.

Memory footprint: per position, 4 bytes for CRC-32, 1 byte for the number of moves. Then 1 byte for each move from and to, and 1.8 moves per position, that's 3.7 bytes. In total 8.7 bytes per position times 12,000 positions gives around a 100 kB.

I've run a benchmark where I used my opening book copied around a thousand times, which gave me a 600 MB line based text file. My old AMD Phenom 2 1100 T crunched that through in about a 100 seconds. Since there is one even super-linear step, the sorting with O(n * lg n), we can say that fictional linear scaling would be a lower bound for the factor. 100s / 600 MB * 19 MB = 3 seconds.

I'd say that you could gain at least a factor of three - or even more if you reference PC is faster than mine.

Since you have longer lines, I think you will have fewer moves per position, let's assume 1.5 moves. With my approach, that would top out at around 50 MB. However, CRC-40 will not work anymore because you'd have around 6 million positions, which would run into birthday collisions, so you'd have to go with two independent CRC-32. That will be additional 3 bytes per position (one byte, like the CRC-8, is fiddled into free bits), and you get to around 70 MB.

Re: Opening book read speed

Posted: Mon Aug 13, 2018 5:23 pm
by xr_a_y
Using Martin advices and some more code clean up I get 3.1 sec for reading my binary book, which is compatible with a xboard use...

I'll need to add support for classic stuff soon for both openings and end-games.

But curiously, Weini is loosing elo using this book, apparently the engine is going out of book in position it does not like much ...

Re: Opening book read speed

Posted: Mon Aug 13, 2018 10:17 pm
by Ras
Another idea, do you use some kind of caching? Often, a line based book contains the same say 10 starting plies in several consecutive lines, then different 11th moves. But all the work for the first 10 has already been done in the first of these lines. If you used a small hash table, you could skip a lot of duplicate work both for validation and recording a position/move pair. You could even use the memory for the regular hash tables for that through another base pointer type, just check for proper alignment.

Re: Opening book read speed

Posted: Mon Aug 13, 2018 10:51 pm
by hgm
xr_a_y wrote: Mon Aug 13, 2018 5:23 pmBut curiously, Weini is loosing elo using this book, apparently the engine is going out of book in position it does not like much ...
I had the same thing with Joker. If your engine is reasonably well tuned for opening play, you might benefit from bringing the opponent out of book early. E.g. some engines around 2000 Elo would never castle if the book doesn't tell them to do it.

Re: Opening book read speed

Posted: Tue Aug 14, 2018 10:22 am
by xr_a_y
Ras wrote: Mon Aug 13, 2018 10:17 pm Another idea, do you use some kind of caching? Often, a line based book contains the same say 10 starting plies in several consecutive lines, then different 11th moves. But all the work for the first 10 has already been done in the first of these lines. If you used a small hash table, you could skip a lot of duplicate work both for validation and recording a position/move pair. You could even use the memory for the regular hash tables for that through another base pointer type, just check for proper alignment.
I used to have a cache for move generator and threats detection but didn't test this for a while. I'll have a look.

Re: Opening book read speed

Posted: Tue Aug 14, 2018 10:24 am
by xr_a_y
hgm wrote: Mon Aug 13, 2018 10:51 pm I had the same thing with Joker. If your engine is reasonably well tuned for opening play, you might benefit from bringing the opponent out of book early. E.g. some engines around 2000 Elo would never castle if the book doesn't tell them to do it.
Indeed, Weini without a book plays reasonably well in the opening (it will occupy center thanks to PSQT, develop knight and bishop, and castle fast enough). I was hoping that using a bigger books will bring some elo ...

Re: Opening book read speed

Posted: Tue Aug 14, 2018 5:31 pm
by Joost Buijs
hgm wrote: Mon Aug 13, 2018 10:51 pm
xr_a_y wrote: Mon Aug 13, 2018 5:23 pmBut curiously, Weini is loosing elo using this book, apparently the engine is going out of book in position it does not like much ...
I had the same thing with Joker. If your engine is reasonably well tuned for opening play, you might benefit from bringing the opponent out of book early. E.g. some engines around 2000 Elo would never castle if the book doesn't tell them to do it.
This will only work when you play against weak opponents, most opening lines have been examined to death, if you want to bring the opponent out of book early it means that you have to play a move that has been proven to be inferior, when you play a inferior move against a strong opponent it is very likely that you will lose the game very quickly.

Most books are based on a huge number of games from strong engines (Stockfish, Komodo, Houdini to name a few and maybe some correspondence games), for instance my own book is based on ~5.5 mln. high quality games I collected over the years.
One of the problems is how to select the move you want to play in a certain position, if you want to add randomness you got to have several different moves available, if you always want to play the (best) move only one move will suffice. Usually playing the move with the highest frequency works fine, but that doesn't necessarily mean that this is the best move, you also have to look at the WLD rate and use some statistics to select (what you think) is the best move.
The drawback of using a large book like this is that the opponent often uses a book based on the very same games and that it happens often after playing 35 moves out of book that you end up in a dead drawn position that is not winnable anymore, even against (much) weaker opponents.