Is there a tool out there to generate an EPD file by sampling from a Polyglot book?
For example, I want to play up to 8 moves from the book, dump the resulting FEN into the EPD file, and so on. Of course you would specify the number of samples, so as to obtain an EPD with as many lines as desired.
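The sampling described above could be sketched roughly as follows. This is only a sketch under stated assumptions: `book_moves` and `play` are hypothetical callbacks (a real version would look up the Polyglot key of the position and make the move with a chess library, then print the FEN), and the toy book below uses plain move strings as "positions".

```python
import random

def sample_positions(start, book_moves, play, count, max_plies=16,
                     rng=None, max_walks=10000):
    """Collect up to `count` distinct positions by random walks through a book.

    book_moves(pos) -> list of (move, weight)   (hypothetical callback)
    play(pos, move) -> new position              (hypothetical callback)
    max_plies=16 corresponds to "up to 8 moves" per side.
    """
    rng = rng or random.Random()
    found = {}  # dict used as an insertion-ordered set
    for _ in range(max_walks):
        if len(found) >= count:
            break
        pos = start
        for _ in range(max_plies):
            choices = [(m, w) for m, w in book_moves(pos) if w > 0]
            if not choices:
                break  # out of book
            moves, weights = zip(*choices)
            # Pick a book move with probability proportional to its weight.
            pos = play(pos, rng.choices(moves, weights=weights)[0])
        if pos != start:
            found[pos] = None
    return list(found)

# Toy book for illustration only: positions are space-separated move lists.
TOY_BOOK = {
    "": [("e4", 2), ("d4", 1)],
    "e4": [("e5", 1)],
}

def toy_book_moves(pos):
    return TOY_BOOK.get(pos, [])

def toy_play(pos, move):
    return (pos + " " + move).strip()

samples = sample_positions("", toy_book_moves, toy_play, count=2,
                           rng=random.Random(0))
```

Duplicates are dropped, so a small book may yield fewer positions than requested; `max_walks` only guards against looping forever in that case.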
This will be useful for my CLI tournament program, which will use EPD only for the moment.
Another advantage is that you can actually see what these positions are and run an automated sanity check, such as analyzing each one for 1 sec with a top-level engine to make sure the score is close enough to zero, so that the position isn't biased for white or black.
As we all know, the best test suite is a given set of positions that does not change. Selecting randomly from a book increases the variance of the estimator for two reasons: 1/ it is another source of randomness whose effect on the result cannot, by definition, be accounted for by any tool (bayeselo or other); 2/ crappy book lines get introduced by automated book generation methods without thoroughly testing all positions (which is hardly possible when there are millions).
sampling a polyglot book ?
-
lucasart
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
sampling a polyglot book ?
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
Michel
- Posts: 2271
- Joined: Mon Sep 29, 2008 1:50 am
Re: sampling a polyglot book ?
The polyglot book entries are hash codes. So they cannot immediately be converted to a FEN.
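To illustrate the point about hash codes: as far as I know the published book format, each entry is a 16-byte big-endian record holding a 64-bit position key, a 16-bit move, a 16-bit weight and a 32-bit learn field. A minimal decoding sketch (the function names are made up for this example) shows that an entry gives you a key and a move, but no board:

```python
import struct

# key (uint64), move (uint16), weight (uint16), learn (uint32), big-endian
ENTRY = struct.Struct(">QHHI")

def decode_move(move):
    """Turn the 16-bit move field into coordinate notation such as 'e2e4'.

    Castling is stored as "king takes own rook" (e.g. e1h1), so a real
    reader has to translate that case separately.
    """
    files = "abcdefgh"
    to_file = move & 7
    to_row = (move >> 3) & 7
    from_file = (move >> 6) & 7
    from_row = (move >> 9) & 7
    promo = (move >> 12) & 7  # 0 none, 1 knight, 2 bishop, 3 rook, 4 queen
    text = files[from_file] + str(from_row + 1) + files[to_file] + str(to_row + 1)
    if 1 <= promo <= 4:
        text += "nbrq"[promo - 1]
    return text

def read_entries(data):
    """Yield (key, move_text, weight) for each complete 16-byte record."""
    usable = len(data) - len(data) % ENTRY.size
    for offset in range(0, usable, ENTRY.size):
        key, move, weight, _learn = ENTRY.unpack_from(data, offset)
        yield key, decode_move(move), weight
```

To get from an entry back to a position you have to start from a known position, compute its key, and follow the moves, which is what the dump utility does.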
However polyglot includes a utility which can dump the lines from the book where a line for white is defined as
<white book><black arbitrary><white book><black arbitrary>....
and a line for black is defined as
<white arbitrary><black book><white arbitrary><black book><white arbitrary>....
Code:
polyglot dump-book
PolyGlot supports the following options
-bin (default: book.bin)
Input file in PolyGlot book format.
-color
The color for whom to generate the lines.
-out (default: book_<color>.txt)
The name of the output file.
Here are the first few lines of performance.bin for white
Code:
Dump of "/home/vdbergh/SRC/CHESS/Toga142JD_linux_version/performance.bin" for white.
1: 1. e4{33%} a6 2. d4{100%} b5 3. Nf3{64%} e6 4. Bd3{100%} Bb7 5. O-O{75%} c5 6. c3{100%} Nf6 7. Re1{100%}
2: 1. e4{33%} a6 2. d4{100%} b5 3. Nf3{64%} e6 4. Bd3{100%} Bb7 5. Qe2{25%}
3: 1. e4{33%} a6 2. d4{100%} b5 3. Nf3{64%} Bb7 4. Bd3{100%} e6 {trans: line=1, ply=8}
4: 1. e4{33%} a6 2. d4{100%} b5 3. Bd3{36%} Bb7 4. Nf3{100%} {trans: line=3, ply=7}
5: 1. e4{33%} a6 2. d4{100%} e6 3. Nf3{56%} b5 {trans: line=1, ply=6}
6: 1. e4{33%} a6 2. d4{100%} e6 3. Nf3{56%} c5 4. c3{100%} d5 5. e5{100%} Bd7 6. Bd3{100%} cxd4 7. Nxd4{100%} Nc6 8. Nxc6{100%}
7: 1. e4{33%} a6 2. d4{100%} e6 3. Bd3{44%}
8: 1. e4{33%} b6 2. d4{100%} e6 3. Nf3{57%} Bb7 4. Bd3{100%} c5 5. c3{87%} Nf6 6. Qe2{61%} cxd4 7. cxd4{100%}
9: 1. e4{33%} b6 2. d4{100%} e6 3. Nf3{57%} Bb7 4. Bd3{100%} c5 5. c3{87%} Nf6 6. Qe2{61%} d5 7. e5{100%}
-
lucasart
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: sampling a polyglot book ?
Thanks. So there's no tool that directly does it then.
On second thought, I'll probably add Polyglot book support to my CLI. I googled around and it seems hard to find books in EPD format, so I'll have to do my own programmatically, so I might as well do full Polyglot support.
I had a look at the code from Stockfish (book.h and book.cpp), and it seems that adapting it to my program would be easy (just use my own Board class instead of SF's Position class, and I can almost copy/paste the rest). As my program is GPL, there would be no licensing issues in doing that. And of course, I'll add a big thanks to Marco Costalba in the credits section.
-
Evert
- Posts: 2929
- Joined: Sat Jan 22, 2011 12:42 am
- Location: NL
Re: sampling a polyglot book ?
I think I based Jazz' polybook code on the code in polyglot itself. It's not very hard; my main annoyance with it is the second set of hash codes that are needed. There's an easy solution for that, of course, but I'm irrationally attached to my own hash codes.
It's actually not very hard to create a set of starting positions from a polyglot book: just walk each variation encountered in the opening book and spit out the FEN at the leaf node. You still need the code to read the book, obviously.
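The walk described above might look something like this. It is a sketch, not code from Jazz or PolyGlot: `book_moves` and `play` are hypothetical callbacks standing in for the book lookup and the move maker, and a real version would emit the FEN of each leaf. Positions already visited are skipped so that transpositions are reported only once.

```python
def leaf_positions(start, book_moves, play):
    """Return every position reachable from `start` where the book runs out.

    book_moves(pos) -> list of (move, weight)   (hypothetical callback)
    play(pos, move) -> new position              (hypothetical callback)
    Positions must be hashable so they can be stored in a set.
    """
    leaves = []
    seen = {start}
    stack = [start]
    while stack:
        pos = stack.pop()
        moves = book_moves(pos)
        if not moves:
            if pos != start:
                leaves.append(pos)  # out of book: this is a leaf
            continue
        for move, _weight in moves:
            nxt = play(pos, move)
            if nxt not in seen:  # skip transpositions already visited
                seen.add(nxt)
                stack.append(nxt)
    return leaves

# Toy example: a "position" is the sorted tuple of moves played, so that
# a-then-b and b-then-a transpose into the same position.
TOY_BOOK = {
    (): [("a", 1), ("b", 1)],
    ("a",): [("b", 1)],
    ("b",): [("a", 1)],
}

def toy_book_moves(pos):
    return TOY_BOOK.get(pos, [])

def toy_play(pos, move):
    return tuple(sorted(pos + (move,)))

leaves = leaf_positions((), toy_book_moves, toy_play)
```

Note that this only reaches positions connected to the start position by book moves for both sides.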
I still prefer starting from an EPD set though. I just wish I had some endgame specific ones, for testing end-game evaluation. That's obviously a lot harder than opening positions because if the positions are balanced, they're probably drawn...
-
hgm
- Posts: 27701
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: sampling a polyglot book ?
You could use XBoard for this. Just install a program that does nothing as a WB v1 engine (it doesn't matter what program, as long as it is not an engine), set a TC of 1 sec (0:01 min), let it use the GUI book, set the option to save the final position of a game to a file (in Options -> Save), and run a match with as many games as you need.
That should give you your sampling of the terminal book positions, as the first 'engine' to be out of book will forfeit on time after 1 sec.
-
Michel
- Posts: 2271
- Joined: Mon Sep 29, 2008 1:50 am
Re: sampling a polyglot book ?
In a typical polyglot book the majority of positions is not directly connected to the root (for example in a single color book _no_ position is connected to the root). So if you only follow the lines completely in the book you miss a lot of positions.
-
hgm
- Posts: 27701
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: sampling a polyglot book ?
True. But in that case, what would be meant by a representative sample from the book?
Of course the method I sketched can also be used for one-color books, if you let the side for which the book contains no moves be played by a real engine. The opponent, not being an engine, then still forfeits as soon as he gets out of book. (To speed things up you could write an engine that resigns immediately in response to 'go'.)
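An engine that resigns immediately could be as small as the sketch below. It assumes the usual XBoard protocol commands (`protover`, `new`, `force`, `go`, `usermove`, `quit`); resigning on an incoming move as well as on 'go' is an addition of this sketch, and the engine name "resigner" is made up.

```python
import sys

def respond(line, state):
    """Return the replies to one XBoard command line.

    `state` is a dict with one boolean key, "force"; it is a helper for
    this sketch, not part of the protocol.
    """
    words = line.split()
    if not words:
        return []
    cmd = words[0]
    if cmd == "protover":
        # Ask for moves to arrive as "usermove <move>" so they are easy to spot.
        return ['feature myname="resigner" usermove=1 done=1']
    if cmd == "new":
        state["force"] = False  # after "new" the engine is set to play Black
        return []
    if cmd == "force":
        state["force"] = True
        return []
    if cmd == "go":
        state["force"] = False
        return ["resign"]  # asked to move: give up at once
    if cmd == "usermove" and not state["force"]:
        return ["resign"]  # opponent moved and it is our turn
    return []

def main():
    """Read commands from stdin until "quit" and print the replies."""
    state = {"force": True}
    for line in sys.stdin:
        if line.strip() == "quit":
            break
        for reply in respond(line, state):
            print(reply, flush=True)

# To run it under XBoard, call main() at the bottom of the script.
```

As long as the GUI book supplies the moves, the program is never asked to think, so it only resigns once the book runs out.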
-
Michel
- Posts: 2271
- Joined: Mon Sep 29, 2008 1:50 am
Re: sampling a polyglot book ?
hgm wrote: But in that case, what would be meant by a representative sample from the book?
What I defined as a "line" in my post above (bookmoves for one player, arbitrary moves for the other player, end=bookmove) is the right thing I think for books which are meant to be used as repertoires (like performance.bin). Using a book as repertoire means you are not assuming the opponent is using the same book (which will be the case if you are playing on a server for example).
Of course for a tournament book the criteria are different. Perhaps repertoire books should not be used in tournaments since they could conceivably skew the results (except in book tournaments of course).
-
hgm
- Posts: 27701
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: sampling a polyglot book ?
OK, this is one definition, but probably not what the OP had in mind. If you want to make a sampling for the purpose of using it as start positions for a tourney, you don't want the side that has no book moves to play as a random mover. He must make reasonable moves, or most positions you get would already be decided in favor of the book side.
Best would actually be to use an engine, or a group of engines, that have their own book, and let the dummy play with the Polyglot book as GUI book.
-
Michel
- Posts: 2271
- Joined: Mon Sep 29, 2008 1:50 am
Re: sampling a polyglot book ?
hgm wrote: you don't want the side that has no book moves to play as a random mover
Not as a random mover since he is only allowed to move to positions in the book (if there is no such position the line ends right there). This is how human opening repertoires are written.
But in retrospect, while I think this is the right definition for a "line", I agree with you that this is not a good way of sampling positions from the book, since the opponent would still be allowed to make fairly bad moves.