Opening book database?

Steelman

Opening book database?

Post by Steelman »

I am working on an opening book. I wanted it to be simple and easy to test and debug. It needed to handle transpositions yet still be in move-list form. And I want to write the opening book myself. I figure that, done correctly, it lets me avoid openings my engine has difficulty with and steer the game into more open positions that suit my engine better.

So I took the "Schematic Overview of the Openings" from Reuben Fine's "Practical Chess Openings" and created a simple file in the form of a move list.

Then I wrote a program that reformats the information into a readable output file. So now I can store the board positions.

The output file is as follows:

1,0 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq 0 root 1,1 26,1 39,1 40,1 41,1 42,1 43,1 44,1
1,1 rnbq1bnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq 40 e2-e4 1,2 20,2 21,2 22,2 23,2 24,2 25,2
1,2 rnbq1bnr/pppp1ppp/8/4p3/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 50 e7-e5 1,3 16,3 17,3 18,3 19,3
1,3 rnbq1bnr/pppp1ppp/8/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq 70 g1-f3 1,4 13,4 14,4 15,4
1,4 r1bq1bnr/pppp1ppp/2n5/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq 100 b8-c6 1,5 8,5 9,5 10,5 11,5
1,5 r1bq1bnr/pppp1ppp/2n5/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R b KQkq 50 f1-c4 1,6 6,6 7,6
1,6 r1bq2nr/pppp1ppp/2n5/2b1p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq 50 f8-c5 1,7 2,7 3,7 4,7 5,7
1,7 r1bq2nr/pppp1ppp/2n5/2b1p3/1PB1P3/5N2/P1PP1PPP/RNBQK2R b KQkq 0 b2-b4
2,7 r1bq2nr/pppp1ppp/2n5/2b1p3/2B1P3/2P2N2/PP1P1PPP/RNBQK2R b KQkq 50 c2-c3
3,7 r1bq2nr/pppp1ppp/2n5/2b1p3/2B1P3/5N2/PPPP1PPP/RNBQ1RK1 b KQkq 0 e1-g1
4,7 r1bq2nr/pppp1ppp/2n5/2b1p3/2B1P3/3P1N2/PPP2PPP/RNBQK2R b KQkq 50 d2-d3
5,7 r1bq2nr/pppp1ppp/2n5/2b1p3/2B1P3/2N2N2/PPPP1PPP/R1BQK2R b KQkq 0 b1-c3
6,6 r1bq1b1r/pppp1ppp/2n2n2/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq 50 g8-f6
7,6 r1bq2nr/ppppbppp/2n5/4p3/2B1P3/5N2/PPPP1PPP/RNBQK2R w KQkq 0 f8-e7
8,5 r1bq1bnr/pppp1ppp/2n5/1B2p3/4P3/5N2/PPPP1PPP/RNBQK2R b KQkq 40 f1-b5
9,5 r1bq1bnr/pppp1ppp/2n5/4p3/3PP3/5N2/PPP2PPP/RNBQKB1R b KQkq 0 d2-d4
10,5 r1bq1bnr/pppp1ppp/2n5/4p3/4P3/2P2N2/PP1P1PPP/RNBQKB1R b KQkq 10 c2-c3
11,5 r1bq1bnr/pppp1ppp/2n5/4p3/4P3/2N2N2/PPPP1PPP/R1BQKB1R b KQkq 0 b1-c3 11,6 12,6
11,6 r1bq1b1r/pppp1ppp/2n2n2/4p3/4P3/2N2N2/PPPP1PPP/R1BQKB1R w KQkq 50 g8-f6
12,6 r1bq2nr/pppp1ppp/2n5/4p3/1b2P3/2N2N2/PPPP1PPP/R1BQKB1R w KQkq 50 f8-b4
13,4 rnbq1b1r/pppp1ppp/5n2/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq 0 g8-f6
14,4 rnbq1bnr/ppp2ppp/3p4/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq 0 d7-d6
15,4 rnbq1bnr/pppp2pp/5p2/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R w KQkq 0 f7-f6
16,3 rnbq1bnr/pppp1ppp/8/4p3/4P3/2N5/PPPP1PPP/R1BQKBNR b KQkq 0 b1-c3
17,3 rnbq1bnr/pppp1ppp/8/4p3/4PP2/8/PPPP2PP/RNBQKBNR b KQkq 0 f2-f4
18,3 rnbq1bnr/pppp1ppp/8/4p3/3PP3/8/PPP2PPP/RNBQKBNR b KQkq 0 d2-d4
19,3 rnbq1bnr/pppp1ppp/8/4p3/2B1P3/8/PPPP1PPP/RNBQK1NR b KQkq 30 f1-c4
20,2 rnbq1bnr/pppp1ppp/4p3/8/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 5 e7-e6
21,2 rnbq1bnr/pp1ppppp/2p5/8/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 0 c7-c6
22,2 rnbq1bnr/pp1ppppp/8/2p5/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 45 c7-c5
23,2 rnbq1b1r/pppppppp/5n2/8/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 0 g8-f6
24,2 r1bq1bnr/pppppppp/2n5/8/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 0 b8-c6
25,2 rnbq1bnr/ppp1pppp/8/3p4/4P3/8/PPPP1PPP/RNBQKBNR w KQkq 0 d7-d5
26,1 rnbq1bnr/pppppppp/8/8/3PP3/8/PPP2PPP/RNBQKBNR b KQkq 40 d2-d4 26,2 36,2 37,2 38,2
26,2 rnbq1bnr/ppp1pppp/8/3p4/3PP3/8/PPP2PPP/RNBQKBNR w KQkq 50 d7-d5 26,3 32,3 33,3 34,3 35,3
26,3 rnbq1bnr/ppp1pppp/8/3p4/2PPP3/8/PP3PPP/RNBQKBNR b KQkq 50 c2-c4 26,4 27,4 28,4 29,4 30,4 31,4
26,4 rnbq1bnr/ppp1pppp/2p5/8/2PPP3/8/PP3PPP/RNBQKBNR w KQkq 5 d5xc6
27,4 rnbq1bnr/ppp2ppp/4p3/3p4/2PPP3/8/PP3PPP/RNBQKBNR w KQkq 95 e7-e6
28,4 rnbq1bnr/pp2pppp/2p5/3p4/2PPP3/8/PP3PPP/RNBQKBNR w KQkq 0 c7-c6
29,4 rnbq1bnr/pp2pppp/8/2pp4/2PPP3/8/PP3PPP/RNBQKBNR w KQkq 0 c7-c5
30,4 rnbq1bnr/ppp2ppp/8/3pp3/2PPP3/8/PP3PPP/RNBQKBNR w KQkq 0 e7-e5
31,4 r1bq1bnr/ppp1pppp/2n5/3p4/2PPP3/8/PP3PPP/RNBQKBNR w KQkq 0 b8-c6
32,3 rnbq1bnr/ppp1pppp/8/3p4/3PP3/5N2/PPP2PPP/RNBQKB1R b KQkq 0 g1-f3
33,3 rnbq1bnr/ppp1pppp/8/3p4/3PP3/8/PPP2PPP/RNBQKBNR b KQkq 0 e2-e4
34,3 rnbq1bnr/ppp1pppp/8/3p4/3P4/2N5/PPP1PPPP/R1BQKBNR b KQkq 0 b1-c3
35,3 rnbq1bnr/ppp1pppp/8/3p4/3P4/4P3/PPP2PPP/RNBQKBNR b KQkq 0 e2-e3
36,2 rnbq1b1r/pppppppp/5n2/8/3P4/8/PPP1PPPP/RNBQKBNR w KQkq 50 g8-f6
37,2 rnbq1bnr/pp1ppppp/8/2p5/3P4/8/PPP1PPPP/RNBQKBNR w KQkq 0 c7-c5
38,2 rnbq1bnr/ppppp1pp/8/5p2/3P4/8/PPP1PPPP/RNBQKBNR w KQkq 0 f7-f5
39,1 rnbq1bnr/pppppppp/8/8/2PP4/8/PP2PPPP/RNBQKBNR b KQkq 20 c2-c4
40,1 rnbq1bnr/pppppppp/8/8/3P4/5N2/PPP1PPPP/RNBQKB1R b KQkq 0 g1-f3
41,1 rnbq1bnr/pppppppp/8/8/3P1P2/8/PPP1P1PP/RNBQKBNR b KQkq 0 f2-f4
42,1 rnbq1bnr/pppppppp/8/8/3P4/2N5/PPP1PPPP/R1BQKBNR b KQkq 0 b1-c3
43,1 rnbq1bnr/pppppppp/8/8/3P4/4P3/PPP2PPP/RNBQKBNR b KQkq 0 e2-e3
44,1 rnbq1bnr/pppppppp/8/8/3P4/2P5/PP2PPPP/RNBQKBNR b KQkq 0 c2-c3

The first row is the starting position (root).
Each row has the format:
row,col "fen code" percent move "move selection"

row,col = where the move was in the input file
fen code = the FEN of the board position after the move
percent = the percentage of the time the move should be played
move = the move choice
move selection = the locations (row,col) of the possible follow-up moves
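
A minimal sketch of how rows in this format could be read back and used to pick a reply, assuming whitespace-separated fields in exactly the order described above. The names BookRow, load_book and pick_reply are made up for illustration and are not from the original program; the sample rows are copied from the listing.

```python
import random
from dataclasses import dataclass, field

@dataclass
class BookRow:
    loc: tuple        # (row, col) position in the input move list
    fen: str          # placement, side to move, castling rights
    percent: int      # how often this move should be chosen
    move: str         # e.g. "e2-e4", or "root" for the first row
    children: list = field(default_factory=list)  # (row, col) of follow-ups

def parse_loc(text):
    row, col = text.split(",")
    return int(row), int(col)

def parse_row(line):
    parts = line.split()
    # Layout: loc, three FEN fields, percent, move, then zero or more child locs.
    return BookRow(
        loc=parse_loc(parts[0]),
        fen=" ".join(parts[1:4]),
        percent=int(parts[4]),
        move=parts[5],
        children=[parse_loc(p) for p in parts[6:]],
    )

def load_book(lines):
    rows = (parse_row(line) for line in lines if line.strip())
    return {row.loc: row for row in rows}

def pick_reply(book, loc, rng=random):
    """Pick one follow-up of `loc`, weighted by percent; None if out of book."""
    kids = [book[c] for c in book[loc].children if c in book]
    kids = [k for k in kids if k.percent > 0]   # percent 0 means "never play"
    if not kids:
        return None
    return rng.choices(kids, weights=[k.percent for k in kids], k=1)[0]

SAMPLE = """\
1,0 rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq 0 root 1,1 26,1 39,1 40,1 41,1 42,1 43,1 44,1
1,1 rnbq1bnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq 40 e2-e4 1,2 20,2 21,2 22,2 23,2 24,2 25,2
26,1 rnbq1bnr/pppppppp/8/8/3PP3/8/PPP2PPP/RNBQKBNR b KQkq 40 d2-d4 26,2 36,2 37,2 38,2
39,1 rnbq1bnr/pppppppp/8/8/2PP4/8/PP2PPPP/RNBQKBNR b KQkq 20 c2-c4
"""

book = load_book(SAMPLE.splitlines())
reply = pick_reply(book, (1, 0))
print(reply.move if reply else "out of book")
```

Follow-ups with percent 0 stay in the file but are never selected, which seems to match the intent of listing a move without wanting the engine to play it.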

Now I can add more opening information and rebuild the output file.
Any comments or thoughts?
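
Since the stated goal is a book that is easy to test and debug, a sanity pass over the rebuilt file might be worth having. The sketch below is only one possible check (the name check_placement is hypothetical): it verifies that the placement field has eight ranks of eight squares and exactly one king per side. Applied to the listing above, it accepts the root row but reports the placement in row 1,1, whose eighth rank reads rnbq1bnr and so has no black king.

```python
def check_placement(fen):
    """Return a list of problems found in the piece-placement field of `fen`."""
    placement = fen.split()[0]
    problems = []
    ranks = placement.split("/")
    if len(ranks) != 8:
        problems.append(f"expected 8 ranks, found {len(ranks)}")
    for i, rank in enumerate(ranks):
        # A digit counts that many empty squares; any other character is a piece.
        width = sum(int(ch) if ch.isdigit() else 1 for ch in rank)
        if width != 8:
            problems.append(f"rank {8 - i} covers {width} squares")
    for king, name in (("K", "white"), ("k", "black")):
        count = placement.count(king)
        if count != 1:
            problems.append(f"{count} {name} kings")
    return problems

root = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq"
row_1_1 = "rnbq1bnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq"
print(check_placement(root))
print(check_placement(row_1_1))
```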
Harald Johnsen

Re: Opening book database?

Post by Harald Johnsen »

1) This is a common book format. Note that you can (and should) convert it to a binary format; if you then use the hash key that matches the FEN, you will have exactly the same layout as the polyglot book format. Your entries should also be sorted by position (FEN or hash key) to allow direct access.

2) If you use a chess book to build your opening book, then you must be sure that the source allows redistribution of derived work (for example old public domain books or PGN game files).
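
To make point 1 concrete, here is a rough sketch of packing text rows into fixed-size binary records sorted by key. It assumes the record layout commonly described for Polyglot books (64-bit key, 16-bit move, 16-bit weight, 32-bit learn, big-endian). The key function is deliberately a placeholder and NOT the Polyglot Zobrist key, so the result is not a real Polyglot book; all function names are invented for the example. Note also that a Polyglot-style entry is keyed by the position in which the move is played, whereas the text rows store the position after the move, so a converter would pair each row with its parent's FEN.

```python
import hashlib
import struct

ENTRY = struct.Struct(">QHHI")   # key, move, weight, learn (big-endian)

def placeholder_key(fen):
    # NOT the Polyglot Zobrist key: just a stable 64-bit stand-in for this sketch.
    digest = hashlib.blake2b(fen.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")

def encode_move(move):
    # "e2-e4" or "d5xc6" -> from/to squares packed into the low 12 bits.
    def square(text):
        return ord(text[0]) - ord("a"), int(text[1]) - 1   # (file, rank), 0-based
    (from_file, from_rank) = square(move[:2])
    (to_file, to_rank) = square(move[3:5])
    return (from_rank << 9) | (from_file << 6) | (to_rank << 3) | to_file

def build_book(records):
    """records: iterable of (fen_before_move, move, percent) tuples."""
    packed = [
        ENTRY.pack(placeholder_key(fen), encode_move(move), percent, 0)
        for fen, move, percent in records
    ]
    # The big-endian key is the first field, so sorting the raw bytes
    # orders the records by key (and by move within one key).
    packed.sort()
    return b"".join(packed)

START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq"
data = build_book([(START, "e2-e4", 40), (START, "d2-d4", 40), (START, "c2-c4", 20)])
print(len(data), "bytes")
```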

HJ.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Opening book database?

Post by Michel »

Harald Johnsen wrote: polyglot book format.
I would strongly recommend the polyglot book format.

I think it is important that GUIs, adapters and chess engines agree on a common open opening book format (rather than proprietary stuff like ctg or abk).

The polyglot format has the advantage that it is easy to parse (see the polyglot source code) and already understood by a number of programs (polyglot, Fruit, Toga, Glaurung, possibly others).

If your engine is GPL you can even take the parse code directly from polyglot.

Regards,
Michel
jswaff

Re: Opening book database?

Post by jswaff »

My only comment is to make sure you can query your database without having to do a sequential search through the whole thing, or at least not through the entire file on disk (a sequential scan in RAM might be OK). As your book grows larger, the query time will become noticeable.

If you can arrange a binary search you'll be in good shape. IIRC, the Crafty approach is to break the book into many segments (I think 2^16 segments), and it would sequentially loop through just one of those segments, which is manageable.
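
A sketch of the segment idea, under loud assumptions: the entries carry a 64-bit integer key, the segment is chosen from the top 16 bits of that key, and the figure of 2^16 segments is only the recollection above, not something checked against Crafty's source. The function names are hypothetical.

```python
from collections import defaultdict

SEGMENT_BITS = 16   # assumption taken from the recollection of 2^16 segments

def segment_of(key):
    # Use the top bits of a 64-bit key as the segment number.
    return key >> (64 - SEGMENT_BITS)

def build_segments(entries):
    """entries: iterable of (key, move, weight) with 64-bit integer keys."""
    segments = defaultdict(list)
    for entry in entries:
        segments[segment_of(entry[0])].append(entry)
    return segments

def lookup(segments, key):
    # Still a sequential scan, but only inside the one segment the key maps to.
    return [e for e in segments.get(segment_of(key), []) if e[0] == key]

entries = [
    (0x0001000000000000, "e2-e4", 40),
    (0x0001000000000001, "d2-d4", 40),
    (0xFFFF000000000000, "c2-c4", 20),
]
segments = build_segments(entries)
print(lookup(segments, 0x0001000000000001))
```

On disk the same idea would mean storing each segment contiguously with a small table of segment offsets, so one seek and one short read answer a query.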

--
James

Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Opening book database?

Post by Michel »

jswaff wrote: IIRC, the Crafty approach is to break the book into many segments (I think 2^16 segments), and it would sequentially loop through just one of those segments, which is manageable.
In a polyglot book the positions are ordered according to their (Zobrist) hash key. To find a particular key you can do a binary search. This goes quite fast. For example: the opening book "performance.bin" by Marc Lacrosse has 92954 entries, so a binary search takes at most 17 probes (2^17 = 131072 > 92954).
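
The probe count can be checked with a small sketch. It assumes 16-byte big-endian records (64-bit key, 16-bit move, 16-bit weight, 32-bit learn) sorted by key and held in a bytes object; the book built here is synthetic, with keys 0..92953, and only mimics the size of the book mentioned above. The function name find_entries is made up.

```python
import math
import struct

ENTRY = struct.Struct(">QHHI")   # key, move, weight, learn (big-endian)

def find_entries(book, key):
    """Return (entries, probes) for `key` in `book`, a bytes object of sorted records."""
    count = len(book) // ENTRY.size
    lo, hi, probes = 0, count, 0
    while lo < hi:                       # find the first record with key >= target
        mid = (lo + hi) // 2
        probes += 1
        if ENTRY.unpack_from(book, mid * ENTRY.size)[0] < key:
            lo = mid + 1
        else:
            hi = mid
    found = []
    while lo < count:                    # collect every record sharing the key
        entry = ENTRY.unpack_from(book, lo * ENTRY.size)
        if entry[0] != key:
            break
        found.append(entry)
        lo += 1
    return found, probes

# Synthetic stand-in: 92954 records with keys 0..92953 (not a real book).
book = b"".join(ENTRY.pack(k, 0, 1, 0) for k in range(92954))
found, probes = find_entries(book, 12345)
print(found, probes, math.ceil(math.log2(92954)))
```

Searching for the first record with a key not smaller than the target, rather than for any match, matters because one position usually has several book moves stored under the same key.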
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Opening book database?

Post by bob »

Michel wrote:
polyglot book format.
I would strongly recommend the polyglot book format.

I think it is important that GUIs, adapters and chess engines agree on a common open opening book format (rather than proprietary stuff like ctg or abk).

The polyglot format has the advantage that it is easy to parse (see the polyglot source code) and already understood by a number of programs (polyglot, Fruit, Toga, Glaurung, possibly others).

If your engine is GPL you can even take the parse code directly from polyglot.

Regards,
Michel
It also has the characteristic that important features are missing. Places to store information such as learning, as well as information about frequency of play, win/lose/draw results, etc.

A "global format" would be nice, but it needs to cover _all_ bases or else it is worthless...
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Opening book database?

Post by Michel »

bob wrote: Places to store information such as learning, as well as information about frequency of play, win/lose/draw results, etc.
Well it has a scoring system and facilities for learning (which I have not looked at).

The advantage of the polyglot format is that it exists(!), it is open(!), it is trivial to parse, it is understood by a number of programs (including the ubiquitous polyglot) and used by a number of opening books.
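
As an illustration of "trivial to parse", here is a sketch of reading entries, assuming the layout commonly described for Polyglot books: 16-byte big-endian records holding a 64-bit key, a 16-bit move, a 16-bit weight (the score) and a 32-bit learn field, with the move packed as to-file, to-rank, from-file, from-rank and promotion piece in 3-bit groups. Special cases such as castling are left out, and the function names are hypothetical; the polyglot source remains the authority.

```python
import struct

ENTRY = struct.Struct(">QHHI")   # key, move, weight, learn (big-endian)
FILES = "abcdefgh"

def decode_move(move):
    to_file, to_rank = move & 7, (move >> 3) & 7
    from_file, from_rank = (move >> 6) & 7, (move >> 9) & 7
    promo = (move >> 12) & 7     # 0 = none, 1..4 = knight, bishop, rook, queen
    text = FILES[from_file] + str(from_rank + 1) + FILES[to_file] + str(to_rank + 1)
    if 1 <= promo <= 4:
        text += "nbrq"[promo - 1]
    return text

def read_entries(data):
    """Yield (key, move_text, weight, learn) for each complete record in `data`."""
    for offset in range(0, len(data) - ENTRY.size + 1, ENTRY.size):
        key, move, weight, learn = ENTRY.unpack_from(data, offset)
        yield key, decode_move(move), weight, learn

# One hand-made record: key 1 is arbitrary, 796 encodes e2 to e4, weight 40.
sample = ENTRY.pack(1, 796, 40, 0)
print(list(read_entries(sample)))
```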
ilari
Posts: 750
Joined: Mon Mar 27, 2006 7:45 pm
Location: Finland

Re: Opening book database?

Post by ilari »

Michel wrote: I would strongly recommend the polyglot book format.
Do you know where I can find the documentation (specification) for the Polyglot book format? If we're going to have a standard, then it's important that it's well documented.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Opening book database?

Post by Michel »

ilari wrote: Do you know where I can find the documentation (specification) for the Polyglot book format?
It can be obtained from the polyglot source code. However I'll put a description of the file format on my website. The main thing is the table with random numbers to compute the Zobrist hash key.
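
For readers unfamiliar with the technique, a generic Zobrist hashing sketch follows. It uses a locally generated random table, not the Polyglot table, so its keys will not match any Polyglot book; it also ignores en passant, which the FEN strings in this thread do not carry. The seed and all names are arbitrary choices for the example.

```python
import random

PIECES = "PNBRQKpnbrqk"
_rng = random.Random(2008)   # arbitrary seed; a Polyglot reader needs Polyglot's own table
PIECE_KEYS = {(p, sq): _rng.getrandbits(64) for p in PIECES for sq in range(64)}
CASTLE_KEYS = {c: _rng.getrandbits(64) for c in "KQkq"}
WHITE_TO_MOVE = _rng.getrandbits(64)

def zobrist(fen):
    """XOR together one random number per feature of the position."""
    placement, side, castling = fen.split()[:3]
    key = 0
    for rank_index, rank in enumerate(placement.split("/")):
        file_index = 0
        for ch in rank:
            if ch.isdigit():
                file_index += int(ch)          # skip empty squares
            else:
                square = (7 - rank_index) * 8 + file_index   # a1 = 0, h8 = 63
                key ^= PIECE_KEYS[(ch, square)]
                file_index += 1
    for c in castling:
        if c in CASTLE_KEYS:
            key ^= CASTLE_KEYS[c]
    if side == "w":
        key ^= WHITE_TO_MOVE
    return key

before = zobrist("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq")
after = zobrist("rnbqkbnr/pppppppp/8/8/8/5N2/PPPPPPPP/RNBQKB1R b KQkq")
# The key after g1-f3 differs only by the knight's two squares and the side to move.
print(after == before ^ PIECE_KEYS[("N", 6)] ^ PIECE_KEYS[("N", 21)] ^ WHITE_TO_MOVE)
```

Because the key depends only on the position and not on the move order, transpositions map to the same key, which is what lets a hash-keyed book handle them without extra bookkeeping.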
ilari
Posts: 750
Joined: Mon Mar 27, 2006 7:45 pm
Location: Finland

Re: Opening book database?

Post by ilari »

Michel wrote: It can be obtained from the polyglot source code. However I'll put a description of the file format on my website. The main thing is the table with random numbers to compute the Zobrist hash key.
That would be great, thanks a lot. Proper documentation will definitely increase the adoption of any file format.