Opening Book Writing Bug in Faile 4.1 (Maybe?...)

Post by munchkin »

I've been examining the code that Faile 1.4 uses for creating an opening book from a PGN file. (It has been a very rewarding experience.)

However, I think I may have found a bug in the code that writes the actual opening book. In short, Faile stores a 64-bit hash for each position along with an unsigned short int, called .freq, that counts how many times the position occurs.

When it is time to write the actual data to the opening book file, it uses this code:

Code:

void write_book (void) {

  /* write our book to the binary file faile.obk */
...
  /* write our book: */
  for (i = 0; i <= b_hash_mask; i++) {
    b_hash = b_hash_table[i];
    if (b_hash.freq > 1) {
      counter++;
      fwrite (&b_hash, sizeof (b_hash_s), 1, book_out);
    }
  }
...

I don't understand the line:

Code:

if (b_hash.freq > 1) {

Doesn't this mean that only positions that occur multiple times will be written to the book? So you could pass it a very small PGN file that contained only these two short games:

1. e4 e5 2. Nf3 Nf6 3. d4
0-1

1. e4 e5 2. Nf3 Nf6 3. Nxe5
0-1

... and Faile will write only the positions reached in the first four plies of each game. The positions after the last move of each game, 3. d4 and 3. Nxe5, will not be written to the book. (I've tested this, so I know that it is true.)
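
To make the effect concrete, here is a minimal sketch in C. It is not Faile's code: the names book_table, add_game and count_kept are hypothetical, and a string of the moves played so far stands in for the 64-bit position hash (so transpositions are ignored).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_POS 64
#define KEY_LEN 128

/* Hypothetical stand-in for a book entry. The key is the move sequence
   played so far; Faile itself stores a 64-bit hash instead. */
typedef struct {
  char key[KEY_LEN];
  unsigned short freq;
} pos_s;

pos_s book_table[MAX_POS];
int book_count = 0;

/* Increment the frequency of a position, adding it if it is new. */
void bump (const char *key) {
  int i;
  for (i = 0; i < book_count; i++) {
    if (strcmp (book_table[i].key, key) == 0) {
      book_table[i].freq++;
      return;
    }
  }
  if (book_count < MAX_POS) {
    strncpy (book_table[book_count].key, key, KEY_LEN - 1);
    book_table[book_count].key[KEY_LEN - 1] = '\0';
    book_table[book_count].freq = 1;
    book_count++;
  }
}

/* Record every position reached in one game. */
void add_game (const char *moves[], int n_moves) {
  char key[KEY_LEN] = "";
  int i;
  assert (moves != NULL);
  for (i = 0; i < n_moves; i++) {
    if (strlen (key) + strlen (moves[i]) + 2 > KEY_LEN)
      break;  /* key buffer full: stop recording this game */
    strcat (key, moves[i]);
    strcat (key, " ");
    bump (key);
  }
}

/* Count the entries that pass the same kind of test as write_book:
   an entry is kept only if freq > threshold. */
int count_kept (unsigned short threshold) {
  int i, counter = 0;
  for (i = 0; i < book_count; i++) {
    if (book_table[i].freq > threshold)
      counter++;
  }
  return counter;
}
```

Feeding in the two games above gives six distinct positions. count_kept(1) keeps four of them (the positions after e4, e5, Nf3 and Nf6), while count_kept(0) keeps all six.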

If this is not a bug, then why would it be considered desirable to eliminate all moves that only occur once from the book? (Yes, I can certainly see that this would reduce the physical size of the book file, but that seems like a poor excuse for cutting short so many lines.)

Is there some other reason to do this? Or does this indeed look like a bug?

The reason I ask is that, if I were to implement my own opening book, I'd want to use the '?' annotation as a signal to my program not to play a move so marked. But you'd still want to be able to include the refutation line after the move, so that it is available to your program. Something like this:

1. f4 e6 2. g4? Qh4#
0-1

But if you eliminate all positions that only occur once from your opening book, then you wouldn't be able to store the refutation line. (Faile has no mechanism for marking such moves anyway, but still....)
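
A rough sketch of what I have in mind, in C. Everything here is hypothetical and nothing like it exists in Faile: the flags field, the BOOK_AVOID and BOOK_FORCE bits, and the two helper functions are names I made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BOOK_AVOID 0x1  /* position was reached by a move annotated '?' */
#define BOOK_FORCE 0x2  /* store the position even if it was seen only once */

/* Hypothetical book entry: hash and frequency as in Faile's book,
   plus an assumed flags field. */
typedef struct {
  uint64_t hash;
  unsigned short freq;
  unsigned short flags;
} flagged_entry_s;

/* Writing rule: keep frequent positions, and also any position that
   was explicitly marked to be kept (for example a refutation line). */
int should_write (const flagged_entry_s *e) {
  assert (e != NULL);
  return e->freq > 1 || (e->flags & BOOK_FORCE) != 0;
}

/* Playing rule: never choose a book move that leads to a position
   marked as the result of a bad move. */
int may_play_into (const flagged_entry_s *e) {
  assert (e != NULL);
  return (e->flags & BOOK_AVOID) == 0;
}
```

With this, the position after 2. g4? would carry both flags (stored, but never played into), and the position after 2... Qh4# would carry only BOOK_FORCE, so the refutation survives even though it occurs once.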
pedrox
Posts: 1056
Joined: Fri Mar 10, 2006 6:07 am
Location: Basque Country (Spain)

Re: Opening Book Writing Bug in Faile 4.1 (Maybe?...)

Post by pedrox »

Faile seems to use a simple rule to decide which moves go into the opening book: it adds only moves that have been played at least twice in a given position in the PGN file, and it does not seem to take the result of the game into account.

If you include moves that occur only once, you are more likely to include bad moves, so Faile requires at least two occurrences.

But without knowing the result of the game, this is not a reliable filter. A move having been played twice does not mean that it is good: the player who made it may have lost the game, or, even if he won, he may be a low-rated player, in which case the move carries less weight.

For Faile, I would raise the threshold, perhaps to a minimum of 10 for a database of over 1,000,000 games (depending on the quality of the games in the PGN file). The opening book will be smaller but will contain fewer errors.
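
As a rough sketch in C of what such a filter could look like: the struct fields and the function keep_entry are hypothetical (Faile's book entry stores only the hash and the frequency), and the 50% score cut-off in the example is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical book entry that also remembers results.
   points2 is twice the score of the side that made the move:
   2 per win, 1 per draw, 0 per loss. */
typedef struct {
  uint64_t hash;
  uint32_t played;
  uint32_t points2;
} scored_entry_s;

/* Keep an entry only if it was played at least min_played times and
   scored at least min_pct percent. Integer arithmetic only:
   points2 / (2 * played) >= min_pct / 100. */
int keep_entry (const scored_entry_s *e, uint32_t min_played, uint32_t min_pct) {
  assert (e != NULL);
  if (e->played < min_played)
    return 0;
  return (uint64_t) e->points2 * 100 >= (uint64_t) min_pct * 2 * e->played;
}
```

For example, with min_played = 10 and min_pct = 50, a move played 12 times for 7 points is kept, a move played twice is dropped whatever its score, and a move played 20 times for 5 points is dropped because it scores only 25%.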

You could also study the Polyglot format, which is a superior format.
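
For reference, a minimal sketch in C of a Polyglot book entry as I understand the published format: 16 bytes per entry, holding a 64-bit key, a 16-bit move, a 16-bit weight and a 32-bit learn value, all stored big-endian. The function names polyglot_pack and polyglot_move are my own and are not taken from the Polyglot source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t key;     /* position hash */
  uint16_t move;    /* encoded move, see polyglot_move */
  uint16_t weight;  /* relative preference for this move */
  uint32_t learn;   /* learning data, often 0 */
} polyglot_entry_s;

/* Store the low n_bytes of value into out, most significant byte first. */
static void put_be (unsigned char *out, uint64_t value, int n_bytes) {
  int i;
  for (i = n_bytes - 1; i >= 0; i--) {
    out[i] = (unsigned char) (value & 0xFF);
    value >>= 8;
  }
}

/* Serialize one entry into its 16-byte on-disk form. */
void polyglot_pack (const polyglot_entry_s *e, unsigned char out[16]) {
  assert (e != NULL && out != NULL);
  put_be (out, e->key, 8);
  put_be (out + 8, e->move, 2);
  put_be (out + 10, e->weight, 2);
  put_be (out + 12, e->learn, 4);
}

/* Encode a move: bits 0-2 to-file, 3-5 to-row, 6-8 from-file,
   9-11 from-row, 12-14 promotion piece (0 = none, 1 = knight,
   2 = bishop, 3 = rook, 4 = queen). Files and rows count from 0. */
uint16_t polyglot_move (int from_file, int from_row,
                        int to_file, int to_row, int promo) {
  return (uint16_t) (to_file | (to_row << 3) | (from_file << 6)
                     | (from_row << 9) | (promo << 12));
}
```

Note that a Polyglot book stores one entry per (position, move) pair with a weight, rather than one entry per position with a frequency, so the book itself says which move to play and how often.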
Re: Opening Book Writing Bug in Faile 4.1 (Maybe?...)

Post by pedrox »