I did not fully understand the format, but after looking more closely I thing the proposal is a good one. I would suggest that all programs from now on consider "zero key" records to be invalid to avoid problems.Don wrote:The probability is quite small but the point is that it could in fact produce an error. Of course the current polyglot format has the same issue with EVERY record, it's possible that a collision will occur and the move will also be valid. So what is so special about key zero in this respect?mcostalba wrote:How can you be sure that an opening position doesn't have zero key ? IOW what about hash collisions ?Michel wrote: Please feel free to comment.
P.S: In case someone plans to answer something along the lines of "the probabilty of this is very very small", "doens't happens in practice", etc please do me a favor and be so kind to post instead "collision avoidance is not guaranteed". Thanks in advance.
P.P.S: Giving the above, perhaps the "move" field should keep a not valid move value, for instance where from square equals destination square.
The best solution of course is to make sure the move field is zero too or some other invalid move.
Header format for polyglot books
Moderator: Ras
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Header format for polyglot books
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 28384
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Header format for polyglot books
One could argue that including code to test for move legality is a bad idea, because it only drives up the probability of malfunctioning. Because the chances that the extra code required to do the checking will cause a crash, because it is hit by a cosmic ray or some other natural disaster in the cases that there is no collision is so much larger than the probability that there will be a collission.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Header format for polyglot books
Legal checking can be seen as roughly equivalent to adding "bits" of extra key. So it would be a benefit for all positions, not just the "zero record."hgm wrote:One could argue that including code to test for move legality is a bad idea, because it only drives up the probability of malfunctioning. Because the chances that the extra code required to do the checking will cause a crash, because it is hit by a cosmic ray or some other natural disaster in the cases that there is no collision is so much larger than the probability that there will be a collission.
How many bits of extra key equivalent is move verification? I'll bet it's not much, perhaps 2 or 3 bits at most.
What is the probability of a collision in a 2 million position polyglot book?
Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
-
- Posts: 28384
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Header format for polyglot books
The probability that a probe for a position not in the book would turn up a false hit on a book position would be 2M/2^64 = 2^21/2^64 = 1/2^53 ~ 1e-16. Adding the header without guaranteeing the move field would be invalid would drive this up to 1.0000005e-16.
-
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Header format for polyglot books
I refactored the utility for handling pgheaders and added some regression tests. The utility is now available here
http://hardy.uhasselt.be/Toga/pgheader.tar.gz
The api is contained in the source files pgheader.h and pgheader.c. pgheader.h provides the following functions
pgheader --help yields the following output
http://hardy.uhasselt.be/Toga/pgheader.tar.gz
The api is contained in the source files pgheader.h and pgheader.c. pgheader.h provides the following functions
Code: Select all
int pgheader_create (char **header, const char *variants, const char *comment);
int pgheader_read (char **header, const char *infile);
int pgheader_write (const char *header, const char *infile, const char *outfile);
int pgheader_delete (const char *infile, const char *outfile);
void pgheader_error (const char *prompt, int ret);
Code: Select all
pgheader add <options>
Add a header
Options:
-i, --in <file> input file
-o, --out <file> output file (default: book.bin)
-v, --variants <variants> comma separated list of supported variants (default: normal)
-c, --comment <comment> free format string, may contain newlines encoded as two char strings \n
(default: "")
pgheader show <options>
Print the header
Options:
-i, --in <file> input file
pgheader delete <options>
Delete the header
Options:
-i, --in <file> input file
-o, --out <file> output file (default: book.bin)
pgheader test
Run regression tests
-
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Header format for polyglot books
The gnuchess book is the first available book with the new header!
Code: Select all
http://hardy.uhasselt.be/Toga/gnuchess-release/gnuchess.bin
Code: Select all
$ file gnuchess.bin
gnuchess.bin: Polyglot chess opening book (version 1.0)
$ pgheader show -i gnuchess.bin
@PG@
1.0
1
normal
Created by GnuChess.
-
- Posts: 725
- Joined: Mon Apr 19, 2010 7:07 pm
- Location: Sweden
- Full name: Peter Osterlund
Re: Header format for polyglot books
But the point is not valid, because the empty board is not the only position that hashes to 0. Another example isMichel wrote:Nice point! Luckily it will not cause a hash collision in other variants either since for non FIDE chess the keys are xor'ed with a variant key which is non-zero.IIRC, the 0x0 magic maps to
[d]8/8/8/8/8/8/8/8 w - - 0 1
which would never happen in a FIDE-compliant game of chess but I have no idea about other variants.
[d]2b1k3/4p3/3p1p2/p2P2p1/P2P4/2P2PP1/4P3/2NQKB2 b - - 0 1
This is just nitpicking though. I don't believe this header format will cause problems in practice. If you are worried about hash collisions, you should not use the polyglot book format.
-
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Header format for polyglot books
But the point is not valid, because the empty board is not the only position that hashes to 0. Another example is
[d]2b1k3/4p3/3p1p2/p2P2p1/P2P4/2P2PP1/4P3/2NQKB2 b - - 0 1
Wonderful!!! I was not aware that positions that hashed to 0x0 were explicitly known. And it is even a position that looks vaguely reasonable.
[d]2b1k3/4p3/3p1p2/p2P2p1/P2P4/2P2PP1/4P3/2NQKB2 b - - 0 1
Wonderful!!! I was not aware that positions that hashed to 0x0 were explicitly known. And it is even a position that looks vaguely reasonable.
Exactly. But _if_ you are worried then you can just regard 0x0 as invalid for book lookup.This is just nitpicking though. I don't believe this header format will cause problems in practice. If you are worried about hash collisions, you should not use the polyglot book format.
-
- Posts: 28384
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Header format for polyglot books
How did you get this position? Exhaustive search for 'reasonable positions'? (With in each file white-pieces, white pawns, black pawns, black pieces (all zero or more) in that order? Or did you confine pieces to the back rank (and Kings to the e-file)?)
I guess one could make a table of hash keys of all reasonable Pawn stuctures (or as may as fit in memory) and then exhaustively generate piece-only keys until you match one of the Pawn keys. If you can store 0.5G Pawn keys, you would on average have to generate 32G piece-only keys to get a collision, which sounds doable.
I guess one could make a table of hash keys of all reasonable Pawn stuctures (or as may as fit in memory) and then exhaustively generate piece-only keys until you match one of the Pawn keys. If you can store 0.5G Pawn keys, you would on average have to generate 32G piece-only keys to get a collision, which sounds doable.
Last edited by hgm on Sun Sep 23, 2012 4:07 pm, edited 1 time in total.
-
- Posts: 5106
- Joined: Tue Apr 29, 2008 4:27 pm
Re: Header format for polyglot books
Maybe we can get Steven Edwards to run perft 14 next - and modify his software to report polyglot keys that are zero! If there are none, then we will at least know we can use this system up to 14 ply without breaking any software.hgm wrote:How did you get this position? Exhaustive search for 'reasonable positions'? (With in each file white-pieces, white pawns, black pawns, black pieces (all zero or more) in that order?)
Don
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.