Disregarding 0x0 does not solve the problem though, as it only avoids an extremely small fraction of the possible collisions. You would still have problems in the following position for example:
0x0 is only for the header. The other hash collisions were always present (and not viewed as a problem either).
This is just nitpicking though. I don't believe this header format will cause problems in practice. If you are worried about hash collisions, you should not use the polyglot book format.
Exactly. But _if_ you are worried then you can just regard 0x0 as invalid for book lookup.
Disregarding 0x0 does not solve the problem though, as it only avoids an extremely small fraction of the possible collisions.
The idea is not to avoid collisions in general, just to avoid collisions with key = 0. We don't want an opening position to come up with key 0 and the program to crash because it cannot make sense of the polyglot record.
One solution is to make the software ignore key zero. Unmodified software that reads polyglot books could stumble on this record, but that should be very unlikely. To be sure, you would want to review your software. In my case Komodo can read polyglot books, so I might want to review the code to make sure nothing bad happens if a position just happens to hash to zero.
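To make the "ignore key zero" idea concrete, here is a minimal sketch in Python. It assumes the usual 16-byte big-endian Polyglot entry (64-bit key, 16-bit move, 16-bit weight, 32-bit learn); the names `read_entries` and `lookup` are made up for illustration, and a real engine would binary-search the sorted book rather than scan it.

```python
import struct

# Assumed layout of one Polyglot book entry: key, move, weight, learn (big-endian).
ENTRY = struct.Struct(">QHHI")

def read_entries(data: bytes):
    """Split raw book bytes into (key, move, weight, learn) tuples."""
    usable = len(data) - len(data) % ENTRY.size  # ignore a trailing partial record
    return [ENTRY.unpack_from(data, off) for off in range(0, usable, ENTRY.size)]

def lookup(entries, key: int):
    """Return (move, weight) pairs for a position key; key 0 is reserved for the header."""
    if key == 0:
        # A position that happens to hash to zero is treated as "not in book"
        # so header records are never interpreted as moves.
        return []
    return [(move, weight) for k, move, weight, _ in entries if k == key]
```

The cost of this choice is that a real position hashing to zero can never get a book move, which is the same "extremely small fraction" discussed above.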
Also, software that sorts the records has to make sure that records with key zero are not shuffled. So you either need a stable sort or else special case for key = 0.
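A small Python sketch of the stable-sort option, again assuming 16-byte big-endian entries with the key in the first 8 bytes. `sort_book` is a hypothetical helper, not something taken from Polyglot. The point is to sort on the key alone with a stable algorithm (Python's `list.sort` is guaranteed stable), so the key-zero records stay at the front in their original order.

```python
import struct

# Assumed layout of one Polyglot book entry: key, move, weight, learn (big-endian).
ENTRY = struct.Struct(">QHHI")

def sort_book(data: bytes) -> bytes:
    """Sort book records by key only, preserving the order of equal keys."""
    usable = len(data) - len(data) % ENTRY.size
    records = [data[i:i + ENTRY.size] for i in range(0, usable, ENTRY.size)]
    # Sorting on the key alone (not on the whole record) together with a
    # stable sort means records with key 0 are not shuffled among themselves.
    records.sort(key=lambda rec: ENTRY.unpack(rec)[0])
    return b"".join(records)
```

Sorting on the whole record, or with an unstable sort such as C's `qsort`, would be free to reorder the header records, which is the failure mode described above.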
You would still have problems in the following position for example:
3rk1nq/1p1p4/P1p3p1/1p2p3/3PP3/6P1/1PP4P/R2QK1R1 w - - 0 1
Some other positions that have the same hash as the initial position:
Also, software that sorts the records has to make sure that records with key zero are not shuffled. So you either need a stable sort or else special case for key = 0.
Yes. Fortunately, as far as I know, Polyglot is the only software that works with polyglot books globally (in its merge-book utility). As of version 1.4.70b, polyglot merge-book will treat the header correctly.
Yes. Actually it should be no problem if you are building new software or modifying existing software to use the new Polyglot extension as you should be aware of the issue anyway. But OLD software that might take a polyglot book and massage it in some way could inadvertently mess these header records up.
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
But OLD software that might take a polyglot book and massage it in some way could inadvertently mess these header records up.
I understand your point. That is why I am saying that it is fortunate that the Polyglot merge-book utility is the only software I know that does this. As Polyglot is still maintained it is not a problem to get a recent version.
merge-book with an old Polyglot will indeed mess up the header of the resulting book (it will merge the header records in some random way, resulting in an invalid header).
This does not break the resulting book however, and you can easily delete the header and create a new one.
Does the new Polyglot stuff ignore any position that might happen to hash to zero? It's probably not an issue if the records look like they have illegal moves in them and the software tests for this.
I created version 1.0.1 of pgheader. This version does much stricter checking. Hopefully that will make it more difficult (impossible?) to create non-compliant headers.
I also added some small clarifications to the spec.
Somebody on the WinBoard forum remarked that it would be nice if the comment section of the header could contain non-English characters (and things like chess pieces, the euro sign, etc.).
To my surprise this is already transparently supported by the format.
Recall that this command adds a comment to the header. The Chinese comment means "chess engine" according to Google Translate (probably a ridiculous translation).
We now verify that the comment has indeed been added.
$ ./pgheader book.bin -S
@ P G @ \n 1 . 0
\n 2 \n 1 \n n o r
m a l \n \345 \233 \275 \351
\231 \205 \350 \261 \241 \346 \243 \213
\345 \274 \225 \346 \223 \216 \0 \0
One can see that there are bytes with their high bit set. These belong to multi-byte characters (UTF-8 encoded).
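As a quick check, the octal bytes from the dump above can be decoded by hand. This Python snippet only copies the 18 comment bytes shown in the dump (everything after "normal\n" and before the trailing NULs) and decodes them as UTF-8; it does not use pgheader itself.

```python
# Comment bytes copied from the dump above, written as octal literals.
raw = bytes([
    0o345, 0o233, 0o275, 0o351, 0o231, 0o205,
    0o350, 0o261, 0o241, 0o346, 0o243, 0o213,
    0o345, 0o274, 0o225, 0o346, 0o223, 0o216,
])

# Every byte of a multi-byte UTF-8 sequence has its high bit set.
assert all(b & 0x80 for b in raw)

comment = raw.decode("utf-8")
print(comment)       # 国际象棋引擎
print(len(comment))  # 6 characters, 3 bytes each
```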
Needless to say, this only applies to the comment section. The section with predefined fields is required to consist only of printable ASCII characters.