Header format for polyglot books

Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

Disregarding 0x0 does not solve the problem though, as it only avoids an extremely small fraction of the possible collisions. You would still have problems in the following position for example:
0x0 is only for the header. The other hash collisions were always present (and not viewed as a problem either).
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Header format for polyglot books

Post by Don »

petero2 wrote:
Michel wrote:
This is just nitpicking though. I don't believe this header format will cause problems in practice. If you are worried about hash collisions, you should not use the polyglot book format.
Exactly. But _if_ you are worried then you can just regard 0x0 as invalid for book lookup.
Disregarding 0x0 does not solve the problem though, as it only avoids an extremely small fraction of the possible collisions.
The idea is not to avoid collisions in general, just to avoid collisions with key = 0. We don't want an opening position to come up with key 0 and the program to crash because it cannot make sense of the polyglot record.

One solution is to make the software ignore key zero. Unmodified software that reads polyglot books could stumble on this record, but that should be very unlikely. To be sure, you should review your software. In my case Komodo can read polyglot books, so I might want to review the code to make sure nothing bad happens if a position just happens to hash to zero.

Also, software that sorts the records has to make sure that records with key zero are not shuffled. So you need either a stable sort or a special case for key = 0.
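To illustrate the stable-sort option, here is a minimal Python sketch. It assumes the usual Polyglot entry layout (16 bytes, big-endian: 64-bit key, 16-bit move, 16-bit weight, 32-bit learn), which is not spelled out in this thread, and the function name `sort_book` is made up for the example. It is not how any existing tool does it.

```python
import struct

# Assumed Polyglot entry layout: key (u64), move (u16), weight (u16), learn (u32), big-endian.
ENTRY = struct.Struct(">QHHI")


def sort_book(data: bytes) -> bytes:
    """Sort 16-byte entries by key only, keeping equal-key entries in file order."""
    if len(data) % ENTRY.size:
        raise ValueError("book size is not a multiple of 16 bytes")
    entries = [data[i:i + ENTRY.size] for i in range(0, len(data), ENTRY.size)]
    # list.sort() is stable and we compare on the key alone, so the
    # key-0 (header) records keep their relative order.
    entries.sort(key=lambda e: ENTRY.unpack(e)[0])
    return b"".join(entries)
```

Sorting on the whole 16-byte record instead of the key alone would reorder the key-0 records by their payload, which is exactly the shuffling to avoid.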


You would still have problems in the following position for example:
[d]3rk1nq/1p1p4/P1p3p1/1p2p3/3PP3/6P1/1PP4P/R2QK1R1 w - - 0 1
Some other positions that have the same hash as the initial position:

Code:

3rkn1b/p1p4p/3p4/1Pp3p1/PP3p1p/5P2/3P4/1N1RKNBB w - - 0 1
2b1kqrn/5p2/3pp3/5P1p/3pp3/1P2P2P/2PPP3/1QB1KR2 w - - 0 1
1r2kn1b/2p1p3/P7/P3p1p1/1P4Pp/5P2/P3P1P1/BR2KR1Q w - - 0 1
qr1nkrnb/5p2/8/p4P1P/P1p1P2p/8/P1P1P2P/RNN1KRQ1 w - - 0 1
rqn1kn2/2p5/3p4/1P1p4/6P1/pP3P2/1PP3PP/1R1BK1N1 w - - 0 1
nr1rkq2/p7/1pp2p2/6p1/3PpP2/1PP5/2P1P3/BBNRKNQR w - - 0 1
b2bknqr/3p3p/p6p/7P/1p1p1p1P/P1P2P2/1P6/1B2KB2 w - - 0 1
1bbnk2q/4p2p/2p2pp1/p2pPP2/1P2P3/7P/8/QBRNK1N1 w - - 0 1
2b1kqr1/p2p3p/3p4/p2PpP2/PpP2p2/6P1/8/RRB1KQ1N w - - 0 1
2qrkn1r/3p1p2/8/2Pp1p1p/5P2/2PP4/1P2P1PP/1RR1K1N1 w - - 0 1
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

Also, software that sorts the records has to make sure that records with key zero are not shuffled. So you either need a stable sort or else special case for key = 0.
Yes. Fortunately, as far as I know Polyglot is the only software that works with polyglot books globally (in its merge-book utility). As of version 1.4.70b polyglot merge-book will treat the header correctly.
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Header format for polyglot books

Post by Don »

Michel wrote:
Also, software that sorts the records has to make sure that records with key zero are not shuffled. So you either need a stable sort or else special case for key = 0.
Yes. Fortunately, as far as I know Polyglot is the only software that works with polyglot books globally (in its merge-book utility). As of version 1.4.70b polyglot merge-book will treat the header correctly.
Yes. Actually it should be no problem if you are building new software or modifying existing software to use the new Polyglot extension as you should be aware of the issue anyway. But OLD software that might take a polyglot book and massage it in some way could inadvertently mess these header records up.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

But OLD software that might take a polyglot book and massage it in some way could inadvertently mess these header records up.
I understand your point. That is why I am saying that it is fortunate that the Polyglot merge-book utility is the only software I know that does this. As Polyglot is still maintained it is not a problem to get a recent version.

merge-book with an old Polyglot will indeed mess up the header of the resulting book (it will merge the header records in some random way, resulting in an invalid header).

This does not break the resulting book however, and you can easily delete the header and create a new one.
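As a rough illustration of "deleting the header" by hand, the sketch below drops every record whose key is zero. It assumes 16-byte records with the 64-bit key in the first 8 bytes; the name `strip_header` is hypothetical, and pgheader itself may do this differently.

```python
RECORD_SIZE = 16          # assumed size of one polyglot record
ZERO_KEY = b"\0" * 8      # a 64-bit key of 0x0


def strip_header(data: bytes) -> bytes:
    """Return the book with all key-0 records removed."""
    if len(data) % RECORD_SIZE:
        raise ValueError("book size is not a multiple of 16 bytes")
    records = (data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE))
    # Keep only records whose first 8 bytes (the key) are not all zero.
    return b"".join(r for r in records if r[:8] != ZERO_KEY)
```

A genuine position that happens to hash to zero would be dropped as well, but such a position is ignored by header-aware readers anyway.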
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Header format for polyglot books

Post by Don »

Michel wrote:
But OLD software that might take a polyglot book and massage it in some way could inadvertently mess these header records up.
I understand your point. That is why I am saying that it is fortunate that the Polyglot merge-book utility is the only software I know that does this. As Polyglot is still maintained it is not a problem to get a recent version.

merge-book with an old Polyglot will indeed mess up the header of the resulting book (it will merge the header records in some random way, resulting in an invalid header).

This does not break the resulting book however, and you can easily delete the header and create a new one.
Does the new Polyglot stuff ignore any position that might happen to hash to zero? It's probably not an issue if the records look like they have illegal moves in them and the software tests for this.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

Does the new Polyglot stuff ignore any position that might happen to hash to zero?
Do you mean when functioning as a book engine or adapter? Yes, in those modes it now ignores 0x0 keys (it is just a one-line change, of course).
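For readers who want to see what such a change looks like, here is a small Python sketch of a book lookup that refuses key 0. This is not Polyglot's code; the entry layout (16 bytes, big-endian key/move/weight/learn) and the name `book_moves` are assumptions made for the example.

```python
import struct

# Assumed Polyglot entry layout: key (u64), move (u16), weight (u16), learn (u32), big-endian.
ENTRY = struct.Struct(">QHHI")


def book_moves(data: bytes, key: int) -> list:
    """Return (move, weight) pairs stored for key in a key-sorted book."""
    if key == 0:
        return []  # the one-line guard: key 0 is reserved for the header
    n = len(data) // ENTRY.size
    lo, hi = 0, n
    # Binary search for the first entry whose key is >= the wanted key.
    while lo < hi:
        mid = (lo + hi) // 2
        if ENTRY.unpack_from(data, mid * ENTRY.size)[0] < key:
            lo = mid + 1
        else:
            hi = mid
    moves = []
    # Collect the run of entries that share the wanted key.
    while lo < n:
        k, move, weight, _learn = ENTRY.unpack_from(data, lo * ENTRY.size)
        if k != key:
            break
        moves.append((move, weight))
        lo += 1
    return moves
```

Without the guard, a position hashing to zero would have the header bytes interpreted as moves and weights.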
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

I created version 1.0.1 of pgheader. This version does much stricter checking. Hopefully that will make it more difficult (impossible?) to create non-compliant headers.
I also added some small clarifications to the spec.

Link: http://hardy.uhasselt.be/Toga/pgheader-release/
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

Somebody on the winboard forum remarked that it would be nice if the comment section of the header could contain non-English characters (and things like chess pieces, the euro sign, etc.).

To my surprise this is already transparently supported by the format.

Code:

$ ./pgheader book.bin -c 国际象棋引擎
Recall that this command adds a comment to the header. The Chinese comment means "chess engine" according to Google Translate (probably a ridiculous translation).

We now check that the comment has indeed been added.

Code:

$ ./pgheader book.bin -s
Variants supported:
normal
Comment:
国际象棋引擎
This is the actual header data:

Code:

$ ./pgheader book.bin -S
    @    P    G    @   \n    1    .    0
   \n    2   \n    1   \n    n    o    r
    m    a    l   \n \345 \233 \275 \351
 \231 \205 \350 \261 \241 \346 \243 \213
 \345 \274 \225 \346 \223 \216   \0   \0
One can see that there are bytes with their high bit set. These belong to multi-byte characters (UTF-8 encoded).

Needless to say, this only applies to the comment section. The section with predefined fields is required to consist only of printable ASCII characters.
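The dump can be checked by hand with a few lines of Python. The bytes below are copied from the `-S` output above; the helper `split_header` is made up for this example and assumes the comment is whatever follows the last newline, which holds for this dump but would not for a comment that itself contains a newline.

```python
# Bytes copied from the `pgheader book.bin -S` dump, one literal per row.
dump = (
    b"@PG@\n1.0"
    b"\n2\n1\nnor"
    b"mal\n\345\233\275\351"
    b"\231\205\350\261\241\346\243\213"
    b"\345\274\225\346\223\216\0\0"
)


def split_header(raw: bytes):
    """Split dumped header bytes into (predefined fields, comment)."""
    body = raw.rstrip(b"\0")                      # drop the NUL padding
    fields, _, comment = body.rpartition(b"\n")   # assumption: comment has no newline
    # Predefined section: printable ASCII, with newlines as separators.
    if not all(0x20 <= b < 0x7F or b == 0x0A for b in fields):
        raise ValueError("non-ASCII byte in predefined fields")
    return fields.decode("ascii").split("\n"), comment.decode("utf-8")


fields, comment = split_header(dump)
print(fields)
print(comment)
```

The 18 high-bit bytes decode to six characters of three bytes each, which is the comment that was passed to `-c`.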