Header format for polyglot books

hgm · Post by **hgm** » Sat Sep 15, 2012 5:13 pm

Never knew that nonsense was a language.

The point is that it is simply not true that "an invalid move would not be tried by engines". Why would engines test for validity of the book move? Why would they not crash on an invalid book move?

mcostalba · Post by **mcostalba** » Sat Sep 15, 2012 5:18 pm

hgm wrote:Never knew that nonsense was a language.

The point is that it is simply not true that "an invalid move would not be tried by engines". Why would engines test for validity of the book move? Why would they not crash on an invalid book move?

Because, as you (amazingly) said, hash collisions are possible. So a properly implemented book reader should test a move for legality before trying it.

P.S.: I am not a fan of the "add crap to work around crap" software design philosophy, so please don't tell me there exists broken code out there. Thanks.
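A minimal sketch of the legality check described above, in Python. The function name `pick_book_move` and the use of coordinate strings for moves are assumptions made for illustration; a real engine would compare against its own generated move list.

```python
def pick_book_move(candidates, legal_moves):
    """Return the first book candidate that is legal here, else None.

    `candidates` are moves read from the book for the current key;
    `legal_moves` is whatever the engine's move generator produced.
    Both are plain coordinate strings in this sketch (an assumption).
    """
    legal = set(legal_moves)
    for move in candidates:
        if move in legal:
            return move
    return None  # no usable book move: fall back to normal search


legal_moves = ["e2e4", "d2d4", "g1f3"]
print(pick_book_move(["c1a6", "e2e4"], legal_moves))  # e2e4
print(pick_book_move(["c1a6"], legal_moves))          # None
```

With this check in place, a record reached through a hash collision can at worst suggest a bad legal move; it cannot hand the engine an illegal one.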

ZirconiumX · Post by **ZirconiumX** » Sat Sep 15, 2012 10:05 pm

@PG@ is 01000000010100000100011101000000.

The move would be in LAN c1a6 - which is not a legal move - especially not on an empty board.
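For reference, the bit string above can be reproduced from the ASCII codes of the four characters. This short Python sketch only covers the byte-to-bits step, not the decoding into a move.

```python
# Each ASCII character of "@PG@" rendered as 8 bits, most significant bit first.
bits = "".join(f"{byte:08b}" for byte in b"@PG@")
print(bits)  # 01000000010100000100011101000000
```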

Matthew:out

Evert · Post by **Evert** » Sun Sep 16, 2012 9:03 am

I appreciate that you're trying to make a backward compatible extension to a commonly used format here, but to me it honestly looks like a big hack that will not scale well and will introduce more problems in the future. The reason is that you're essentially relying on unspecified behaviour from existing implementations to make this work: what happens if a book editor deletes some of the header keys? Or scrambles them because it uses an unstable sort on the keys?

In my opinion it's a better idea to write a specification for "extended" polyglot books that includes a header that can be extended in a backward compatible way in the future (say, the header includes the size of the header so we can always skip it if we want to, even when dealing with a header that contains meta data we don't understand). If the magic header is not present (an old polyglot book) we skip 0 bytes from the beginning of the file.

Just my opinion of course.
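One way the self-describing header suggested above could look, as a rough Python sketch. The magic bytes `XPGB`, the 4-byte big-endian size field, and the function name `data_offset` are all hypothetical; nothing here comes from an existing specification.

```python
import struct

MAGIC = b"XPGB"  # hypothetical magic bytes, chosen only for this sketch


def data_offset(buf: bytes) -> int:
    """Return how many bytes to skip before the first book record.

    Assumed layout: 4 magic bytes, then a 4-byte big-endian total header
    size (magic and size field included), then metadata a reader may ignore.
    An old book without the magic yields 0, so nothing is skipped.
    """
    if len(buf) < 8 or buf[:4] != MAGIC:
        return 0
    (header_size,) = struct.unpack(">I", buf[4:8])
    if header_size < 8 or header_size > len(buf):
        return 0  # implausible size: treat the file as a plain book
    return header_size


metadata = b"variant=normal"
extended = MAGIC + struct.pack(">I", 8 + len(metadata)) + metadata + bytes(16)
print(data_offset(extended))   # 22
print(data_offset(bytes(16)))  # 0
```

Because the size is stored in the header itself, a reader can skip metadata it does not understand.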

Michel · Post by **Michel** » Sun Sep 16, 2012 9:20 am

I appreciate that you're trying to make a backward compatible extension to a commonly used format here, but to me it honestly looks like a big hack that will not scale well and will introduce more problems in the future. The reason is that you're essentially relying on unspecified behaviour from existing implementations to make this work: what happens if a book editor deletes some of the header keys? Or scrambles them because it uses an unstable sort on the keys?

I don't understand your criticism. The "extended specification" is simply that book making/using programs should ignore records with null keys (unless they want to use/create the metadata).

But the point is that programs using polyglot books which are not aware of the header will still function correctly.
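A minimal Python sketch of a reader that sets null-key records aside. It relies on the standard polyglot record layout (16 bytes, big-endian: 8-byte key, 2-byte move, 2-byte weight, 4-byte learn); the function name `split_book` and the sample values are made up for illustration.

```python
import struct

# Polyglot record: key (u64), move (u16), weight (u16), learn (u32), big-endian.
ENTRY = struct.Struct(">QHHI")


def split_book(buf: bytes):
    """Split raw book bytes into (null-key records, ordinary records)."""
    header, entries = [], []
    usable = len(buf) - len(buf) % ENTRY.size  # ignore a trailing partial record
    for offset in range(0, usable, ENTRY.size):
        record = ENTRY.unpack_from(buf, offset)
        if record[0] == 0:
            header.append(record)   # metadata: a header-aware tool may read it
        else:
            entries.append(record)  # normal book entry
    return header, entries


# One null-key record followed by one ordinary record (arbitrary sample values).
raw = ENTRY.pack(0, 0x4050, 0x4740, 0) + ENTRY.pack(0x0123456789ABCDEF, 796, 1, 0)
header, entries = split_book(raw)
print(len(header), len(entries))  # 1 1
```

A reader that is not aware of the header never probes key 0 in practice, so it simply never sees these records.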

mcostalba · Post by **mcostalba** » Sun Sep 16, 2012 9:45 am

Yes, the extended format does not seem 100% backward compatible because readers that don't know about the header will get it wrong.

Anyhow, the point about reordering is sensible IMHO. I'd suggest using increasing keys for the headers, not just zero but 0x0, 0x1, ... and so on, to guard against reordering.
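To illustrate the increasing-key idea, here is a small Python sketch. The record payloads are placeholders; only the keys 0, 1, 2 matter. Since the keys are distinct, any sort by key, stable or not, recovers the original order.

```python
import random

# Header records as (key, payload); the payload strings are placeholders.
header = [(0, "name"), (1, "author"), (2, "comment")]

# Simulate a tool that scrambles the records.
shuffled = header[:]
random.shuffle(shuffled)

# Sorting by the distinct keys restores the intended order.
restored = sorted(shuffled, key=lambda record: record[0])
print(restored == header)  # True
```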

Michel · Post by **Michel** » Sun Sep 16, 2012 9:51 am

Yes, the extended format does not seem 100% backward compatible because readers that don't know about the header will get it wrong.

They will get it wrong with a probability far less than the probability that you will get hit by a meteorite tomorrow.

Seriously, if you are worried about hash collisions, why is stockfish using polyglot books....?

Anyhow, the point about reordering is sensible IMHO. I'd suggest using increasing keys for the headers, not just zero but 0x0, 0x1, ... and so on, to guard against reordering.

Obviously this would only increase the probability of hash collisions which you are so worried about.

Polyglot book merging will also mess up the header (but not in a way that programs using the resulting book will cease to function).

If you want to create a polyglot book with a non-header aware program you can add the header manually with my utility (which will delete the possibly messed up old header).

mcostalba · Post by **mcostalba** » Sun Sep 16, 2012 10:12 am

Michel wrote: Seriously, if you are worried about hash collisions, why is stockfish using polyglot books....?

Collisions are not a problem per se if properly handled, I think you missed some steps along the way, please reread the thread.

To summarize:

1) To make hash collisions harmless it is enough that the read 'move' (@PG@ in your case) is invalid in all cases.

2) To guard against reordering, it is a good idea IMHO to use an increasing key value from 0 to the number of used headers.

Michel · Post by **Michel** » Sun Sep 16, 2012 10:25 am

Collisions are not a problem per se if properly handled, I think you missed some steps along the way, please reread the thread.

Collisions do not cause a crash if you do legality checking. But a move obtained through a hash collision may still be very bad even if it is legal, and make you lose the game right away. This is exactly the same for a hash collision with a null record. But you seem to consistently ignore the point that this is a purely theoretical discussion. The probability of a hash collision occurring within a polyglot book is very small, and the probability of a collision occurring exactly with a null record is much smaller still.

To guard against reordering, it is a good idea IMHO to use an increasing key value from 0 to the number of used headers.

This is pointless, as book merging with a non-header-aware polyglot will still mess up the headers (but not in a way that programs using the resulting book will cease to function). Since it seems you have not read my previous post, I repeat here what I wrote:
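A rough back-of-the-envelope figure for the null-record case, as a Python sketch. It assumes the 64-bit keys behave like uniformly random values, which is an idealisation; the probe count of one billion is an arbitrary example.

```python
KEY_BITS = 64

# Chance that one given position hashes to key 0, assuming uniform keys.
p_zero = 2.0 ** -KEY_BITS


def collision_bound(num_positions: int) -> float:
    """Union bound on the chance that any of the probed positions has key 0."""
    return min(1.0, num_positions * p_zero)


print(p_zero)                   # about 5.4e-20
print(collision_bound(10**9))   # about 5.4e-11
```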

If you want to create a polyglot book with a non-header aware program you can add the header manually with my utility (which will delete the possibly messed up old header).

Don · Post by **Don** » Sun Sep 16, 2012 2:51 pm

mcostalba wrote:
Michel wrote: Please feel free to comment.
How can you be sure that an opening position doesn't have a zero key? IOW, what about hash collisions?

P.S.: In case someone plans to answer something along the lines of "the probability of this is very very small", "doesn't happen in practice", etc., please do me a favor and be so kind as to post instead "collision avoidance is not guaranteed". Thanks in advance.

P.P.S.: Given the above, perhaps the "move" field should hold an invalid move value, for instance one where the from square equals the destination square.

The probability is quite small but the point is that it could in fact produce an error. Of course the current polyglot format has the same issue with EVERY record, it's possible that a collision will occur and the move will also be valid. So what is so special about key zero in this respect?

The best solution of course is to make sure the move field is zero too or some other invalid move.
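A sketch in Python of how a reader could recognise such a deliberately invalid move. The bit layout is the standard polyglot move encoding (to file, to row, from file, from row, promotion piece, three bits each from the low end); the helper names are made up for this example.

```python
def decode_move(move: int):
    """Unpack a 16-bit polyglot move into its fields (all 0-based)."""
    to_file = move & 7
    to_row = (move >> 3) & 7
    from_file = (move >> 6) & 7
    from_row = (move >> 9) & 7
    promotion = (move >> 12) & 7
    return from_file, from_row, to_file, to_row, promotion


def is_obviously_invalid(move: int) -> bool:
    """True when the from square equals the to square, which no legal move does."""
    from_file, from_row, to_file, to_row, _ = decode_move(move)
    return (from_file, from_row) == (to_file, to_row)


print(is_obviously_invalid(0))    # True: move 0 decodes to a1a1
print(is_obviously_invalid(796))  # False: 796 decodes to e2e4
```

A zero move field therefore already satisfies the "from equals to" condition, so a null record with a zero move can never be played by a reader that checks legality.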
