Header format for polyglot books

Discussion of chess software programming and technical issues.

Moderator: Ras

hgm
Posts: 28379
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Header format for polyglot books

Post by hgm »

Never knew that nonsense was a language. :wink:

The point is that it is simply not true that "an invalid move would not be tried by engines". Why would engines test for validity of the book move? Why would they not crash on an invalid book move?
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Header format for polyglot books

Post by mcostalba »

hgm wrote:Never knew that nonsense was a language. :wink:

The point is that it is simply not true that "an invalid move would not be tried by engines". Why would engines test for validity of the book move? Why would they not crash on an invalid book move?
Because, as you (amazingly) said, hash collisions are possible. So a properly implemented book reader should test a move for legality before trying it.


P.S: I am not a fan of the "add crap to work around crap" software design philosophy, so please don't tell me there exists broken code out there. Thanks.
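The legality test described above can be sketched as follows. This is only an illustration: `pick_book_move` and the set of legal moves are hypothetical stand-ins for whatever the engine's own move generator provides, and moves are assumed to be plain coordinate strings.

```python
from typing import Iterable, Optional, Set


def pick_book_move(candidates: Iterable[str], legal_moves: Set[str]) -> Optional[str]:
    """Return the first book candidate that is legal in the current position.

    Returns None when no candidate is legal, so the caller can fall back to
    a normal search instead of playing (or crashing on) a bogus move.
    """
    for move in candidates:
        if move in legal_moves:
            return move
    return None


# Hypothetical legal-move set, as the engine's move generator might report it.
legal = {"e2e4", "d2d4", "g1f3"}
print(pick_book_move(["c1a6", "e2e4"], legal))  # c1a6 is skipped, e2e4 is returned
print(pick_book_move(["c1a6"], legal))          # None: nothing legal, search instead
```

A check like this makes an illegal book move harmless, but it does not catch a colliding record whose move happens to be legal.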
ZirconiumX
Posts: 1359
Joined: Sun Jul 17, 2011 11:14 am
Full name: Hannah Ravensloft

Re: Header format for polyglot books

Post by ZirconiumX »

@PG@ is 01000000010100000100011101000000.

The move would be in LAN c1a6 - which is not a legal move - especially not on an empty board.

Matthew:out
tu ne cede malis, sed contra audentior ito
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: Header format for polyglot books

Post by Evert »

I appreciate that you're trying to make a backward compatible extension to a commonly used format here, but to me it honestly looks like a big hack that will not scale well and will introduce more problems in the future. The reason is that you're essentially relying on unspecified behaviour from existing implementations to make this work: what happens if a book editor deletes some of the header keys? Or scrambles them because it uses an unstable sort on the keys?

In my opinion it's a better idea to write a specification for "extended" polyglot books that includes a header that can be extended in a backward compatible way in the future (say, the header includes the size of the header so we can always skip it if we want to, even when dealing with a header that contains meta data we don't understand). If the magic header is not present (an old polyglot book) we skip 0 bytes from the beginning of the file.

Just my opinion of course.
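A minimal sketch of the sized-header idea, under assumed details: the 8-byte `MAGIC` value and the 4-byte big-endian size field are invented for illustration and are not part of any published specification. The only property taken from the post is that a reader can skip a header it does not understand, and skips 0 bytes when the magic is absent.

```python
import struct

MAGIC = b"XPGBOOK\x00"   # hypothetical 8-byte marker, not from any real spec
ENTRY_SIZE = 16          # one classic polyglot record


def header_length(data: bytes) -> int:
    """Number of bytes to skip before the first book record.

    Assumed layout: MAGIC, then a big-endian uint32 holding the total header
    size (magic and size field included). Old books have no magic, so 0.
    """
    if len(data) < 12 or data[:8] != MAGIC:
        return 0
    (size,) = struct.unpack(">I", data[8:12])
    if size < 12 or size > len(data):
        raise ValueError("corrupt header size")
    return size


def records(data: bytes):
    """Yield the raw 16-byte records that follow the (possibly absent) header."""
    start = header_length(data)
    for offset in range(start, len(data) - ENTRY_SIZE + 1, ENTRY_SIZE):
        yield data[offset:offset + ENTRY_SIZE]


# Arbitrary example record: key, move, weight, learn.
entry = struct.pack(">QHHI", 0x1234, 796, 10, 0)
old_book = entry
new_book = MAGIC + struct.pack(">I", 20) + b"metadata" + entry
print(list(records(old_book)) == list(records(new_book)))  # True
```

The price of this design is that a reader unaware of the header would misread its bytes as records, so it is not transparent to existing programs.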
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

Evert wrote: I appreciate that you're trying to make a backward compatible extension to a commonly used format here, but to me it honestly looks like a big hack that will not scale well and will introduce more problems in the future. The reason is that you're essentially relying on unspecified behaviour from existing implementations to make this work: what happens if a book editor deletes some of the header keys? Or scrambles them because it uses an unstable sort on the keys?
I don't understand your criticism. The "extended specification" is simply that book making/using programs should ignore records with null keys (unless they want to use/create the metadata).

But the point is that programs using polyglot books which are not aware of the header will still function correctly.
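The rule "ignore records with null keys" could look like this in a header-aware reader. The record layout (big-endian 64-bit key, 16-bit move, 16-bit weight, 32-bit learn) is the usual polyglot entry format; the 8-byte payload of the header record below is a made-up placeholder, since the thread does not spell out what the metadata bytes contain.

```python
import struct
from typing import Iterator, Tuple

ENTRY = struct.Struct(">QHHI")  # key, move, weight, learn (16 bytes, big-endian)


def book_entries(data: bytes) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (key, move, weight, learn) for every record whose key is not zero."""
    for offset in range(0, len(data) - ENTRY.size + 1, ENTRY.size):
        key, move, weight, learn = ENTRY.unpack_from(data, offset)
        if key == 0:
            continue  # header/metadata record, not a chess position
        yield key, move, weight, learn


# One null-key record with placeholder payload, then one arbitrary real record.
header_record = b"\x00" * 8 + b"@PG@\x00\x00\x00\x00"
book = header_record + ENTRY.pack(0x1234, 796, 10, 0)
print(list(book_entries(book)))  # only the record with key 0x1234 is reported
```

A reader that is not header-aware never needs this filter: it looks records up by the key of the current position, and would only touch the header if that key happened to be zero.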
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Header format for polyglot books

Post by mcostalba »

Yes, the extended format does not seem 100% backward compatible because readers that don't know about the header will get it wrong.

Anyhow, the point about reordering is sensible IMHO. I'd suggest using increasing keys for the headers, not just zero but 0x0, 0x1, ... and so on, to protect against reordering.
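To illustrate the reordering concern: if a tool sorts records by key alone, records that all share key 0 can come back in any order, while distinct increasing keys restore the intended order. The record contents below are hypothetical, and `resort_by_key` stands in for a book tool that looks only at keys.

```python
# Hypothetical header records as (key, payload) pairs.
zero_keyed = [(0, "variant"), (0, "author"), (0, "comment")]
numbered = [(0, "variant"), (1, "author"), (2, "comment")]


def resort_by_key(records):
    """Sort records by key only, as a tool that ignores payloads might."""
    return sorted(records, key=lambda record: record[0])


# Simulate a tool that scrambled the records before sorting them again.
scrambled_zero = list(reversed(zero_keyed))
scrambled_numbered = list(reversed(numbered))

print(resort_by_key(scrambled_zero) == zero_keyed)    # False: order is lost
print(resort_by_key(scrambled_numbered) == numbered)  # True: order is recovered
```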
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

mcostalba wrote: Yes, the extended format does not seem 100% backward compatible because readers that don't know about the header will get it wrong.
They will get it wrong with a probability far less than the probability that you will get hit by a meteorite tomorrow.

Seriously, if you are worried about hash collisions, why is Stockfish using polyglot books?
mcostalba wrote: Anyhow, the point about reordering is sensible IMHO. I'd suggest using increasing keys for the headers, not just zero but 0x0, 0x1, ... and so on, to protect against reordering.
Obviously this would only increase the probability of hash collisions which you are so worried about.

Polyglot book merging will also mess up the header (but not in a way that programs using the resulting book will cease to function).

If you want to create a polyglot book with a non-header aware program you can add the header manually with my utility (which will delete the possibly messed up old header).
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Header format for polyglot books

Post by mcostalba »

Michel wrote: Seriously if you are worried about hash collisions why is stockfish using polyglot books....?
Collisions are not a problem per se if properly handled. I think you missed some steps along the way; please reread the thread.

To summarize:

1) To make hash collisions harmless it is enough that the read 'move' (@PG@ in your case) is invalid in all cases.

2) To protect against reordering, it is a good idea IMHO to use an increasing key value from 0 to the number of headers used.
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Header format for polyglot books

Post by Michel »

mcostalba wrote: Collisions are not a problem per se if properly handled. I think you missed some steps along the way; please reread the thread.

Collisions do not cause a crash if you do legality checking. But a move obtained through a hash collision may still be very bad even if it is legal, and may make you lose the game right away. This is exactly the same for a hash collision with a null record. But you seem to consistently ignore the point that this is a purely theoretical discussion. The probability of a hash collision occurring within a polyglot book is very small, and the probability of a collision with a null record specifically is much smaller still.
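As a rough estimate of the two probabilities compared above, assume that position keys behave like uniformly random 64-bit values (an idealization, not something stated in the thread) and that the book holds N distinct position keys. For a position that is not in the book:

```latex
P(\text{key matches some book record}) \approx \frac{N}{2^{64}},
\qquad
P(\text{key} = 0) = \frac{1}{2^{64}} \approx 5.4 \times 10^{-20}.
```

Under that assumption, a collision with the null key is about N times less likely than a collision with an ordinary record, and adding more records with the same key 0 does not change the second figure.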
mcostalba wrote: To protect against reordering, it is a good idea IMHO to use an increasing key value from 0 to the number of headers used.
This is pointless, as book merging with a non-header-aware polyglot will still mess up the headers (but not in a way that programs using the resulting book will cease to function). Since it seems you have not read my previous post, I repeat here what I wrote:
Michel wrote: If you want to create a polyglot book with a non-header aware program you can add the header manually with my utility (which will delete the possibly messed up old header).
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Header format for polyglot books

Post by Don »

mcostalba wrote:
Michel wrote: Please feel free to comment.
How can you be sure that an opening position doesn't have a zero key? IOW, what about hash collisions?

P.S: In case someone plans to answer something along the lines of "the probability of this is very very small", "doesn't happen in practice", etc., please do me a favor and be so kind as to post instead "collision avoidance is not guaranteed". Thanks in advance.

P.P.S: Given the above, perhaps the "move" field should hold an invalid move value, for instance one where the from square equals the destination square.
The probability is quite small but the point is that it could in fact produce an error. Of course the current polyglot format has the same issue with EVERY record, it's possible that a collision will occur and the move will also be valid. So what is so special about key zero in this respect?

The best solution of course is to make sure the move field is zero too or some other invalid move.
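A small sketch of why a zero move field qualifies as an invalid move. It assumes the usual polyglot move encoding as I understand it (from the least significant bit: to file, to row, from file, from row, promotion piece, three bits each); `decode_move` and `is_null_move` are illustrative helper names, not functions from polyglot itself.

```python
def decode_move(move: int) -> str:
    """Decode a 16-bit polyglot move field into coordinate notation."""
    files = "abcdefgh"
    to_file = move & 7
    to_row = (move >> 3) & 7
    from_file = (move >> 6) & 7
    from_row = (move >> 9) & 7
    promotion = (move >> 12) & 7  # 0 none, 1 knight, 2 bishop, 3 rook, 4 queen
    text = f"{files[from_file]}{from_row + 1}{files[to_file]}{to_row + 1}"
    suffix = " nbrq"[promotion] if promotion < 5 else "?"
    return text + suffix.strip()


def is_null_move(move: int) -> bool:
    """True when origin and destination squares coincide."""
    return (move & 0x3F) == ((move >> 6) & 0x3F)


print(decode_move(0), is_null_move(0))      # a1a1 True
print(decode_move(796), is_null_move(796))  # e2e4 False
```

Since no piece can move from a square to the same square, a reader that checks legality will reject such a record in any position.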
Capital punishment would be more effective as a preventive measure if it were administered prior to the crime.