Polyglot keys for chess variants.

Discussion of chess software programming and technical issues.

Moderator: Ras

Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Polyglot keys for chess variants.

Post by Michel »

Wouldn't it be better to separate major and minor version number by a period, rather than a linefeed (i.e. make it one field)?
Then it would be immediately obvious for someone printing the file as text that he is dealing with version numbers.
Yes, perhaps it would indeed be more readable that way. The reason I split the version number into two fields is that

2.12 > 2.2

but this is not apparent from the alphabetical ordering. It would of course be trivial for the application to split the version number into a major and a minor part.
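
For illustration, a minimal sketch (in C++, with made-up function names) of how an application could split a single "major.minor" field and compare the parts numerically rather than alphabetically:

Code: Select all

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <utility>

// Split "major.minor" into a numeric pair, so that "2.12" compares
// greater than "2.2" (which a plain string comparison would get wrong).
static std::pair<long, long> parse_version(const std::string& v)
{
    std::size_t dot = v.find('.');
    long major = std::strtol(v.substr(0, dot).c_str(), 0, 10);
    long minor = (dot == std::string::npos)
        ? 0
        : std::strtol(v.substr(dot + 1).c_str(), 0, 10);
    return std::make_pair(major, minor);
}

int main()
{
    // std::pair compares component-wise: first major, then minor.
    std::printf("%d\n", parse_version("2.12") > parse_version("2.2"));  // prints 1
    return 0;
}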
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Polyglot keys for chess variants.

Post by Michel »

Here is rudimentary sample code that adds, displays and deletes headers in Polyglot books (under the BSD license):

http://hardy.uhasselt.be/pgheader.c

Needless to say, I decline any responsibility for data loss.

I did some tests with the book dump utility of Polyglot, and the files with and without a header appear to behave identically.

Code: Select all

$ ./pgheader --help

pgheader add <options>
Add a header
Options:
-i, --in <file>             input file
-o, --out <file>            output file (default: book.bin)
-v, --variants <variants>   comma separated list of supported variants (default: normal)
-c, --comment <comment>     free format string, may contain newlines encoded as two char strings \n
                            (default: "")
    --version <version>     header format version (default: 1.0)

pgheader show <options>
Print the header
Options:
-i, --in <file>             input file

pgheader delete <options>
Delete the header
Options:
-i, --in <file>             input file
-o, --out <file>            output file (default: book.bin)
Here is an example session:

Code: Select all

$ ./pgheader add -i performance.bin  -o performance_.bin -c "performance.bin by Marc Lacrosse." 
$ ./pgheader show -i performance_.bin 
@PG@
1.0
1
normal
performance.bin by Marc Lacrosse.
$ ./pgheader delete -i performance_.bin  -o performance__.bin
$ ./pgheader show -i performance__.bin
$
(no output since there is no header)
Here is an ASCII dump of the first few bytes of performance_.bin.

Code: Select all

0000000 nul nul nul nul nul nul nul nul   @   P   G   @  nl   1   .   0
0000020 nul nul nul nul nul nul nul nul  nl   1  nl   n   o   r   m   a
0000040 nul nul nul nul nul nul nul nul   l  nl   p   e   r   f   o   r
0000060 nul nul nul nul nul nul nul nul   m   a   n   c   e   .   b   i
0000100 nul nul nul nul nul nul nul nul   n  sp   b   y  sp   M   a   r
0000120 nul nul nul nul nul nul nul nul   c  sp   L   a   c   r   o   s
0000140 nul nul nul nul nul nul nul nul   s   e   . nul nul nul nul nul
0000160 nul nul syn  vt del   K can   h  so   9 nul enq nul nul nul nul
0000200 nul nul   Z   H  em   u etx   P  si   t nul   u nul nul nul nul
0000220 nul nul   Z   H  em   u etx   P  so   , nul  so nul nul nul nul
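
As the dump suggests, the header text appears to live in the 8 data bytes of the leading 16-byte book entries whose 8-byte key field is zero. A minimal sketch of how a reader could recover it (this is my reading of the dump above, not a copy of the pgheader.c code):

Code: Select all

#include <cstdio>
#include <cstring>

int main(int argc, char* argv[])
{
    unsigned char entry[16];
    unsigned char zero_key[8] = {0};

    std::FILE* f = std::fopen(argc > 1 ? argv[1] : "book.bin", "rb");
    if (!f) { std::perror("fopen"); return 1; }

    // Entries are 16 bytes: an 8-byte key followed by 8 bytes of data.
    // Entries with key 0 sort before all real entries and seem to carry
    // the header text in their data bytes.
    while (std::fread(entry, 1, 16, f) == 16 && std::memcmp(entry, zero_key, 8) == 0)
        std::fwrite(entry + 8, 1, 8, stdout);

    std::fclose(f);
    return 0;
}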
Michel
Posts: 2292
Joined: Mon Sep 29, 2008 1:50 am

Re: Polyglot keys for chess variants.

Post by Michel »

I have been experimenting with magic. The following makes the proposed polyglot header recognizable by libmagic. Of course you need to be
root for this. On Ubuntu you can become root by doing "sudo su -".

Code: Select all

# cd /usr/share/file
Open the file "magic" in a text editor and add the following at the end

Code: Select all

#------------------------------------------------------------------------------
# polyglot:  file(1) magic polyglot chess opening book files
#
# From Michel Van den Bergh <michel.vandenbergh@uhasselt.be>

0       string          \x00\x00\x00\x00\x00\x00\x00\x00@PG@\x0a           Polyglot chess opening book
!:mime  application/polyglot
Next compile the new magic file by

Code: Select all

# file -C -m magic
That's it!

Example (using the file performance_.bin created in the previous post):

Code: Select all

$ file performance_.bin
performance_.bin: Polyglot chess opening book
$ file --mime-type performance_.bin 
performance_.bin: application/polyglot
User avatar
gcramer
Posts: 40
Joined: Mon Oct 28, 2013 11:21 pm
Location: Bad Homburg, Germany

Re: Polyglot keys for chess variants.

Post by gcramer »

I'm developing Scidb (http://scidb.sourceforge.net), a chess database application which supports some chess variants: AntiChess (Suicide, Giveaway), Losers, Three-check Chess, Crazyhouse, and Bughouse (the latter is still in development). I have now implemented the Polyglot book support. In my implementation I do not agree with the suggestions of the XBoard author, due to some issues.

At first: XBoard starts the Zobrist key computation with different start values depending on the chess variant. This is unnecessarily complicated, because in practice an opening book will not mix different chess variants, though it is theoretically possible. So I do not use different start values. Furthermore, it is a better idea to denote the chess variant of the book inside the Polyglot header.

But first a note about how Scidb encodes the castle rook in Chess960 when the castle rook is an inner rook - XBoard has not yet implemented this. Scidb's implementation complies with the recommendation of the Polyglot book format (see http://hardy.uhasselt.be/Toga/book_format.html).

We extend the Random table:

Code: Select all

static uint64_t const Random64[793] =
{
   ...
   // extension for Chess960 castle with inner rook
   U64(0x24C7AE8ACD3AD30D), U64(0xA3CBDA8EB6CD6D17), U64(0x92D9EE42CBFA4F52), U64(0x438F893E666FFA6B),
   U64(0x3CCC5D24464F62B1), U64(0xCFA9275812D9E5DB), U64(0xBBF17284897C0F48), U64(0xA077F2F8F190127B),
   U64(0x5FDEEC8850464C2B), U64(0x9429D7F4C15885F4), U64(0x4DCAC2831E23E842), U64(0x4E109F15C27A18BF),
};

static uint64_t const* const RandomInnerRook = Random64 + 781;
The computation for the castle key:

Code: Select all

Rights castling = board.castlingRights();

if (castling & WhiteKingside)
{
    if (board.isInnerCastleRook(WhiteKS))
        key ^= RandomInnerRook[fyle(board.shortCastleRook(White)) - 1];
    else
        key ^= RandomCastle[0];
}
if (castling & WhiteQueenside)
{
    if (board.isInnerCastleRook(WhiteQS))
        key ^= RandomInnerRook[fyle(board.longCastleRook(White)) - 1];
    else
        key ^= RandomCastle[1];
}
if (castling & BlackKingside)
{
    if (board.isInnerCastleRook(BlackKS))
        key ^= RandomInnerRook[fyle(board.shortCastleRook(Black)) + 5];
    else
        key ^= RandomCastle[2];
}
if (castling & BlackQueenside)
{
    if (board.isInnerCastleRook(BlackQS))
        key ^= RandomInnerRook[fyle(board.longCastleRook(Black)) + 5];
    else
        key ^= RandomCastle[3];
}
Some issues with the Zhouse (Crazyhouse) encoding in XBoard: using the plus operator in the Zobrist key computation is not the way such keys are normally combined, and the chance is high that key collisions will occur this way.

This is how Scidb generates the key:

Extend the Random table a bit:

Code: Select all

static uint64_t const Random64[803] =
{
   ...
   // extension for Chess960 castle with inner rook
   ...
   // extension for holding (in the usual way: BP,WP,BN,WN,BB,WB,BR,WR,BQ,WQ)
   U64(0x1EC62F17201666A9), U64(0xBBFACFA8B9CEDC99), U64(0xD52582CA4006E48D), U64(0xBE5CC29389B0A011),
   U64(0x70B7B299FA084B79), U64(0x31FC91D4B888AAC6), U64(0x953D8D65F16A27F5), U64(0xA9E1894A083988A6),
   U64(0x21454DE44A9C3B61), U64(0x5F8B1149ED761B1D),
};

static uint64_t const* const RandomHolding = Random64 + 793;
We use a very fast function for bit rotation:

Code: Select all

uint64_t
rotate_left(uint64_t x, unsigned shift)
{
   // NOTE:
   // Due to <http://chsc.wordpress.com/2010/01/13/compiler-optimization>
   // the GNU compiler knows that the C code only rotates the bits and that
   // this can be done with the x86 rol and ror instructions.
   // Also see <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17886>.
   // The "& 63" keeps the right-shift count in range: shifting a 64-bit
   // value by 64 (which happens here for shift == 0) is undefined behaviour.
   return (x << shift) | (x >> ((sizeof(uint64_t)*8 - shift) & 63));
}
Now the holding can be easily calculated with:

Code: Select all

if (variant::isZhouse(variant))
{
   material::Count white = board.holding(color::White);

   if (white.pawn)   key ^= rotate_left(RandomHolding[1], white.pawn   - 1);
   if (white.knight) key ^= rotate_left(RandomHolding[3], white.knight - 1);
   if (white.bishop) key ^= rotate_left(RandomHolding[5], white.bishop - 1);
   if (white.rook)   key ^= rotate_left(RandomHolding[7], white.rook   - 1);
   if (white.queen)  key ^= rotate_left(RandomHolding[9], white.queen  - 1);

   material::Count black = board.holding(color::Black);

   if (black.pawn)   key ^= rotate_left(RandomHolding[0], black.pawn   - 1);
   if (black.knight) key ^= rotate_left(RandomHolding[2], black.knight - 1);
   if (black.bishop) key ^= rotate_left(RandomHolding[4], black.bishop - 1);
   if (black.rook)   key ^= rotate_left(RandomHolding[6], black.rook   - 1);
   if (black.queen)  key ^= rotate_left(RandomHolding[8], black.queen  - 1);
}
This method solves the problem of encoding the piece counts in the holding, in the usual way a Zobrist key is generated.

The only critical thing is that no linear combination of this computation should collide with the existing Zobrist numbers; but this has been checked (see https://sourceforge.net/p/scidb/code/HE ... t_book.cpp, function checkHashKeys()).
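
For concreteness, a minimal sketch of the kind of check meant here (not the actual checkHashKeys() from the Scidb source): verify that no non-trivial rotation of a holding key reproduces a key that is already in the table, or another rotated key.

Code: Select all

#include <cstddef>
#include <cstdint>
#include <set>

inline uint64_t rotate_left(uint64_t x, unsigned shift)
{
    return (x << shift) | (x >> ((64 - shift) & 63));   // shift in 0..63
}

// Sketch only: 'table' would be the full Random64 array, 'holding' the
// ten RandomHolding entries discussed above.
bool check_rotated_keys(const uint64_t* table,   std::size_t tableSize,
                        const uint64_t* holding, std::size_t holdingSize)
{
    std::set<uint64_t> seen(table, table + tableSize);

    for (std::size_t i = 0; i < holdingSize; ++i)
        for (unsigned r = 1; r < 64; ++r)           // r == 0 is the key itself
            if (!seen.insert(rotate_left(holding[i], r)).second)
                return false;                       // duplicate key detected

    return true;
}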

Also, the move encoding for dropped pieces is too complicated in XBoard (and a kludge, according to the author's own comments). In Scidb the from-field is set equal to the to-field, and the dropped piece is encoded as a promoted piece. Decoding is very easy, because a move whose from-field equals its to-field must be a drop.
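
A minimal sketch of this drop encoding and decoding, assuming the standard Polyglot move layout (to-square in bits 0-5, from-square in bits 6-11, promotion piece in bits 12-14); the names are illustrative, not Scidb's:

Code: Select all

#include <cstdint>

// from == to marks the move as a drop; the dropped piece goes into the
// promotion field (N=1, B=2, R=3, Q=4 as in the Polyglot spec; how a
// dropped Pawn is coded is not stated here).
inline uint16_t encodeScidbDrop(unsigned toSquare, unsigned piece)
{
    return static_cast<uint16_t>((piece << 12) | (toSquare << 6) | toSquare);
}

inline bool isDrop(uint16_t move)
{
    return ((move >> 6) & 0x3f) == (move & 0x3f);   // from-field equals to-field
}

inline unsigned droppedPiece(uint16_t move) { return (move >> 12) & 0x7; }
inline unsigned dropSquare(uint16_t move)   { return move & 0x3f; }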

Another issue occurs in the computation of the Zobrist key for Three-check Chess positions. The XBoard algorithm forgets to hash the number of checks given by each side. Example: a position where White has given one check in general never matches a position where White has already given two checks; the strategy in a position depends significantly on the number of checks given.

This is how Scidb is doing the computation:

A third extension to the Random table:

Code: Select all

static uint64_t const Random64[809] =
{
   ...
   // extension for Chess960 castle with inner rook
   ...
   // extension for holding (in the usual way: BP,WP,BN,WN,BB,WB,BR,WR,BQ,WQ)
   ...
   // extension for 3check
   U64(0x1E8860E4E8AFCFFC), // one check given to black
   U64(0x107FB5302E49B653), // two checks given to black
   U64(0x453CA2EB419DE2C5), // three checks given to black
   U64(0xA3094E8EB1649123), // one check given to white
   U64(0xE850AC9A440DD0AF), // two checks given to white
   U64(0xE6EB2D90ADE835B6), // three checks given to white
};

static uint64_t const* const RandomThreeCheck = Random64 + 803;
This is the Zobrist key computation:

Code: Select all

if (variant == variant::ThreeCheck)
{
   // indices 0-2: checks given to Black (by White); 3-5: given to White (by Black)
   if (board.checksGiven(color::White))
      key ^= RandomThreeCheck[(board.checksGiven(color::White) - 1) & 0x3];
   if (board.checksGiven(color::Black))
      key ^= RandomThreeCheck[((board.checksGiven(color::Black) - 1) & 0x3) + 3];
}
It seems to me that XBoard has a problem with AntiChess (Suicide, Giveaway). I cannot see how promotion to a King is encoded. Scidb does this by encoding the promoted King with value 5.

Final note: this is a draft; the book implementation is not yet tested.
User avatar
hgm
Posts: 28461
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Polyglot keys for chess variants.

Post by hgm »

gcramer wrote:At first: XBoard starts the Zobrist key computation with different start values depending on the chess variant. This is unnecessarily complicated, because in practice an opening book will not mix different chess variants, though it is theoretically possible.
I wouldn't call adding (or XORing) a constant to a key 'complicated', which makes it a rather moot point whether it is necessary or not. Like you say, doing so in theory enables you to merge books for different variants. As there are engines (such as Fairy-Max or Pulsar) that can dynamically switch variant, and there currently is no mechanism in WinBoard that would automatically switch book file with that, I consider that a big advantage, for which one pays almost no price at all. So I will definitely keep it like that in WinBoard. In fact I might extend the standard for this to have the variant ID depend in a specified way on the board and holdings size, in case these are non-standard. (So that I can make a merged book for Shogi, mini-Shogi and Judkins Shogi for Shokidoki.)
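
For concreteness, the kind of thing meant here is just folding a per-variant constant into the position key; the values and variant list below are made up for illustration and are not the actual WinBoard constants:

Code: Select all

#include <cstdint>

enum Variant { Normal, Crazyhouse, ThreeCheck, Suicide };

static const uint64_t VariantKey[] = {
    0x0000000000000000ULL,   // normal chess keeps the plain Polyglot key
    0x5D1836A1C6B0F2E4ULL,   // crazyhouse   (illustrative value)
    0x9A3F0C77D24E81B5ULL,   // three-check  (illustrative value)
    0x31E6B94F08AD527CULL,   // suicide      (illustrative value)
};

// Identical positions from different variants now get different book keys,
// so their book lines can coexist in one merged file.
inline uint64_t variantBookKey(uint64_t positionKey, Variant v)
{
    return positionKey ^ VariantKey[v];
}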
So I do not use different start values. Furthermore, it is a better idea to denote the chess variant of the book inside the Polyglot header.
Too bad. It means it will not be possible to use or edit your books with WinBoard.
But first a note about how Scidb encodes the castle rook in Chess960 when the castle rook is an inner rook - XBoard has not yet implemented this. Scidb's implementation complies with the recommendation of the Polyglot book format (see http://hardy.uhasselt.be/Toga/book_format.html).

We extend the Random table:

Code: Select all

static uint64_t const Random64[793] =
{
   ...
   // extension for Chess960 castle with inner rook
   U64(0x24C7AE8ACD3AD30D), U64(0xA3CBDA8EB6CD6D17), U64(0x92D9EE42CBFA4F52), U64(0x438F893E666FFA6B),
   U64(0x3CCC5D24464F62B1), U64(0xCFA9275812D9E5DB), U64(0xBBF17284897C0F48), U64(0xA077F2F8F190127B),
   U64(0x5FDEEC8850464C2B), U64(0x9429D7F4C15885F4), U64(0x4DCAC2831E23E842), U64(0x4E109F15C27A18BF),
};

static uint64_t const* const RandomInnerRook = Random64 + 781;
The computation for the castle key:

Code: Select all

Rights castling = board.castlingRights();

if (castling & WhiteKingside)
{
    if (board.isInnerCastleRook(WhiteKS))
        key ^= RandomInnerRook[fyle(board.shortCastleRook(White)) - 1];
    else
        key ^= RandomCastle[0];
}
if (castling & WhiteQueenside)
{
    if (board.isInnerCastleRook(WhiteQS))
        key ^= RandomInnerRook[fyle(board.longCastleRook(White)) - 1];
    else
        key ^= RandomCastle[1];
}
if (castling & BlackKingside)
{
    if (board.isInnerCastleRook(BlackKS))
        key ^= RandomInnerRook[fyle(board.shortCastleRook(Black)) + 5];
    else
        key ^= RandomCastle[2];
}
if (castling & BlackQueenside)
{
    if (board.isInnerCastleRook(BlackQS))
        key ^= RandomInnerRook[fyle(board.longCastleRook(Black)) + 5];
    else
        key ^= RandomCastle[3];
}
I will see if I can implement this in WinBoard as well.

Some issues with the Zhouse (Crazyhouse) encoding in XBoard: using the plus operator in the Zobrist key computation is not the way such keys are normally combined, and the chance is high that key collisions will occur this way.
This seems complete nonsense to me. Most of my engines use additive keys, and I have never seen any problems with this as far as collision rates are concerned. Do you have any evidence at all that this is so?
This is how Scidb generates the key:

Extend the Random table a bit:

Code: Select all

static uint64_t const Random64[803] =
{
   ...
   // extension for Chess960 castle with inner rook
   ...
   // extension for holding (in the usual way: BP,WP,BN,WN,BB,WB,BR,WR,BQ,WQ)
   U64(0x1EC62F17201666A9), U64(0xBBFACFA8B9CEDC99), U64(0xD52582CA4006E48D), U64(0xBE5CC29389B0A011),
   U64(0x70B7B299FA084B79), U64(0x31FC91D4B888AAC6), U64(0x953D8D65F16A27F5), U64(0xA9E1894A083988A6),
   U64(0x21454DE44A9C3B61), U64(0x5F8B1149ED761B1D),
};

static uint64_t const* const RandomHolding = Random64 + 793;
We use a very fast function for bit rotation:

Code: Select all

uint64_t
rotate_left(uint64_t x, unsigned shift)
{
   // NOTE:
   // Due to <http://chsc.wordpress.com/2010/01/13/compiler-optimization>
   // the GNU compiler knows that the C code only rotates the bits and that
   // this can be done with the x86 rol and ror instructions.
   // Also see <http://gcc.gnu.org/bugzilla/show_bug.cgi?id=17886>.
   // The "& 63" keeps the right-shift count in range: shifting a 64-bit
   // value by 64 (which happens here for shift == 0) is undefined behaviour.
   return (x << shift) | (x >> ((sizeof(uint64_t)*8 - shift) & 63));
}
Now the holding can be easily calculated with:

Code: Select all

if (variant::isZhouse(variant))
{
   material::Count white = board.holding(color::White);

   if (white.pawn)   key ^= rotate_left(RandomHolding[1], white.pawn   - 1);
   if (white.knight) key ^= rotate_left(RandomHolding[3], white.knight - 1);
   if (white.bishop) key ^= rotate_left(RandomHolding[5], white.bishop - 1);
   if (white.rook)   key ^= rotate_left(RandomHolding[7], white.rook   - 1);
   if (white.queen)  key ^= rotate_left(RandomHolding[9], white.queen  - 1);

   material::Count black = board.holding(color::Black);

   if (black.pawn)   key ^= rotate_left(RandomHolding[0], black.pawn   - 1);
   if (black.knight) key ^= rotate_left(RandomHolding[2], black.knight - 1);
   if (black.bishop) key ^= rotate_left(RandomHolding[4], black.bishop - 1);
   if (black.rook)   key ^= rotate_left(RandomHolding[6], black.rook   - 1);
   if (black.queen)  key ^= rotate_left(RandomHolding[8], black.queen  - 1);
}
This method solves the problem of encoding the piece counts in the holding, in the usual way a Zobrist key is generated.

The only critical thing is that no linear combination of this computation should collide with the existing Zobrist numbers; but this has been checked (see https://sourceforge.net/p/scidb/code/HE ... t_book.cpp, function checkHashKeys()).
Unlike with addition, I have actually had quite poor results when testing schemes that generated XOR keys by rotating a smaller number of keys by only a single bit. Rotating them by 8 bits was OK, but making the keys by rotating the same base key in steps of one bit gave an enormous increase in the number of collisions (orders of magnitude). So your method seems suspect. Have you studied its effect on the collision rate, and if so, how exactly? That you don't collide with any single key from the board table doesn't at all exclude that you have 3-key or 4-key dependencies.
Also, the move encoding for dropped pieces is too complicated in XBoard (and a kludge, according to the author's own comments). In Scidb the from-field is set equal to the to-field, and the dropped piece is encoded as a promoted piece. Decoding is very easy, because a move whose from-field equals its to-field must be a drop.
This seems to put an unnecessarily heavy restriction on the number of pieces that can be dropped. Even on an 8x8 board there are only 4 bits left to indicate piece type in the 16-bit move field. E.g. Wa Shogi has an 11x11 board, which would only leave room for 4 piece types, while Wa has many more (and is usually played with drops). The XBoard method has no problems, however: by setting one of the 4 available piece codes aside to indicate drops, you could handle as many piece types as there are board squares, by encoding the type in the from-square. So I consider the method you propose inferior. (Despite the fact that it uses only codes that otherwise would be unused.)
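
A sketch of that scheme, purely to illustrate the idea (the exact bit layout and reserved code below are illustrative, not XBoard's actual values): one promotion-field value flags a drop, and the otherwise unused from-square field then carries the piece type, so the number of droppable piece types is limited only by the number of board squares.

Code: Select all

#include <cstdint>

const unsigned DROP_FLAG = 7;   // illustrative reserved promotion code

inline uint16_t encodeDropMove(unsigned toSquare, unsigned pieceType)
{
    // the piece type rides in the from-square bits, which a drop does not need
    return static_cast<uint16_t>((DROP_FLAG << 12) | (pieceType << 6) | toSquare);
}

inline bool     isDropMove(uint16_t m)     { return ((m >> 12) & 0x7) == DROP_FLAG; }
inline unsigned dropMovePiece(uint16_t m)  { return (m >> 6) & 0x3f; }
inline unsigned dropMoveSquare(uint16_t m) { return m & 0x3f; }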
Another issue occurs in the computation of the Zobrist key for Three-check Chess positions. The XBoard algorithm forgets to hash the number of checks given by each side. Example: a position where White has given one check in general never matches a position where White has already given two checks; the strategy in a position depends significantly on the number of checks given.
This is not so much an oversight as a result of the fact that XBoard doesn't really keep track of the number of checks at all. So the basic information to encode is not even available, which is why I never saw any need to define an encoding for it.
This is how Scidb is doing the computation:

A third extension to the Random table:

Code: Select all

static uint64_t const Random64[809] =
{
   ...
   // extension for Chess960 castle with inner rook
   ...
   // extension for holding (in the usual way: BP,WP,BN,WN,BB,WB,BR,WR,BQ,WQ)
   ...
   // extension for 3check
   U64(0x1E8860E4E8AFCFFC), // one check given to black
   U64(0x107FB5302E49B653), // two checks given to black
   U64(0x453CA2EB419DE2C5), // three checks given to black
   U64(0xA3094E8EB1649123), // one check given to white
   U64(0xE850AC9A440DD0AF), // two checks given to white
   U64(0xE6EB2D90ADE835B6), // three checks given to white
};

static uint64_t const* const RandomThreeCheck = Random64 + 803;
This is the Zobrist key computation:

Code: Select all

if (variant == variant::ThreeCheck)
{
   // indices 0-2: checks given to Black (by White); 3-5: given to White (by Black)
   if (board.checksGiven(color::White))
      key ^= RandomThreeCheck[(board.checksGiven(color::White) - 1) & 0x3];
   if (board.checksGiven(color::Black))
      key ^= RandomThreeCheck[((board.checksGiven(color::Black) - 1) & 0x3) + 3];
}
It seems to me that XBoard has a problem with AntiChess (Suicide, Giveaway). I cannot see how promotion to a King is encoded. Scidb does this by encoding the promoted King with value 5.
This is an oversight, but not a problematic one: I have already given up hope that the encoding of the promotion piece can be done in a variant-independent way. It is a very scarce resource (only 2^16 / boardSize^2 possibilities), and reserving codes for pieces that do not actually occur in the variant thus really hurts. Especially since the variants with the bigger boards also tend to have more piece types. E.g. in Grand Chess the board is 10x10, leaving room only for the encoding of 6.5 promotion types (one of them needed for non-promotion), while there are 6 piece types to which you could promote. The Polyglot format really does a bad job in this respect. But the excuse of course is that promotions are not very common in the opening.

XBoard currently uses code 5 for Archbishop (or rather, promotion character 'a', whatever that means), but as far as I am concerned there is no reason that it could not be used for the king in Anti-Chess. But of course that would give a problem for those who would want to play Anti-Capablanca and make books for that. So it would be more logical to use code 7, which is currently only used in Shogi to represent the = suffix for deferral. (Which actually would not really need encoding at all, as it is just a non-promotion, and could thus get code 0.) With Capablanca's 10x8 board there is room for 10 promotion codes, enough to have non-promotion, 6 piece types (incl. A and C), one King and one code for drops.

If it was really up to me, however, I would encode promotions similarly to the way XBoard now encodes drops: use what is now the promotion-piece code to encode the move step (only 3 possibilities for FIDE Pawns: straight or (2x) diagonally ahead), so that together with the from-square it implies the to-square, and the latter can be used to indicate the piece type. In fact that seems like overdoing it. The best encoding would probably be to leave a single bit to indicate 'special move' (like drop, promotion, castling, multi-leg), so that 15 bits can be used to indicate from- and to-square, i.e. up to 179 squares per board. The encoding when the 'special' bit is set can then be variant-dependent, to optimally satisfy the needs of the variant. E.g. in Chu Shogi it could simply indicate promotion on any move of a non-promoted piece type, and encode in the to-square the 64 different two-leg move steps that are possible for already promoted pieces. For Chess-like promotion choice the to-square code could encode both move step (3 possibilities) and promotion piece (21 possibilities left in Chess). It still would not solve the problem for Big Chess (16x16 board), though...
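
To make the last idea a bit more tangible, a sketch of such a single-'special'-bit layout (purely hypothetical; no GUI or book format actually uses this):

Code: Select all

#include <cstdint>

const uint16_t SPECIAL = 1u << 15;      // bit 15 flags a 'special' move

// Normal move: from * nSquares + to packed into the low 15 bits
// (requires nSquares^2 <= 2^15).
inline uint16_t packNormal(unsigned from, unsigned to, unsigned nSquares)
{
    return static_cast<uint16_t>(from * nSquares + to);
}

// Special move: the 15-bit payload is interpreted per variant
// (drop, promotion, castling, multi-leg move, ...).
inline uint16_t packSpecial(uint16_t payload)
{
    return static_cast<uint16_t>(SPECIAL | (payload & 0x7fff));
}

inline bool isSpecial(uint16_t m) { return (m & SPECIAL) != 0; }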
User avatar
gcramer
Posts: 40
Joined: Mon Oct 28, 2013 11:21 pm
Location: Bad Homburg, Germany

Re: Polyglot keys for chess variants.

Post by gcramer »

First a note about the Polyglot chess book. The standard is defined for 8x8 chess with normal pieces, and this standard shouldn't be affected by any extension. This means that bigger board sizes and/or fairy pieces require a separate (that is, non-interfering) hash key computation. (In my opinion a 128-bit hash key is needed once bigger board sizes are handled.) Scidb's hash key computation for chess variants does not affect the computation for standard chess at all.

Furthermore, a standard shouldn't be influenced by any technical issue; nobody will accept such a standard.
As there are engines (such as Fairy-Max or Pulsar) that can dynamically switch variant, and there currently is no mechanism in WinBoard that would automatically switch book file with that,
This is a technical issue, and must be solved in a technical way.
I wouldn't call adding (or XORing) a constant to a key 'complicated'
The start key is unnecessary, hence complicated. (I'm speaking about Crazyhouse, Antichess, and so on, on 8x8 boards; I'm not speaking about other board sizes or fairy chess.) Technical issues cannot influence a standard.
This seems complete nonsense to me. Most of my engines use additive keys...
The Zobrist key computation uses XORing, because XORing has unpredictable results; this is the principle of chaos. The additive operation in a non-prime seeded modulo ring (2 isn't prime) results in general in relatively short cycles; this is not chaos. Using the additive operation seems to be mathematically improper.
and I have never seen any problems with this as far as collision rates are concerned.
Ok, but nevertheless this is not the (usual) way the key is computed.
Unlike with addition, I have actually had quite poor results when testing schemes that generated XOR keys by rotating a smaller number of keys by only a single bit.
If this is the case, then of course it's better to use predefined keys. But my test with 1,291,522 Crazyhouse games, generating 57,490,796 different positions, hasn't detected any collision with Scidb's method.

Crazyhouse/Bughouse
> Also, the move encoding for dropped pieces is too complicated in XBoard (and a kludge, according to the author's own comments). ...

This seems to put an unnecessarily heavy restriction on the number of pieces that can be dropped.
The computation for normal chess pieces shouldn't be influenced by fairy chess computations; I think the latter will be done with a specialized computation for fairy chess.

Another issue with Crazyhouse/Bughouse, which is overlooked in XBoard, and which I had forgotten in Scidb's first implementation: it is unavoidable to hash promoted pieces. If a promoted piece is captured, for example a promoted Queen, not a Queen but a Pawn is added to the holding. So capturing a promoted Queen is not the same as capturing a regular Queen. This must be taken into account, because subsequent moves may depend on this fact.

This is Scidb's key computation for the board pieces, using generated keys for promoted pieces:

Code: Select all

uint64_t key    = 0;
uint64_t pieces = board.pieces();

while (pieces)
{
   Square      s = board::lsbClear(pieces);
   piece::Type p = board.piece(s);
   color::ID   c = board.color(s);

   uint64_t k = RandomPiece[::mul64(mstl::mul2(6 - p) + (c ^ 1)) + s];

   if (board.isPromotedPiece(s)) // can only happen in Zhouse
      k = ::mstl::bf::rotate_left(k, 32);

   key ^= k;
}
Three-check Chess
This is not so much an oversight as a result of the fact that XBoard doesn't really keep track of the number of checks at all. So the basic information to encode is not even available, which is why I never saw any need to define an encoding for it.
The encoding of the number of checks is a must; a position cannot be equal if the number of checks is not equal in this variant. It's like the castling rights: a position cannot be equal if the castling rights are not equal, but in the case of Three-check Chess the number of checks is even more important than the castling rights.

A simple example: 1.e4 e5 2.Nc3 Nf6 3.Bc4 Nc6 4.Bxf7+ Kxf7 5.Qh5+ Nxh5 6.Nf3 Ke8 7.Nd5 and White will win with a check on the next move.

Here is the FEN after 4...Kxf7:

Code: Select all

r1bq1b1r/pppp1kpp/2n2n2/4p3/4P3/2N5/PPPP1PPP/R1BQK1NR w KQ - 0 5 +1+0
+1+0 means: White has given one check to Black; Black has not given any check. Now consider the FEN

Code: Select all

r1bq1b1r/pppp1kpp/2n2n2/4p3/4P3/2N5/PPPP1PPP/R1BQK1NR w KQ - 0 5 +0+0
Neither side has given a check. In this case 5.Qh5+ will lose, and not win as in the game. 5.Qh5+ is a move in an early stage of the game; a computation that does not regard the number of checks may produce useless books.

Antichess
This is an oversight, but not a problematic one: I have already given up hope that the encoding of the promotion piece can be done in a variant-independent way.
I agree, a variant-independent way is too problematic. The value 5 for the king is the canonical value in Antichess.
User avatar
hgm
Posts: 28461
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Polyglot keys for chess variants.

Post by hgm »

gcramer wrote:First a note about the Polyglot chess book. The standard is defined for 8x8 chess with normal pieces, and this standard shouldn't be affected by any extension. This means that bigger board sizes and/or fairy pieces require a separate (that is, non-interfering) hash key computation. (In my opinion a 128-bit hash key is needed once bigger board sizes are handled.) Scidb's hash key computation for chess variants does not affect the computation for standard chess at all.
Neither is the current XBoard implementation affecting it...
Furthermore, a standard shouldn't be influenced by any technical issue; nobody will accept such a standard.
Not sure what you are trying to say here. Is it that when a standard is perfect, except for the mere technical issue that it cannot be implemented due to unrealistic demands on available hardware, everyone would prefer to (not) use that standard, rather than adopt one that is realistic?
This is a technical issue, and must be solved in a technical way.
Well, adding a variant key to the position hash seems a nice technical solution to me...
The start key is unnecessary, hence complicated. (I'm speaking about Crazyhouse, Antichess, and so on, on 8x8 boards; I'm not speaking about other board sizes or fairy chess.) Technical issues cannot influence a standard.
Even the variants you mention would not be able to use each other's books, as the same positions would totally differ in which moves are acceptable from them. For multi-variant books the variant key is necessary, and especially so for positions that only contain orthodox pieces on 8x8, as these are the positions that can be confused otherwise. (For Xiangqi, for instance, there never is any collision, as it uses another piece type for the always-present King.)

In my philosophy, standards are there for no other reason than to solve technical problems, by ensuring that everybody does things the same way when arbitrary choices have to be made. Meaning that a standard had better do a good job at solving them to be accepted by a meaningful number of people. I would never accept a standard that prevents me from doing things that would technically be possible by violating that standard.

Specifying the variant in a header is part of the standard just as much as specifying the start key, so you seem to have no point. In any case, I want a standard that allows multi-variant books for XBoard, because I want to make such books. This is non-negotiable. If you want to define your own, inferior standard for scidb, that is up to you. I won't ever use such a standard in XBoard, that is for sure. So discussing this serves no purpose.
The Zobrist key computation uses XORing, because XORing has unpredictable results; this is the principle of chaos. The additive operation in a non-prime seeded modulo ring (2 isn't prime) results in general in relatively short cycles; this is not chaos. Using the additive operation seems to be mathematically improper.
You think 2 isn't prime? I guess then we are also done discussing math...
Ok, but nevertheless this is not the (usual) way the key is computed.
Well, it is for me... But it seems pretty irrelevant what is usual, and what not. I only care about quality. Even if the whole world does something in an inferior way, it wouldn't be a reason for me to slavishly copy their mistakes. 'Usual' carries zero weight with me.
If this is the case, then of course it's better to use predefined keys. But my test with 1,291,522 Crazyhouse games, generating 57,490,796 different positions, hasn't detected any collision with Scidb's method.
Well, with so few positions even a lousy 64-bit key wouldn't give any collisions, of course. 57M ~ 2^26, so you have 2^51 pairs. With an optimal 64-bit key the chance for a collision is thus only 1/2^13. The collision rate could be a thousand times higher than for a good key, and you still would not see it.
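
Spelled out, that is the usual birthday estimate: the expected number of colliding pairs is about

C(N,2) / 2^64 ~ 2^51 / 2^64 = 2^-13 ~ 0.0001   (with N ~ 57M ~ 2^26),

so a clean run over that many positions says very little about key quality.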

Anyway, I only said the shifting method was suspect, based on actual measurements of collision rates (rather than just observing that there were no collisions under conditions where you would not expect any no matter what). In fact the addition and the rotation methods for holdings keys are barely different, as multiplying a key by 2 or 4 is equivalent to shifting it by one or two bits (which again is the same as rotating when the most-significant bits happen to be zero). So only the case of 3 identical pieces in hand would be different. (And for Pawns when you have 5-7 in hand. Ever seen that in a real game?)

It seems to me you are seeing ghosts here, and that the only justification for doing it differently from XBoard is just to be incompatible.

Crazyhouse/Bughouse
The computation for normal chess pieces shouldn't be influenced by fairy chess computations; I think the latter will be done with a specialized computation for fairy chess.
Well, Crazyhouse is fairy Chess, and in fact so is the FIDE game. The only difference is that they are played by a different group of fairies.
Another issue with Crazyhouse/Bughouse, which is overlooked in XBoard, and which I had forgotten in Scidb's first implementation: it is unavoidable to hash promoted pieces. If a promoted piece is captured, for example a promoted Queen, not a Queen but a Pawn is added to the holding. So capturing a promoted Queen is not the same as capturing a regular Queen. This must be taken into account, because subsequent moves may depend on this fact.
Well, XBoard fully takes account of that, right? Because the promoted Pawns are other piece types (displayed as modified pictograms) than the usual Chess pieces. So a Pawn promoted to Queen would use a different Zobrist key from a true Queen on the same square. Namely the key for the corresponding fairy piece.
The encoding of the number of checks is a must; a position cannot be equal if the number of checks is not equal in this variant. It's like the castling rights: a position cannot be equal if the castling rights are not equal, but in the case of Three-check Chess the number of checks is even more important than the castling rights.
Yes, I understand all that. It just means that ThreeChecks isn't a 'fully supported' variant in XBoard, but only has 'partial support', and that one must rely on the engine or ICS. In this case that would interfere not only with legality checking, but also with the use of a GUI book. (At least, insofar as the book must contain positions where one of the sides has been checked, which seems about as useful to me as putting positions in an opening book where one side is a full minor ahead.) Note that before I started working on XBoard, it did not even keep track of castling rights in normal Chess. I just didn't consider ThreeChecks of sufficient importance to spend any time on its support at all.
+1+0 means; White has given one check to Black, Black has not given any check. Now consider the FEN

Code: Select all

r1bq1b1r/pppp1kpp/2n2n2/4p3/4P3/2N5/PPPP1PPP/R1BQK1NR w KQ - 0 5 +0+0
I don't like this FEN notation much; I greatly prefer that the half-move and full-move counters be the last two fields in the FEN. (Because often you are not interested in them, and then they can be easily omitted or ignored.) Where does it say that FENs for ThreeChecks should be encoded this way?

It also seems a bad idea to count the number of checks given rather than the number of checks left. It would mean that in the variant FourChecks the game-theoretically equivalent position in ThreeChecks would be encoded by a different FEN (and hence, that the meaning of a FEN cannot be deduced without knowing whether it is for ThreeChecks or for FourChecks). Counting the number of checks left would not have that problem. I also don't like the use of two + signs. One of those is redundant. A field 2+2 in the FEN would already unambiguously specify that both sides can still survive two checks (and would thus lose on the third). The idea of FENs is that they should be compact. The format you use here seems ill-thought, and would not qualify for use in XBoard.
I agree, a variant-independent way is too problematic. The value 5 for the king is the canonical value in Antichess.
Canonical? Do you have a reference for that? Or does 'canonical' here merely mean 'the value that you like best'? (Sorry that I am such a skeptic, but this discussion so far makes me fear for the worst...)

It is not too late to redefine the book format XBoard uses for shady variants like ThreeChecks or Suicide; I don't believe there already exist any XBoard books for them. And the move field (unlike the hash key) can be easily adapted in existing books anyway. But I certainly would not consider any change in the wrong direction, which would decrease the capabilities compared to the current format (such as making multi-variant books) in any way. In fact I would not even consider change just for the sake of change, without there being a compelling reason. And '5 is the canonical value' or 'this is the usual way' do not sound very compelling to me. There must be a clearly identifiable thing that can be done after the change, and could not be done before it.
User avatar
gcramer
Posts: 40
Joined: Mon Oct 28, 2013 11:21 pm
Location: Bad Homburg, Germany

Re: Polyglot keys for chess variants.

Post by gcramer »

> Furthermore, a standard shouldn't be influenced by any technical issue; nobody will accept such a standard.

Not sure what you are trying to say here.
1. You said that some chess programs cannot switch the Polyglot book. In Scidb the Polyglot book of a chess program can be switched, so I don't understand. I do not know the problem in XBoard. The constraint that the book cannot be switched is technical (implementation/application dependent); the Polyglot book standard is formal (implementation/application independent).

2. A developer of a chess application for Antichess is not interested in any complex computation influenced by other variants, or derived from the technical constraints of other applications. So the Antichess encoding has to be as simple as possible for Antichess, the Crazyhouse/Bughouse encoding has to be as simple as possible for Crazyhouse/Bughouse, and so on.

3. A developer of 8x8 chess applications does not like to handle the complicated formats of bigger boards. Bigger boards have to go their own way.

4. A developer of standard chess applications (K, Q, R, B, N, P) does not like to handle complex formats influenced by fairy pieces.

5. Any application has to handle its own technical things, Scidb too. In fact the whole Polyglot standard is not sufficient for Scidb, so I'm adding extensions; but Polyglot is very common, so it does not seem appropriate for me to use an incompatible book standard. I'm adding my extensions without becoming incompatible.
I want a standard that allows multi-variant books for XBoard, because I want to make such books.
Of course a general standard shouldn't cause handicaps, but a standard should be as simple as possible for everyone.

A header enumerating all used variants would be useful; in this way any application can recognize the capabilities of the book. This would also be a good place to integrate a start key for each variant.

1. This is an easy way for applications that never use mixed books.

2. The generator of the book knows the start keys, so this application can ignore the header of its own books; in other words, it's an easy way even for applications with complex books.

3. The books are interchangeable; some protocol for interchangeability is unavoidable.
But it seems pretty irrelevant what is usual, and what not.
Of course. So I have to be clearer:
Well, with so few positions even a lousy 64-bit key wouldn't give any collisions, of course. 57M ~ 2^26, so you have 2^51 pairs. With an optimal 64-bit key the chance for a collision is thus only 1/2^13. The collision rate could be a thousand times higher than for a good key, and you still would not see it.
Yes, it's impossible to show correctness in this way; what I did was show that in practice it causes no problems. But I will change the computation to predefined keys; in this way any doubt should be removed (although I cannot understand the argument: even with shifted values the result of the XOR is unpredictable; probably the dispersion is not so good?).
Most of my engines use additive keys, and I have never seen any problems with this as far as collision rates are concerned.
Yes, you can show that addition can be used in practice, but it's also impossible to show correctness in this way. What I know about modulo rings does not speak in favour of this method, so I cannot understand it. I can understand the usual, well-tested XOR, and I don't like to use methods which I cannot understand. XORing is easy and elegant, well proven in practice, and well accepted. By the way: Adler-32 uses addition and modulo, and it's known that this arithmetic generates many collisions; the Adler-32 checksum computation isn't popular anymore, and nowadays CRC32 or other methods are used.
You think 2 isn't prime?
Oops, that's funny; obviously I meant 2^64, i.e. Z/(2^64)Z.
Well, XBoard fully takes account of that, right? Because the promoted Pawns are other piece types (displayed as modified pictograms) than the usual Chess pieces.
Ok, I couldn't see this in your implementation, it isn't obvious; then it's not an issue. This means that you are using a computed key in a similar way.
I don't like this FEN notation much; I greatly prefer that the half-move and full-move counters be the last two fields in the FEN.
The advantage of this FEN is that it fits the standard FEN, except for the additional part. I tried another format before:

RNBQKBNR/.../rnbqkbnr/2x2 ...

but I changed this. The problem with this kind of FEN is that other applications might have big problems with an additional field inside the standard fields. A trailing additional field will normally just be skipped.
It also seems a bad idea to count the number of checks given rather than the number of checks left.
It's not a bad idea, but your idea has a big advantage.
Canonical? Do you have a reference for that?
Yes, http://hardy.uhasselt.be/Toga/book_format.html: None=0, N=1, B=2, R=3, Q=4, and so K=5 is canonical (for Antichess). All enumerations in the current standard are consecutive, and this is what I mean: an Antichess developer is not interested in special conditions caused by other variants; a value of 7 would make a linear mapping of the pieces from his own values to the standard values unnecessarily complicated for him.
It seems to me you are seeing ghosts here, and that the only justification for doing it differently from XBoard is just to be incompatible.
XBoard's current computation is good for XBoard, but not a model for a common way. And I don't know yet whether a model for a common way will ever exist.

By the way: in fact I'm not interested in defining a standard. Scidb is fully independent of any complex standard; even the ChessBase de facto standard doesn't concern me. If the world decides that XBoard's computation is the standard, then that's OK for me. In Scidb I do what I like - as you do in XBoard - I have been developing software for more than 30 years, I know what I'm doing, and I do all these things just for fun. But it may happen that other developers will decide that Scidb defines the standard, so I will give them a chance to influence my development, because other applications aren't a matter of indifference to me. But statements like "seems complete nonsense...then we are also done discussing math...seems ill-thought...seeing ghosts here..." are absolutely useless.
User avatar
hgm
Posts: 28461
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Polyglot keys for chess variants.

Post by hgm »

Well, when I say you are seeing ghosts it is not meant as an ad hominem. It just means that I think the problem you quote as a reason for doing things in a certain way is totally invalid / non-existent. For instance the modulo problem:

If Z is a Zobrist key that is odd, N*Z modulo 2^64 cannot be zero unless N is a multiple of 2^64, right? So you have to try 2^64 values, which happens to be all possible key values. There are no short dependency loops. They would only be there if you used Zobrist keys that contain many factors of two. And even if they contained 58 factors of 2, you would still have to multiply by 64 before you hit zero (or the same value you started with). Rotating the keys, OTOH, is guaranteed to reproduce the same value after 64 steps. So in any case the rotation method could not be used in a game where you could have 64 pieces of the same type in hand. Predefined tabulated keys are likely to make this even worse. Addition is the most flexible method.
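
In formula form: an odd Z has gcd(Z, 2^64) = 1, so Z is a unit in Z/(2^64)Z and

N*Z = 0 (mod 2^64)  <=>  N = 0 (mod 2^64);

more generally, for Z = 2^k * m with m odd, the smallest positive N with N*Z = 0 (mod 2^64) is 2^(64-k), which is the '58 factors of 2, multiply by 64' case above (k = 58 gives 2^6 = 64).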

I also don't understand why keys generated by rotations performed so poorly. It was just an empirical fact. I tried many different methods for constructing keys, and measured their collision rates. (Because in some games the number of required keys is so large that tabulating them is not practical in an engine, where performance is an issue. E.g. in Chu Shogi there are 72 colored piece types, and 144 board squares, which with 64-bit keys would already grossly overflow the L1 cache. So run-time generation of keys, e.g. by rotating a piece-type key by the square number, would give an enormous speedup over tabulating all the rotated values.)

Using tabulated keys was an awful idea to start with (but alas, we are now stuck with it). Life would have been infinitely easier if the keys were simply defined as the output of some simple but known PRNG. Generating the bytes of the key by taking the highest eight bits of the powers of some magic constant would have been sufficient for this application, and the constant could have been chosen such that it would have a cycle of a billion. But there is no reason to repeat past mistakes by defining more and more arbitrary key values. I strongly plead that any keys added to the original Polyglot standard be derivable by some simple formula that would in practice never be exhausted.
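
A sketch of what such formula-derived keys could look like (illustrative only; this is not part of any existing standard, and the multiplier is an arbitrary odd constant picked for the example):

Code: Select all

#include <cstdint>

const uint64_t MAGIC = 0x9E3779B97F4A7C15ULL;   // arbitrary odd multiplier

// Derive the n-th key: build its eight bytes from the highest eight bits of
// successive powers of MAGIC (modulo 2^64), as suggested above.
uint64_t derivedKey(unsigned n)
{
    uint64_t power = MAGIC;
    for (unsigned i = 0; i < 8u * n; ++i)   // advance to the n-th group of powers
        power *= MAGIC;

    uint64_t key = 0;
    for (int byte = 0; byte < 8; ++byte) {
        key = (key << 8) | (power >> 56);   // take the top eight bits
        power *= MAGIC;
    }
    return key;
}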

About the FEN: is this a realistic concern or just a hypothetical one? Which other application do you have in mind that you would like to feed ThreeChecks FENs to, even though it doesn't support ThreeChecks? If an application is liberal, it would not make a fuss if a FEN cuts off early. If it is pedantic, it would complain that there is stuff after the full-move counter anyway.

It seems to me that you are guilty here, to an even higher degree, of having a technical problem dictate an awkward standard.
User avatar
gcramer
Posts: 40
Joined: Mon Oct 28, 2013 11:21 pm
Location: Bad Homburg, Germany

Re: Polyglot keys for chess variants.

Post by gcramer »

Thanks for the mathematical explanation. Now I need some time to understand this in depth.
It seems to me that you are guilty here, to an even higher degree, of having a technical problem dictate an awkward standard.
I don't understand at all; how can I dictate anything? Everybody does what he likes. And I have indeed absolutely no interest in defining standards; that's not my problem, I don't care. And I do what I'm always doing: I have fun developing Scidb, that's all.