The move in Polyglot books is encoded in 16 bits, as
toSqr + 64*fromSqr + 64*64*promotionType
which leaves room for 16 promotionTypes, of which only the first 5 are used (none, N, B, R, Q). This can be easily generalized to larger boards with N squares, by replacing 64 by N. This goes at the expense of the number of promotionTypes: on a 10x10 board only 6 choices would be available, on 12x12 only 3 before you overflow the 16 bits. On 16x16 only a single promotionType can be encoded (presumably meaning 'no promotion'), and beyond that the system does not work at all.
Now the lack of promotionTypes is not so disastrous as it may look. After all, this format is intended to encode opening moves, and promotions occur in opening lines very rarely. You could just leave positions from which a promotion move is playable out of the book. And in Chess promotions are mandatory, so you could use the 'none' code for promotion to the default piece (presumably Q) if the official code for that is not available. So then the only un-encodable moves are under-promotions.
Of course boards larger than 16x16 are a much more serious problem. One could only allow a sub-set of the squares to be encoded, depending on who has the move (e.g. only 1st to 6th rank for white). The large majority of opening moves would not stray out of that area. But this is not a very satisfactory solution.
I now designed a very compact encoding system, for use as an alternative for large boards (e.g. > 144 squares). It is based on mentioning the piece number in stead of the from-square number (counted the same way, namely in board-scan order), and indicating the move by encoding the ray direction and distance along the ray, rather than mentioning the to-square. For the more common oblique (= off-ray) moves, such as Knight jumps, some extra distance codes beyond the board length are added. For now I think that this is only needed for N, (1,3) and (2,3) jumps, which each need two codes, one for the move that goes in the general direction of the ray, (slightly to the right of it), the other diametrically opposite to it.
After that more 'pseudo-distances' can follow to indicate special moves (i.e. with side effects that are not implied by just the from- and to-square), with a variant-dependent meaning. E.g. in Chess-like variants this could be promotion moves (implying one step forward along the mentioned ray) to various piece types. In large Shogi variants they could specify various kinds of double-moves, which start off along the ray, and then contiue with a step specified by the code. I wanted to reserve 24 codes for this, so that the total number of distance code is maximum board dimension + 29.
For Shogi variants in general all pieces can optionally promote, so it must be possible to tag promotion on any move, not just on steps of one. So there the entire range of distance code is doubled, the second series indicating the move went accompanied by promotion.
Some pieces can do moves not caught by this system, or indeed move anywhere. Drop moves also fall in that category of 'universal leaps'. In this case there is no escaping to mention the full to-square number. For this reason the lowest 20*boardSize codes of the system are reserved for indicating these universal leaps, the (pieceNr, ray, distanceCode) system starting only after that. The 20 universal slots are assigned in a variant-dependent way to either piece types (for dropping them), or to pieces from a certain sub-set of types that are on the board (in scan order).
Polyglot-format books: large boards
Moderator: Ras
-
hgm
- Posts: 28502
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
-
stegemma
- Posts: 859
- Joined: Mon Aug 10, 2009 10:05 pm
- Location: Italy
- Full name: Stefano Gemma
Re: Polyglot-format books: large boards
If you use piece index why not using move index too? If you just agree on numbering moves of a piece from bottom to up and from left to right (for sample) you can save any from-to pair as piece-move pair then you can have free bits to store promotion. For chess 8x8, this would lead to 4+5 bits for moves and 3 bits for promotion. Still, you can index promotions too, just adding them to move list, right after standard moves to the same squares. That way, you don't need dedicated bits for promotions.
Sample for a white pawn in e7 that can capture something in d8 with e8 f8 free:
index:move
0:d8Q 1:d8R 2:d8B... 6:e8Q 7:e8R...
If a piece can move with/without promotion, just add no-promotion moves first. Of course you can also add all the promotions moves to the end of the list.
Sample for a white pawn in e7 that can capture something in d8 with e8 f8 free:
index:move
0:d8Q 1:d8R 2:d8B... 6:e8Q 7:e8R...
If a piece can move with/without promotion, just add no-promotion moves first. Of course you can also add all the promotions moves to the end of the list.
-
hgm
- Posts: 28502
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Polyglot-format books: large boards
Because XBoard would not always be aware of the game rules. In the (pieceIndex, direction, distance) system the only information it needs is which side the pieces are on. I agree that adding variant-dependent special moves sort of weakens that case. But if the moves are Pawn promotions, a default assignment of the codes could be made from the knowledge of which piece types participate (and by which ID letter these are known). Which is information XBoard would have been configured to know even when it does not know how the various piece types can move. This same order (but now including Pawn) could be used for assigning the 20 'drop slots' to piece types.stegemma wrote:If you use piece index why not using move index too?
The really tricky case is when there are 'Universal Leapers' on the board that would need their moves to be encoded with the aid of the drop slots. The encoding/decoding logic would need to know which piece types need that. But perhaps they could be listed in the book header. The possible promotion choices could be listed there as well.
Perhaps the meaning of special moves other than promotion (the 24 pseudoDistance following those encoding obique leaps) can also be specified in the book header, to make the book completely self-contained. The way I want to use them is for Shogi pieces that make 2 or 3 king moves per turn. Such a multi-step move could be written as (0,1)-(1,1), interpreted as belonging to the vertical ray, moving one straight forward (capturing what was there) and then forward-right. This would imply the meaning for the other rays, e.g. on the horizontal ray it would mean (1,0)-(1,-1) (rotated 90-degree right), for the diagonal ray (1,1)-(1,0) ('rotated' 45 degrees), etc. The book header would then just have to contain a list of such strings in the header. Of course you could also define single-step moves that would not be in the standard system in this way, e.g. (2,5), implying (5,-2) for the horizontal ray, (5,2) for the diagonal and (2,-5) for the anti-diagonal.
[edit] Perhaps it would be clearer, compacter and more intuitive to just specify the squares the move visits starting from, say, f5. like f6g7 for the mentioned two-step move. The Chu-Shogi Lion double-moves would then be summarized in a header
f6f7
f6g7
f6g6
f6g5
f6f5
f6e5
f6e6
f6e7
f4f3
f4e3
f4e4
f4e5
f4f5
f4g5
f4g4
f4g3
If you just agree on numbering moves of a piece from bottom to up and from left to right (for sample) you can save any from-to pair as piece-move pair then you can have free bits to store promotion. For chess 8x8, this would lead to 4+5 bits for moves and 3 bits for promotion. Still, you can index promotions too, just adding them to move list, right after standard moves to the same squares. That way, you don't need dedicated bits for promotions.
Sample for a white pawn in e7 that can capture something in d8 with e8 f8 free:
index:move
0:d8Q 1:d8R 2:d8B... 6:e8Q 7:e8R...
If a piece can move with/without promotion, just add no-promotion moves first. Of course you can also add all the promotions moves to the end of the list.
-
Michel
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Polyglot-format books: large boards
Perhaps it is time to introduce a header for polyglot books? The format I was proposingPerhaps the meaning of special moves other than promotion (the 24 pseudoDistance following those encoding obique leaps) can also be specified in the book header, to make the book completely self-contained.
http://hardy.uhasselt.be/Toga/pgheader- ... eader.html
is probably over engineered. But that does not mean a header is a bad idea.
Ideas=science. Simplification=engineering.
Without ideas there is nothing to simplify.
Without ideas there is nothing to simplify.
-
stegemma
- Posts: 859
- Joined: Mon Aug 10, 2009 10:05 pm
- Location: Italy
- Full name: Stefano Gemma
Re: Polyglot-format books: large boards
Another possible solution, that is not retro-compatible, could be to store in the header all the possible moves that appears in the book and then just use the index to select the one that you need in any path of the book tree. It is a kind of compression, For sample, the header can contains:
0:Ke2 1:Ke3 2:Ke4 3:Ke5... 47:Pd2 48:Pd3 49:Pd4... 101:Pe8Q...
and the book itself:
[hash]1 [hash]4 [hash]35 [hash]47
The possible moves could be a lot but for a big book the compression achived could give you still a smaller file.
This method is compatible with any game, because you are not limited in the size of the moves in the header and they haven't to be of fixed length.
0:Ke2 1:Ke3 2:Ke4 3:Ke5... 47:Pd2 48:Pd3 49:Pd4... 101:Pe8Q...
and the book itself:
[hash]1 [hash]4 [hash]35 [hash]47
The possible moves could be a lot but for a big book the compression achived could give you still a smaller file.
This method is compatible with any game, because you are not limited in the size of the moves in the header and they haven't to be of fixed length.
-
hgm
- Posts: 28502
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Polyglot-format books: large boards
Indeed. The header idea was good, but so far I never implemented it in XBoard, because I had no application for the information that would be in the header.Michel wrote:Perhaps it is time to introduce a header for polyglot books? The format I was proposing
http://hardy.uhasselt.be/Toga/pgheader- ... eader.html
is probably over engineered. But that does not mean a header is a bad idea.
I can implement support in the form of low-level editing of the header text. Currently XBoard's Edit Book feature takes all entries with the key of the current board position, and displays them interpreted as moves. And then writes them back after converting the text in the window to a list of moves + weights. The only thing that has to change is that for key=0 it skips the conversion, and just copies the text in the window directly from / to the non-key fields. Key 0 can be generated from an empty board. So to inspect or edit the header you would then just clear the board, and open the Edit Book dialog. Then you could put any text you want there.
I should define a set of keywords that can be used in the header to define the parameters of the new move-encoding format, such as number of drop moves and 'universal leaps' and to which piece types these apply, meaning and number of the special-move codes etc.
-
hgm
- Posts: 28502
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Polyglot-format books: large boards
Well, if you would encode SAN-type moves (i.e. piece + toSquare) this doesn't give much savings, as usually any piece can go anywhere, depending on where it is currently located. So the number of codes needed would be numberOfPieces x boardSize. When you encode the move step, rather than the absolute to-square, however, the savings can be huge, as a Knight can only make 8 steps, even on a 25x25 board.stegemma wrote:Another possible solution, that is not retro-compatible, could be to store in the header all the possible moves that appears in the book and then just use the index to select the one that you need in any path of the book tree. It is a kind of compression, For sample, the header can contains:
0:Ke2 1:Ke3 2:Ke4 3:Ke5... 47:Pd2 48:Pd3 49:Pd4... 101:Pe8Q...
and the book itself:
[hash]1 [hash]4 [hash]35 [hash]47
The possible moves could be a lot but for a big book the compression achived could give you still a smaller file.
This method is compatible with any game, because you are not limited in the size of the moves in the header and they haven't to be of fixed length.
So the savings compared to the (pieceIndex, boardStep) system can be substantial, as the latter makes it possible to encode any board step in combination with any piece. It could be a bit tedious to list every possible step of every piece type, though.
So perhaps a good compromise is to divide pieces in a number of classes, some of which have predefined sets of moves (such that they don't have to be explicitly listed in the book header). E.g. class 1 could be pieces that only have King moves, and each piece in that class gets 8 codes assigned. Class 2 would be 'short-range leapers', (e.g. Knight) which reach at most 2 squares far, and would get 24 codes. Class 3 could be sliders, such as Q, needing 4*(boardLength-1) codes. Class 4 would be slider + Knight jumps (e.g. Amazon), 4*(boardLength+1) codes. Class 5 could be 'Universal' (moves anywhere, such as Hook Movers or Tai Shogi Emperor), class 6 drops (similar to universal, but implying a piece type rather than a piece on the board). Perhaps there should be a separate class for Pawns too, having only 4 moves.
The book header would just have to list which piece types belong to which class, and perhaps how many pieces of that class can be maximally expected. (It would be a bit annoying if the encoding depended on the actual number of pieces present on the board, and would thus be different for any position. The only goal of compactness is to make the worst case fit, and once that fits, further compression is meaningless, and only creates unused space.) E.g. for Chess
Code: Select all
Pawns:8:P
Steppers:1:K
Leapers:3:N
Sliders:6:BRQ
After that the moves not captured by the default classes could be explicitly listed. E.g. by mentioning all piece types that could make them, and then the move (assumed to be translation invariant). E.g. if the variant involved a (1,3)-leaper (Camel, C), and a Camel/Rook compound (T), the header would need the extra lines
Code: Select all
Special:
e4f7:4:TC
e4d7:4:TC
e4h5:4:TC
e4h3:4:TC
e4f1:4:TC
e4d1:4:TC
e4b3:4:TC
e4b5:4:TC
Promotions would have to be specified in the special-move section:
Code: Select all
Special:
e7e8q:8:P
e7f8q:8:P
e7d8q:8:P
e7e8n:8:P
e7f8n:8:P
e7d8n:8:P
Perhaps a special character prefixed to a move could be used to indicate the move is not translation invariant, but absolute. Like
Code: Select all
Special:
!e1b1,a1d1:1:K
!e8b1,a8d8:1:K
-
stegemma
- Posts: 859
- Joined: Mon Aug 10, 2009 10:05 pm
- Location: Italy
- Full name: Stefano Gemma
Re: Polyglot-format books: large boards
Of course you have to store moves for any type (class) of piece, not for any single piece. Because you can store moves in the header, it is less important to save bytes, if this saving would raise complexity. Because not all the moves are possible or occurs in the book, the saving could be very big. For sample, in standard chess game have a few rook or queens moves, in opening, and almost only castling for the kings. A move like Ke4-e5 would never appears so this move is not in the header.hgm wrote:[...]
Well, if you would encode SAN-type moves (i.e. piece + toSquare) this doesn't give much savings, as usually any piece can go anywhere, depending on where it is currently located. So the number of codes needed would be numberOfPieces x boardSize.[...]
From the book creator software, building the header will be very easy, the same if have to rebuild that header from an existing book.
Of course, this will be something else than actual polyglot, maybe, but handling such a book format is nothing more than use existing moving routine from just "read the move" to "read the index then the move".
-
hgm
- Posts: 28502
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Polyglot-format books: large boards
Ah, OK, you want to only reserve codes for moves that are actually in the book. I guess that for openings this could indeed have advantages, as pieces always start in the same place, and have just a few natural paths for their development. (If it is not a shuffle game, that is.)