FEN compression

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

FEN compression

Post by lucasart »

I was looking at the NNUE training code for compressing FEN, which uses Huffman encoding for pieces. See PackedSfen in https://github.com/glinscott/nnue-pytor ... _formats.h

But I think we can do better than this, using pdep (for compression) and pext (for decompression).

Compression algorithm

Code: Select all

itemname                   | bits           | definition
--------------------------------------------|-----------
occupied                   | 64             |                  
pdep(pawns, occupied)      | cnt(occupied)  | 
pdep(knights, remaining)   | cnt(remaining) | remaining = occupied & ~pawns
pdep(bishops, remaining)   | cnt(remaining) | remaining &= ~bishops
pdep(rooks, remaining)     | cnt(remaining) | remaining &= ~rooks        
pdep(queens, remaining)    | cnt(remaining) | remaining &= ~queens
kings = remaining          | 0              |
turn                       | 1              |
ep                         | 1              |
pdep(epSquare, candidates) | cnt(candidates)| candidates = ((white & pawns & rank4) << 8) | ((black & pawns & rank6) >> 8)
pdep(castleRooks, rooks)   | cnt(rooks)     | castleRooks = rooks with associated castling rights (any color)
rule50                     | 7              |
fullMove                   | 10             | useless: conveys no position information
Note the use of castleRooks which handles Chess960 nicely (and Chess as a particular case of Chess960).

It's difficult to put upper bounds on these cnt(). And those will tend to be way higher than actual averages on real samples. Only implementing it and compressing large random FEN collections will tell the practical compression ratio.

Decompression algorithm
Left as an exercise to the reader :wink:

Is this useful? Absolutely not. But it's fun!
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: FEN compression

Post by mar »

I don't know about PDEP/PEXT but from the docs it seems you actually want pext for compression and pdep for decompression?
Nice idea - the point is the trainer compresses FENs this way in memory I suppose.
Where do you handle (piece) color? I guess you could easily encode say black pieces in a similar way of what you already do.

Your PEXT/PDEP should be faster, because decompressing Huffman bit by bit is not the best idea ever; even though most data will be empty squares.
One could probably do better by using a simple LUT, say grab byte, lookup, store symbols, return excess bits to accumulator, repeat.
Maybe even multiple symbols at once might work, but would it be worth the effort and the extra complexity? probably not, assuming it's not really about performance.
Martin Sedlak
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: FEN compression

Post by AndrewGrant »

Not nearly as good as yours, although trivial to decompress and matches my internal use during training. Load times are pretty fast, to the point where speeding it up is a non-factor for me, but I imagine your scheme would be faster, trading off disk reads for decompression.

Code: Select all

Occupancy     : 8-bytes
Evaluation    : 2-bytes
Turn          : 1-bytes
White King Sq : 1-bytes
Black King Sq : 1-bytes
Piece Count   : 1-bytes
Packed Pieces : 1-bytes per two pieces (16 max, 12 average)

= Average of 26-bytes, max of 30-bytes.

Code: Select all

    while (fgets(line, 256, fin) != NULL) {

        boardFromFEN(board, line, 0);

        char *tail   = strstr(line, "] ");
        int16_t eval = atoi(tail + strlen("] "));
        int8_t turn  = board->turn;

        uint64_t white  = board->colours[WHITE];
        uint64_t black  = board->colours[BLACK];
        uint64_t pieces = white | black;

        uint8_t count = popcount(pieces);
        uint8_t wksq  = getlsb(white & board->pieces[KING]);
        uint8_t bksq  = getlsb(black & board->pieces[KING]);

        uint8_t types[32] = {0};
        uint8_t packed[16] = {0};

        fwrite(&pieces, sizeof(uint64_t), 1, fout);
        fwrite(&eval,   sizeof(int16_t ), 1, fout);
        fwrite(&turn,   sizeof(uint8_t ), 1, fout);
        fwrite(&wksq,   sizeof(uint8_t ), 1, fout);
        fwrite(&bksq,   sizeof(uint8_t ), 1, fout);
        fwrite(&count,  sizeof(uint8_t ), 1, fout);

        for (int i = 0; pieces; i++) {
            int sq = poplsb(&pieces);
            types[i] = encode_piece(board->squares[sq]);
        }

        for (int i = 0; i < 16; i++)
            packed[i] = pack_pieces(types[i*2], types[i*2+1]);

        fwrite(packed, sizeof(uint8_t), (count + 1) / 2, fout);
    }
Guess I'll add that I have Ryzens, so PEXT/PDEP are kinda no-gos
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

mar wrote: Wed Mar 17, 2021 2:31 am I don't know about PDEP/PEXT but from the docs it seems you actually want pext for compression and pdep for decompression?
Nice idea - the point is the trainer compresses FENs this way in memory I suppose.
Where do you handle (piece) color? I guess you could easily encode say black pieces in a similar way of what you already do.
Good points. Forgot color and inverted pdep/pext. For color, we just need cnt(occupied) bits to store pext(white, occupied) (and deduce black as occupied & ~white).

Corrected version

Code: Select all

itemname                   | bits           | definition
--------------------------------------------|-----------
occupied                   | 64             |                  
pext(white,occupied)       | cnt(occupied)  | 
black = occupied & ~white  | 0              |
pext(pawns, occupied)      | cnt(occupied)  | 
pext(knights, remaining)   | cnt(remaining) | remaining = occupied & ~pawns
pext(bishops, remaining)   | cnt(remaining) | remaining &= ~bishops
pext(rooks, remaining)     | cnt(remaining) | remaining &= ~rooks        
pext(queens, remaining)    | cnt(remaining) | remaining &= ~queens
kings = remaining          | 0              |
turn                       | 1              |
ep                         | 1              |
pext(epSquare, candidates) | cnt(candidates)| candidates = ((white & pawns & rank4) << 8) | ((black & pawns & rank6) >> 8)
pext(castleRooks, rooks)   | cnt(rooks)     | castleRooks = rooks with associated castling rights (any color)
rule50                     | 7              |
fullMove                   | 10             | useless: conveys no position information
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

AndrewGrant wrote: Wed Mar 17, 2021 2:39 am Not nearly as good as yours, although trivial to decompress and matches my internal use during training. Load times are pretty fast, to the point where speeding it up is a non-factor for me, but I imagine your scheme would be faster, trading off disk reads for decompression.

Code: Select all

Occupancy     : 8-bytes
Evaluation    : 2-bytes
Turn          : 1-bytes
White King Sq : 1-bytes
Black King Sq : 1-bytes
Piece Count   : 1-bytes
Packed Pieces : 1-bytes per two pieces (16 max, 12 average)

= Average of 26-bytes, max of 30-bytes.

Code: Select all

    while (fgets(line, 256, fin) != NULL) {

        boardFromFEN(board, line, 0);

        char *tail   = strstr(line, "] ");
        int16_t eval = atoi(tail + strlen("] "));
        int8_t turn  = board->turn;

        uint64_t white  = board->colours[WHITE];
        uint64_t black  = board->colours[BLACK];
        uint64_t pieces = white | black;

        uint8_t count = popcount(pieces);
        uint8_t wksq  = getlsb(white & board->pieces[KING]);
        uint8_t bksq  = getlsb(black & board->pieces[KING]);

        uint8_t types[32] = {0};
        uint8_t packed[16] = {0};

        fwrite(&pieces, sizeof(uint64_t), 1, fout);
        fwrite(&eval,   sizeof(int16_t ), 1, fout);
        fwrite(&turn,   sizeof(uint8_t ), 1, fout);
        fwrite(&wksq,   sizeof(uint8_t ), 1, fout);
        fwrite(&bksq,   sizeof(uint8_t ), 1, fout);
        fwrite(&count,  sizeof(uint8_t ), 1, fout);

        for (int i = 0; pieces; i++) {
            int sq = poplsb(&pieces);
            types[i] = encode_piece(board->squares[sq]);
        }

        for (int i = 0; i < 16; i++)
            packed[i] = pack_pieces(types[i*2], types[i*2+1]);

        fwrite(packed, sizeof(uint8_t), (count + 1) / 2, fout);
    }
Guess I'll add that I have Ryzens, so PEXT/PDEP are kinda no-gos
This is already better, and simpler, than the one from NNUE folks. And yes, pext/pdep are a massive pain to compute in software. Given that Intel is being obsoleted by AMD, and x86 is also being obsoleted by ARM, building the future on pext/pdep is probably not such a good idea…
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Dann Corbit
Posts: 12537
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: FEN compression

Post by Dann Corbit »

I believe Uri Blass wrote a FEN compressor that squeezes a fen below 165 bits
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

Dann Corbit wrote: Wed Mar 17, 2021 5:58 am I believe Uri Blass wrote a FEN compressor that squeezes a fen below 165 bits
I ran the experience with my pext algorithm, playing 100 games (demolito vs. itself at depth=8), and computing the average compressed size for all encountered positions. I ran it twice: with and without adjudication (no adjudication is favorable, of course, as games end in long sequence of positions that contain little information with few pieces):
* no adjudication: avg = 138 bits
* resign (3 moves 700bp) + draw (8 moves 10cp): avg = 142 bits

Beat that 8-)

PS: In fact, it's even 10 bits less than stated above, once you remove the fullMove, which is totally useless (it's just an annotation, contains zero position information relevant to tuning an eval for example).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: FEN compression

Post by AndrewGrant »

lucasart wrote: Wed Mar 17, 2021 5:04 am
AndrewGrant wrote: Wed Mar 17, 2021 2:39 am

Code: Select all

Occupancy     : 8-bytes
Evaluation    : 2-bytes
Turn          : 1-bytes
White King Sq : 1-bytes
Black King Sq : 1-bytes
Piece Count   : 1-bytes
Packed Pieces : 1-bytes per two pieces (16 max, 12 average)

= Average of 26-bytes, max of 30-bytes.
This is already better, and simpler, than the one from NNUE folks. And yes, pext/pdep are a massive pain to compute in software. Given that Intel is being obsoleted by AMD, and x86 is also being obsoleted by ARM, building the future on pext/pdep is probably not such a good idea…
Really? I looked at that file but its a few thousand lines and it was not immediately clear where to look for the compression scheme.
Do you know off hand if the Stockfish trainer continually reloads the data back into memory, or if it is a one-time process?
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

Actually, it can be improved even further, because the candidate squares for en-passant are only of the opposite(turn).

Code: Select all

itemname                   | bits           | definition
--------------------------------------------|-----------
occupied                   | 64             |                  
pext(white,occupied)       | cnt(occupied)  | 
black = occupied & ~white  | 0              |
pext(pawns, occupied)      | cnt(occupied)  | 
pext(knights, remaining)   | cnt(remaining) | remaining = occupied & ~pawns
pext(bishops, remaining)   | cnt(remaining) | remaining &= ~bishops
pext(rooks, remaining)     | cnt(remaining) | remaining &= ~rooks        
pext(queens, remaining)    | cnt(remaining) | remaining &= ~queens
kings = remaining          | 0              |
turn                       | 1              |
ep                         | 1              |
pext(epSquare, candidates) | cnt(candidates)| candidates = turn == black ? ((white & pawns & rank4) >> 8) : ((black & pawns & rank6) << 8)
pext(castleRooks, rooks)   | cnt(rooks)     | castleRooks = rooks with associated castling rights (any color)
rule50                     | 7              |
fullMove                   | 10             | useless: conveys no position information
This will probably average 1 bit lower, or 124 bits with adjudication and 120 without, once you exclude the non-position part (rule50, fullMove).
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
gladius
Posts: 568
Joined: Tue Dec 12, 2006 10:10 am
Full name: Gary Linscott

Re: FEN compression

Post by gladius »

AndrewGrant wrote: Wed Mar 17, 2021 10:15 am
lucasart wrote: Wed Mar 17, 2021 5:04 am
AndrewGrant wrote: Wed Mar 17, 2021 2:39 am

Code: Select all

Occupancy     : 8-bytes
Evaluation    : 2-bytes
Turn          : 1-bytes
White King Sq : 1-bytes
Black King Sq : 1-bytes
Piece Count   : 1-bytes
Packed Pieces : 1-bytes per two pieces (16 max, 12 average)

= Average of 26-bytes, max of 30-bytes.
This is already better, and simpler, than the one from NNUE folks. And yes, pext/pdep are a massive pain to compute in software. Given that Intel is being obsoleted by AMD, and x86 is also being obsoleted by ARM, building the future on pext/pdep is probably not such a good idea…
Really? I looked at that file but its a few thousand lines and it was not immediately clear where to look for the compression scheme.
Do you know off hand if the Stockfish trainer continually reloads the data back into memory, or if it is a one-time process?
It continually loads the data from disk. Each batch is thrown away after being used.