FEN compression

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

odomobo
Posts: 96
Joined: Fri Jul 06, 2018 1:09 am
Location: Chicago, IL
Full name: Josh Odom

Re: FEN compression

Post by odomobo »

PSA that with the Ryzen 5000 series, AMD apparently fixed PDEP/PEXT

https://www.anandtech.com/show/16214/am ... x-tested/6
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: FEN compression

Post by Dann Corbit »

Specifically:

Code: Select all

Instruction | Zen 2                                   | Zen 3
PDEP/PEXT   | ~300 cycle latency, one per ~250 cycles | 3 cycle latency, one per clock
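For anyone without BMI2 hardware (or stuck on pre-Zen3 AMD, where the instructions were this slow), the semantics are easy to emulate in software. Here is an illustrative Python sketch of what PEXT/PDEP compute; the function names mirror the instructions, everything else is just for demonstration:

```python
def pext(value: int, mask: int) -> int:
    """Parallel bit extract: gather the bits of `value` selected by
    `mask` into the low-order bits of the result."""
    result, out = 0, 0
    while mask:
        low = mask & -mask          # lowest set bit of the mask
        if value & low:
            result |= 1 << out
        out += 1
        mask &= mask - 1            # clear that mask bit
    return result

def pdep(value: int, mask: int) -> int:
    """Parallel bit deposit: scatter the low-order bits of `value`
    into the positions selected by `mask` (the inverse of pext)."""
    result, idx = 0, 0
    while mask:
        low = mask & -mask
        if (value >> idx) & 1:
            result |= low
        idx += 1
        mask &= mask - 1
    return result
```

For example, `pext(0b10110010, 0b11110000)` gives `0b1011`, and `pdep(pext(v, m), m)` always recovers `v & m`, which is what makes the pair useful for packing and unpacking bitboards.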
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

Dann Corbit wrote: Thu Mar 18, 2021 2:54 am Specifically:

Code: Select all

Instruction | Zen 2                                   | Zen 3
PDEP/PEXT   | ~300 cycle latency, one per ~250 cycles | 3 cycle latency, one per clock
Definitely my next CPU when I finish frying my Intel i7.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: FEN compression

Post by Dann Corbit »

lucasart wrote: Thu Mar 18, 2021 4:31 am
Dann Corbit wrote: Thu Mar 18, 2021 2:54 am Specifically:

Code: Select all

Instruction | Zen 2                                   | Zen 3
PDEP/PEXT   | ~300 cycle latency, one per ~250 cycles | 3 cycle latency, one per clock
Definitely my next CPU when I finish frying my Intel i7.
I am going to wait one (or possibly two) more generations. Of course, I already have a 3970X, and it has some life in it yet.
But I figure there will be a big process shrink in either one or two more generations.
That might double the cores again and also boost the speed a lot at the same time.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

Update: removed useless ep bit.

Code: Select all

item name                  | bits            | definition
---------------------------|-----------------|-----------
occupied                   | 64              |
pext(white, occupied)      | cnt(occupied)   |
black = occupied & ~white  | 0               |
pext(pawns, occupied)      | cnt(occupied)   |
pext(knights, remaining)   | cnt(remaining)  | remaining = occupied & ~pawns
pext(bishops, remaining)   | cnt(remaining)  | remaining &= ~knights
pext(rooks, remaining)     | cnt(remaining)  | remaining &= ~bishops
pext(queens, remaining)    | cnt(remaining)  | remaining &= ~rooks
kings = remaining          | 0               | remaining &= ~queens
turn                       | 1               |
pext(epSquare, candidates) | cnt(candidates) | candidates = turn == black ? ((white & pawns & rank4) >> 8) : ((black & pawns & rank6) << 8)
pext(castleRooks, rooks)   | cnt(rooks)      | castleRooks = rooks with associated castling rights (any color)
rule50                     | 7               |
fullMove                   | 10              | useless: conveys no position information
I think this algorithm is provably optimal, at least when the metric is the worst case. The proof is basically obvious by just following each step of the algorithm.

Nitpick: if you assume that positions are legal, then you can reduce the cnt(candidates), in cases where you can infer that the double push would have been illegal. You can also infer turn when there is a check on the board. All this complicates code for negligible benefit in reducing avg(bits), so probably not worth it.
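As a sanity check, the bit budget of the scheme above can be summed step by step. An illustrative Python sketch (bitboards are plain integers; the field names follow the table, but none of this is Demolito's actual code):

```python
def popcount(x: int) -> int:
    return bin(x).count("1")

def encoded_bits(occupied, pawns, knights, bishops, rooks, ep_candidates):
    """Total bits used by the pext-based encoding for one position."""
    bits = 64                        # occupied bitboard, stored verbatim
    bits += popcount(occupied)       # pext(white, occupied)
    bits += popcount(occupied)       # pext(pawns, occupied)
    remaining = occupied & ~pawns
    bits += popcount(remaining)      # pext(knights, remaining)
    remaining &= ~knights
    bits += popcount(remaining)      # pext(bishops, remaining)
    remaining &= ~bishops
    bits += popcount(remaining)      # pext(rooks, remaining)
    remaining &= ~rooks
    bits += popcount(remaining)      # pext(queens, remaining); kings cost 0
    bits += 1                        # turn
    bits += popcount(ep_candidates)  # pext(epSquare, candidates)
    bits += popcount(rooks)          # pext(castleRooks, rooks)
    bits += 7 + 10                   # rule50 + fullMove
    return bits
```

From the start position (32 occupied squares, no en passant candidates) this sums to 190 bits, just under 24 bytes; positions with fewer pieces shrink accordingly.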
Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: FEN compression

Post by Sesse »

lucasart wrote: Thu Mar 18, 2021 5:36 am Nitpick: if you assume that positions are legal, then you can reduce the cnt(candidates), in cases where you can infer that the double push would have been illegal. You can also infer turn when there is a check on the board. All this complicates code for negligible benefit in reducing avg(bits), so probably not worth it.
I don't think the proof is obvious by any means; you need to prove that your bit technique is more effective than any run-length or arithmetic coding option (it obviously wouldn't be if the board were bigger or the number of pieces were fewer). Also, you need to prove that the extra information from legality (e.g. side to move cannot be in check) cannot be used in any better way, e.g. by encoding the king's position earlier in the sequence and using that to reduce the number of bits you need to encode for other pieces.
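To make that concrete: even the 64-bit occupancy field alone is not information-theoretically tight. A quick Python check of the combinatorial bound (`math.comb` is standard library; this is just an illustration of the argument, not anyone's encoder):

```python
import math

def occupancy_index_bits(k: int) -> float:
    """Bits needed to name one of the C(64, k) possible occupancies
    of k pieces on 64 squares, versus the flat 64-bit bitboard."""
    return math.log2(math.comb(64, k))
```

With 32 pieces on the board this is about 60.7 bits, already over 3 bits under the flat bitboard, and the gap widens as pieces come off (about 22.9 bits for a 5-man endgame), which is why worst-case optimality of the bitboard layout is not self-evident.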
Tord
Posts: 31
Joined: Tue Feb 27, 2018 11:29 am

Re: FEN compression

Post by Tord »

I use this one, which (if I understand his code correctly) is quite similar to Andrew's:

Code: Select all

"""
    compress(b::Board)

Compresses a board into a byte array of maximum 26 bytes.

Use the inverse function `decompress` to recover a compressed board.
"""
function compress(b::Board)::Vector{UInt8}
    io = IOBuffer(UInt8[], read = true, write = true, maxsize = 26)

    # Write occupied squares (8 bytes):
    write(io, occupiedsquares(b).val)

    # Write square contents, one nibble per occupied square:
    nibble = 0
    for s ∈ occupiedsquares(b)
        # HACK: Encode a pawn that can be captured en passant differently
        if pieceon(b, s) == PIECE_WP && s == epsquare(b) + DELTA_N
            x = UInt8(7)
        elseif pieceon(b, s) == PIECE_BP && s == epsquare(b) + DELTA_S
            x = UInt8(15)
        else
            x = UInt8(pieceon(b, s).val)
        end
        if nibble == 0
            nibble = x
        else
            write(io, UInt8((nibble << 4) | x))
            nibble = 0
        end
    end
    if nibble ≠ 0
        write(io, UInt8(nibble << 4))
    end

    # Pack side to move and castle rights into one byte:
    write(io, UInt8((b.side - 1) | (b.castlerights << 1)))

    # Write castle files (for Chess960):
    write(io, UInt8(b.castlefiles[1] | (b.castlefiles[2] << 4)))

    take!(io)
end
The maximum of 26 bytes could be pushed down to 24 by making the white and black kings separate piece types, and by making rooks that can still castle a separate piece type, but I don't think the increased ugliness is worth it for two bytes.
User avatar
lucasart
Posts: 3232
Joined: Mon May 31, 2010 1:29 pm
Full name: lucasart

Re: FEN compression

Post by lucasart »

AndrewGrant wrote: Wed Mar 17, 2021 2:39 am Not nearly as good as yours, although trivial to decompress and matches my internal use during training. Load times are pretty fast, to the point where speeding it up is a non-factor for me, but I imagine your scheme would be faster, trading off disk reads for decompression.

Code: Select all

Occupancy     : 8 bytes
Evaluation    : 2 bytes
Turn          : 1 byte
White King Sq : 1 byte
Black King Sq : 1 byte
Piece Count   : 1 byte
Packed Pieces : 1 byte per two pieces (16 max, 12 average)

= Average of 26 bytes, max of 30 bytes.
So you train on eval, not game result? I've implemented both models in Demolito for tuning the HCE (hand-crafted eval):
  • eval + tempo = alpha + beta * deepEval + error
  • result = 1 / (1 + exp(-(eval + tempo))) + error
In both cases, I minimize mean(abs(error)).
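The two error models written out directly, as an illustrative Python sketch (alpha, beta, and tempo are tuning parameters; none of this is Demolito's actual code):

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def eval_model_error(static_eval: float, tempo: float,
                     deep_eval: float, alpha: float, beta: float) -> float:
    # eval + tempo = alpha + beta * deepEval + error
    return (static_eval + tempo) - (alpha + beta * deep_eval)

def result_model_error(result: float, static_eval: float,
                       tempo: float) -> float:
    # result = 1 / (1 + exp(-(eval + tempo))) + error, result in {0, 0.5, 1}
    return result - sigmoid(static_eval + tempo)
```

Minimizing mean(abs(error)) over a set of positions then fits the free parameters, as described above.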

Neither gave me anything, because Demolito's HCE is already tuned to death, using reinforcement learning with Joona Kiiski's SPSA script https://github.com/zamar/spsa. But experimenting clearly showed that the logistic regression on game result was better.

Are you using this for tuning an HCE (Ethereal) or an NN eval (Halogen)?

Btw, I think you can save 24 bits, without complicating the code or using PEXT/PDEP:

Code: Select all

Occupancy     : 8 bytes
Evaluation    : 2 bytes
Turn          : 1 byte
Packed Pieces : 1 byte per two pieces (16 max, 12 average)
The point is that a nibble (4 bits) can encode (piece, color) for any piece, including the king (6 piece types * 2 colors = 12 values), so you don't need separate white/black king square fields. Also, you don't need to store the piece count: popcnt(occupancy) tells you how many packed pieces are in the array (for variable-length encoding, you store (popcnt(occupancy)+1)/2 bytes instead of 16).
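The nibble packing described above can be sketched like this; the piece codes 0-11 and helper names are illustrative, not Ethereal's actual format:

```python
def popcount(x):
    return bin(x).count("1")

def pack_pieces(piece_codes):
    """Pack one 4-bit (piece, color) code per occupied square, in
    square order, two codes per byte. 6 piece types x 2 colors = 12
    values, so kings fit in a nibble like everything else."""
    out = bytearray()
    for i in range(0, len(piece_codes), 2):
        hi = piece_codes[i]
        lo = piece_codes[i + 1] if i + 1 < len(piece_codes) else 0
        out.append((hi << 4) | lo)
    return bytes(out)

def packed_length(occupancy):
    """The decoder recovers the array length from the occupancy alone."""
    return (popcount(occupancy) + 1) // 2
```

So the serialized record is just occupancy, eval, turn, then `packed_length(occupancy)` bytes of packed pieces, with no explicit count or king squares.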
tromp
Posts: 35
Joined: Sun May 02, 2021 10:23 pm
Full name: John Tromp

Re: FEN compression

Post by tromp »

I'm working on a FEN compressor that maps any FEN describing a reachable position to a non-negative number below N=8726713169886222032347729969256422370854716254 (~8.7E45), which amounts to ~19.1 bytes.
Position information includes side to move, en-passant status, castling rights, but no 50-move rule or evaluation information.
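For what it's worth, the ~19.1-byte figure follows directly from N; a quick Python check:

```python
import math

N = 8726713169886222032347729969256422370854716254

# Any value in [0, N) fits in log2(N) bits:
bits = math.log2(N)   # ~152.6 bits, i.e. 153 bits when rounded up
print(bits / 8)       # ~19.1 bytes
```

So a fixed-width encoding of such a number needs 153 bits (just over 19 bytes) per position.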
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: FEN compression

Post by AndrewGrant »

tromp wrote: Tue Jun 22, 2021 1:30 pm I'm working on a FEN compressor that maps any FEN describing a reachable position into a non-negative number under N=8726713169886222032347729969256422370854716254 (~8.7E45), which amounts to ~19.1 bytes.
Position information includes side to move, en-passant status, castling rights, but no 50-move rule or evaluation information.
This would be of interest. Keep us posted if you can.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )