I developed a compression technique for public-key certificate revocation lists (i.e. web security) but I noticed that the interface is similar to something that occurs in chess programming, so I wonder if anyone is interested.
The idea is to compress a dictionary of type K -> V, where K is an arbitrary set of keys. V can be either uniform power-of-2 data (in which case the technique achieves essentially optimal compression) or a small-ish domain (say, < 1000 items) which are not necessarily uniformly distributed (in which case it's at most the entropy of the V distribution + 11%). Compression is somewhat slow, taking maybe 1µs per KVP, but lookups are relatively fast, taking only a few probes into a table and as little as 13 ns (using a weak hash on M1 processor). In general it takes 1-2 probes for the uniform case, and potentially more for the non-uniform case.
The compression technique allows you to query the dictionary on any one of its keys, but not to determine what the keys are (i.e. it doesn't encode the key set, to save space). If you query it on a key that's not in the dictionary, you get an arbitrary answer. So for example, if you have a database of positions, and it excludes illegal positions or those which would be excluded by an earlier search step, then querying it on one of those positions would give a nonsensical answer -- but it doesn't matter because you won't do that.
I did some initial study to compare this to Syzygy endgame tables, which I admittedly don't entirely understand. It seems to me that Syzygy can get better compression on WDL tables, for two reasons:
- It breaks the data into shards that are similar, and therefore have reduced entropy; this could easily be replicated (perhaps even better) with my new technique.
- It can encode the key in a way that's dense, whereas my compression does not assume that the key is dense; this may avoid paying the <= 11% overhead for the nonuniform case, but otherwise it doesn't help.
Anyway, I'm not experienced in chess programming, but if this technique would be useful in a chess engine (perhaps with improvements), please let me know so that I can add code to support this.
The code is online, in prototype form, at https://github.com/bitwiseshiftleft/frayed_cascade.