Reducing tablebase size with search

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

rreagan
Posts: 102
Joined: Sun Sep 09, 2007 6:32 am

Reducing tablebase size with search

Post by rreagan »

Is it possible to reduce the size of tablebases by combining them with a shallow search? For example, if you stored only 50% of the positions, could a shallow search still recover the desired result? In theory, if you stored the correct 50% of the positions, you could find the correct result with a 0-ply or 1-ply search (is this right?). If so, can this be extended? For example, would a 2-ply search reduce the theoretical minimum tablebase size to 25%, a 3-ply search to 12.5%, and so on?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Reducing tablebase size with search

Post by bob »

rreagan wrote:Is it possible to reduce the size of tablebases by combining them with a shallow search? For example, if you stored only 50% of the positions, could a shallow search still recover the desired result? In theory, if you stored the correct 50% of the positions, you could find the correct result with a 0-ply or 1-ply search (is this right?). If so, can this be extended? For example, would a 2-ply search reduce the theoretical minimum tablebase size to 25%, a 3-ply search to 12.5%, and so on?
How would you access them? Currently it is just a direct probe to get the value for a specific set of pieces on specific squares, with each table holding ALL possible positions (ignoring the tricks of restricting the king to 10 squares when no pawns are present, etc).
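
For illustration, a direct probe boils down to computing an index from the piece squares and reading one entry. A minimal sketch with a made-up KRK layout (the index formula and type names are invented here; real tables first shrink the index space by symmetry, such as the 10-square king triangle, and then compress the values):

Code: Select all

#include <cstdint>
#include <vector>

// Minimal sketch of a direct-probe table for KRK (white king, white
// rook, black king). The index formula and layout are invented for
// illustration; real tables first reduce the index space by symmetry
// and then compress the value arrays.
enum Result : std::uint8_t { LOSS, DRAW, WIN };

struct KRKTable {
    std::vector<std::uint8_t> wtm, btm;  // one value per position, per side to move

    Result probe(int wk, int bk, int wr, bool white_to_move) const {
        // every (wk, bk, wr) square triple maps to exactly one slot
        std::size_t idx = (std::size_t(wk) * 64 + std::size_t(bk)) * 64 + std::size_t(wr);
        return Result(white_to_move ? wtm[idx] : btm[idx]);  // no search involved
    }
};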
rreagan
Posts: 102
Joined: Sun Sep 09, 2007 6:32 am

Re: Reducing tablebase size with search

Post by rreagan »

bob wrote: How would you access them? Currently it is just a direct probe to get the value for a specific set of pieces on specific squares, with each table holding ALL possible positions (ignoring the tricks of restricting the king to 10 squares when no pawns are present, etc).
Is the lookup currently just an index into a table of all positions (as opposed to a database that may be sparsely populated)?
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Reducing tablebase size with search

Post by syzygy »

rreagan wrote:Is it possible to reduce the size of tablebases by combining them with a shallow search? For example, if you stored only 50% of the positions, could a shallow search still recover the desired result? In theory, if you stored the correct 50% of the positions, you could find the correct result with a 0-ply or 1-ply search (is this right?). If so, can this be extended? For example, would a 2-ply search reduce the theoretical minimum tablebase size to 25%, a 3-ply search to 12.5%, and so on?
The question "is it possible to reduce the size of tablebases by [idea]" usually has one of three answers:
- no, it is impossible, the idea is flawed.
- no, the idea in principle works, but it is too computationally expensive.
- no, the idea works but it is already being done, so it won't reduce the size of tablebases any further.

If you allow a 1-ply search, you can leave out either the white-to-move or the black-to-move table. In my tablebases I do this for the DTZ files (which are only accessed at the root). So this saves 50%, and in fact more: you can leave out the half that compresses worst.
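
For illustration, resolving a position on the missing side with a 1-ply search might look like the sketch below (the types and helpers are assumed placeholders; a real DTZ probe is more involved than this):

Code: Select all

#include <algorithm>
#include <vector>

// Sketch: only the white-to-move table is stored; a black-to-move
// position is resolved with a 1-ply search over black's moves.
// Move, Position, generate_moves() and probe_wtm() are assumed.
struct Move {};
struct Position { void make(const Move&); void unmake(const Move&); };
std::vector<Move> generate_moves(const Position&); // assumed: legal moves
int probe_wtm(const Position&);                    // assumed: stored-table probe

int probe_btm(Position& pos) {
    int best = -1;                              // start from a loss
    for (const Move& m : generate_moves(pos)) {
        pos.make(m);
        best = std::max(best, -probe_wtm(pos)); // negate: white's perspective
        pos.unmake(m);
    }
    return best;                                // best value black can reach
}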

Going further is not possible by simply "leaving out" positions, as the positions don't let themselves be partitioned into nice subsets. There is simply no set of 12.5% of all positions that determines the result of all other positions with a 3-ply search.

What you can do is this. Instead of probing the table, first do an N-ply search and see if you can determine the result of the position based on probes into subtables (after a capture, maybe after a promotion). If you can, then you don't need to probe the table anymore. If you cannot, you still need to probe the table.
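
With N=1 and captures only, the probing side of this scheme might look roughly as follows (a sketch; gen_captures() and probe_table() are invented names, and the capture probes land in smaller subtables because material has come off):

Code: Select all

#include <algorithm>
#include <vector>

// Sketch of probing with a 1-ply capture search in front of the table.
// A winning capture settles the position without a table probe; for
// positions whose best capture already achieves the true value, the
// stored entry is a "don't care", and the max() below corrects it.
struct Move {};
struct Position { void make(const Move&); void unmake(const Move&); };
std::vector<Move> gen_captures(const Position&); // assumed: capture moves only
int probe_table(const Position&);                // assumed: the big table itself

int probe(Position& pos) {
    int best = -1;                            // LOSS until a capture improves it
    for (const Move& m : gen_captures(pos)) {
        pos.make(m);
        best = std::max(best, -probe(pos));   // fewer pieces: lands in a subtable
        pos.unmake(m);
    }
    if (best == 1)                            // WIN: done, no table probe needed
        return best;
    return std::max(best, probe_table(pos));  // table value, floored by captures
}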

How does this help? For the positions that you can resolve by performing an N-ply search, you can store a random value. You can choose a value that improves compression.

This idea works, but it is already being used at least in my tables. N=1 and just for captures, otherwise it will get too expensive.
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: Reducing tablebase size with search

Post by Edmund »

syzygy wrote:This idea works, but it is already being used at least in my tables. N=1 and just for captures, otherwise it will get too expensive.
Expensive, but depending on the use case still relevant. If you only need to store the values for positions where an engine predicts a wrong value, you will get decreasing size for increasing search depth (better compression with increasingly better predictors).
If someone only wants to probe tablebases at the root, then why not?
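
Taken to the extreme, that means storing only the positions where the predictor is wrong. A sketch (Position, hash(), predict() and the exception map are all invented names here):

Code: Select all

#include <cstdint>
#include <unordered_map>

// Sketch of "store only the predictor's mistakes": the deeper and
// better the predicting search, the fewer exceptions need storing.
// predict() would run the same N-ply search (or evaluation) that was
// used when the exception table was built.
struct Position {};
std::uint64_t hash(const Position&);   // assumed: position hash key
int predict(const Position&);          // assumed: N-ply search / evaluation

std::unordered_map<std::uint64_t, int> exceptions; // hash -> true value

int probe(const Position& pos) {
    auto it = exceptions.find(hash(pos));
    if (it != exceptions.end())
        return it->second;             // one of the stored mistakes
    return predict(pos);               // correct by construction elsewhere
}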
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Reducing tablebase size with search

Post by bob »

rreagan wrote:
bob wrote: How would you access them? Currently it is just a direct probe to get the value for a specific set of pieces on specific squares, with each table holding ALL possible positions (ignoring the tricks of restricting the king to 10 squares when no pawns are present, etc).
Currently is the lookup just an index into the table of all positions (as opposed to a database which may be sparsely populated)?

Yes, no search at all.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Reducing tablebase size with search

Post by syzygy »

Edmund wrote:
syzygy wrote:This idea works, but it is already being used at least in my tables. N=1 and just for captures, otherwise it will get too expensive.
Expensive, but depending on the use case still relevant. If you only need to store the values for positions where an engine predicts a wrong value, you will get decreasing size for increasing search depth (better compression with increasingly better predictors).
If someone only wants to probe tablebases at the root, then why not?
True, there is a trade-off to be made and where the right balance lies depends on the circumstances.

Of course if time is not an issue, tablebases can be generated on the fly and would then occupy only the size of the generator.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: Reducing tablebase size with search

Post by Rein Halbersma »

syzygy wrote:
Edmund wrote:
syzygy wrote:This idea works, but it is already being used at least in my tables. N=1 and just for captures, otherwise it will get too expensive.
Expensive, but depending on the use case still relevant. If you only need to store the values for positions where an engine predicts a wrong value, you will get decreasing size for increasing search depth (better compression with increasingly better predictors).
If someone only wants to probe tablebases at the root, then why not?
True, there is a trade-off to be made and where the right balance lies depends on the circumstances.
In 8x8 checkers, it is common that both capture and threatened capture positions are not probed. They are still stored in the uncompressed db, but are encoded with a bogus value chosen so as to maximize the current run of their adjacent values. This can significantly improve the overall compression rate. The Chinook papers have all the details.
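
For illustration, the run-maximizing encoding amounts to a pass like the one below over the uncompressed values (a greedy left-to-right version of the idea; the Chinook scheme itself is more refined):

Code: Select all

#include <cstddef>
#include <cstdint>
#include <vector>

// Sketch: entries for capture / threatened capture positions are never
// probed, so before compression each such "don't care" entry can be
// overwritten with the value of the run it sits in, lengthening runs.
void fill_dont_cares(std::vector<std::uint8_t>& vals,
                     const std::vector<bool>& dont_care) {
    if (vals.empty()) return;
    std::uint8_t run = vals[0];        // value of the current run
    for (std::size_t i = 0; i < vals.size(); ++i) {
        if (dont_care[i])
            vals[i] = run;             // extend the run instead of breaking it
        else
            run = vals[i];             // a probed value sets the new run value
    }
}

A real scheme would also look ahead to the next probed value and pick whichever neighbour gives the longer run; the greedy left-to-right fill above just shows the idea.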

Whenever you probe a capture or threatened capture position, you then do a search until you reach a quiet position for which the actual WDL or DTC value is stored in the db. Because in 8x8 checkers captures are compulsory, kings only capture short-ranged, and men only capture forward, the search tree is rather small and this tradeoff is good for performance. For 10x10 draughts with long-range king captures and backward-capturing men, this optimization does not work as well because the search trees become too large.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Reducing tablebase size with search

Post by syzygy »

Rein Halbersma wrote:
syzygy wrote:
Edmund wrote:
syzygy wrote:This idea works, but it is already being used at least in my tables. N=1 and just for captures, otherwise it will get too expensive.
Expensive, but depending on the use case still relevant. If you only need to store the values for positions where an engine predicts a wrong value, you will get decreasing size for increasing search depth (better compression with increasingly better predictors).
If someone only wants to probe tablebases at the root, then why not?
True, there is a trade-off to be made and where the right balance lies depends on the circumstances.
In 8x8 checkers, it is common that both capture and threatened capture positions are not probed. They are still stored in the uncompressed db, but are encoded with a bogus value chosen so as to maximize the current run of their adjacent values. This can significantly improve the overall compression rate. The Chinook papers have all the details.
I compress out "threat positions" in my suicide tablebases (suicide chess has compulsory captures), but I am not sure that I do the same thing. Or actually, I am pretty sure that I don't do the same thing.

What I do is this:
- if the position has one or more captures, play out the captures (resulting in probes into smaller tables). The value stored can be any value (whichever compresses best).
- if the position does not have a capture, try out all "threat" moves that result in a position where the opponent can (and therefore must) capture. If one of the threat moves is winning (i.e. leads to a position in which all the captures by the opponent lose), then any value can be stored (whichever compresses best). If the best threat move draws and the position is a draw, then we can store either a draw or a loss (whichever compresses best). See the sketch below.
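
A sketch of those two cases at table-building time (all types and helpers here are invented placeholders; resolve_captures() would play out the forced capture sequences against subtable probes):

Code: Select all

#include <algorithm>
#include <vector>

// Sketch: compute which values may be stored for a position without
// changing any probe result; the compressor then picks whichever
// member of the returned set compresses best.
enum Value { LOSS = -1, DRAW = 0, WIN = 1 };

struct Move {};
struct Position { void make(const Move&); void unmake(const Move&); };
bool has_captures(const Position&);                  // assumed
std::vector<Move> gen_threat_moves(const Position&); // assumed: opponent must capture in reply
int resolve_captures(Position&);                     // assumed: forced captures via subtables

std::vector<Value> storable_values(Position& pos, Value true_value) {
    if (has_captures(pos))                   // case 1: captures are compulsory,
        return {LOSS, DRAW, WIN};            //  so probes never read this entry
    Value best = LOSS;                       // case 2: examine the threat moves
    for (const Move& m : gen_threat_moves(pos)) {
        pos.make(m);
        best = std::max(best, Value(-resolve_captures(pos)));
        pos.unmake(m);
    }
    if (best == WIN)                         // winning threat: entry never read
        return {LOSS, DRAW, WIN};
    if (best == DRAW && true_value == DRAW)  // drawing threat in a drawn position:
        return {LOSS, DRAW};                 //  a stored LOSS is lifted to DRAW at probe time
    return {true_value};                     // otherwise the exact value is needed
}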

Finding the values that lead to best compression in my compression scheme is quite likely NP-hard, but I found some heuristics that do quite well.

I only compress out "threat positions" in my 6-piece (suicide) tablebases, as it does slow down accesses (leading to a small search explosion if it needs to be done recursively). It saves about 50%. (Many tables are reduced to zero: all positions that do not have a capture or a winning or drawing threat move are losses.)

I probably got the idea from "Building the Checkers 10-piece Endgame Databases", but what this paper describes seems a bit different:
Whenever you probe a capture or threatened capture position, you then do a search until you reach a quiet position for which the actual WDL or DTC value is stored in the db.
What is a "threatened capture position"? A position where the side not to move would have a capture?

If that's the case, then this trick seems similar to the removal of in-check positions (which I don't do, but which most likely is done by Shredder and I believe by some versions of Robbobases).
Because in 8x8 checkers captures are compulsory, kings only capture short-ranged, and men only capture forward, the search tree is rather small and this tradeoff is good for performance. For 10x10 draughts with long-range king captures and backward-capturing men, this optimization does not work as well because the search trees become too large.
Do I understand correctly that those search trees are full of probes into the same database?
rreagan
Posts: 102
Joined: Sun Sep 09, 2007 6:32 am

Re: Reducing tablebase size with search

Post by rreagan »

Edmund wrote:
syzygy wrote:This idea works, but it is already being used at least in my tables. N=1 and just for captures, otherwise it will get too expensive.
Expensive, but depending on the use case still relevant.
Yes, I was actually curious how this might affect chess analysis from the human perspective. If such a method were viable, we could know optimal play in more endgames. In that case, who cares if it takes a week or a month or a year to handle all of the probes and searches, if we learn something we didn't know before?