bob wrote:When we tried on-the-fly decompression in the Nalimov tables, we (Eugene and I) were somewhat surprised to find that it was far faster overall: reading compressed blocks means you end up doing less file I/O, and less I/O is always good.
The optimality of the compression strategy to be used, or even if compression is to be used at all, is dependent upon the tablebase class, the end user's processor, the memory bandwidth, the disk speed, etc. Therefore, the strategy should be variable and this is the approach I'm taking with the Edwards' Version Two tablebases, possibly to appear early next year.
I suspect you will find the same thing we found. Even using the 15K U320 SCSI drives on my office machine, running with compression was significantly faster, and that was on a PIV/2.8GHz processor. The faster the processor, the wider this gap gets. And since not many will be running U320 SCSI drives with 15K RPM spindle speeds to get latency way down, I do not believe that uncompressed access is ever going to perform as well as the on-the-fly-decompression approach.
Note that the above is true for 4-5 piece files; the difference was even more significant when we tested with 6-piece files.
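To make the "less I/O" point concrete, here is a minimal sketch of block-wise on-the-fly decompression (using Python's zlib as a stand-in codec; the real Nalimov scheme uses its own compressor and block layout). A probe reads and inflates only the one compressed block containing the wanted value, and that block is smaller on disk than the raw one:

```python
import zlib

BLOCK = 8192  # uncompressed block size; block-compressed tablebases use a similar fixed size

def build(data):
    """Compress fixed-size blocks independently and record an offset index,
    so a probe only ever has to read and inflate a single block."""
    blobs, index, off = [], [], 0
    for i in range(0, len(data), BLOCK):
        c = zlib.compress(data[i:i + BLOCK])
        blobs.append(c)
        index.append((off, len(c)))
        off += len(c)
    return b"".join(blobs), index

def probe(store, index, pos):
    """Fetch one byte of the original data: locate one compressed block,
    read it (fewer bytes than the raw block), and decompress on the fly."""
    b, r = divmod(pos, BLOCK)
    off, n = index[b]
    return zlib.decompress(store[off:off + n])[r]
```

Because tablebase data is highly regular, the compressed block is usually much smaller than 8 KB, so the disk transfer shrinks even though a little CPU is spent inflating it.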
bob wrote:I still prefer to follow the "order by increasing cost" approach. Repetition check is the cheapest operation I have, and I do that first. Hash probe is next. EGTBs are significantly more expensive and are deferred until the hash probe has failed. Once I do an EGTB probe on a position, it is unlikely I will repeat it, since it promptly goes into the hash table with infinite draft, making it very difficult to overwrite.
Well, my TB probe approach is also ordered by ascending cost, where cost equals probability of success times expense of use. The big difference is that the TB values get their own space (half-sized entries store twice as many scores in the same memory). The main transposition table also becomes slightly more efficient, as it now has more free slots that would otherwise be taken by long-lived TB score entries.
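As a hedged illustration of the separate half-sized TB score table (the packing below is my own toy layout, not Symbolic's actual format): a single 64-bit slot holds 48 bits of key verification plus a 16-bit score, half the size of a typical 16-byte main-table entry, so the same memory holds twice as many TB scores:

```python
from array import array

TB_SLOTS = 1 << 16                       # toy size; a real table would be far larger
tb_table = array('Q', [0] * TB_SLOTS)    # one 8-byte slot per entry

def tb_store(key, score):
    # the low 16 bits of the 64-bit position key pick the slot, the high
    # 48 bits verify it; a 16-bit score completes the half-sized entry
    tb_table[key & (TB_SLOTS - 1)] = ((key >> 16) << 16) | (score & 0xFFFF)

def tb_probe(key):
    e = tb_table[key & (TB_SLOTS - 1)]
    if e >> 16 != key >> 16:
        return None                       # empty slot or a different position
    s = e & 0xFFFF
    return s - 0x10000 if s >= 0x8000 else s   # sign-extend the 16-bit score
```

TB scores need no draft, bound flags, or best move, which is what makes the half-sized entry sufficient.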
The only downside to Symbolic's current approach is the situation where tablebases are enabled but only a few of the actual TB files are available. This means the TB classifier would be run on TB-eligible positions that have no hope of actually being scored, so a small amount of time would be wasted. This is why the classifier is fairly fast, and it can be made even faster by using a material hash signature instead of a material piece-distribution signature. (Symbolic actually has both of these, but only the distribution signature gets used currently, and it gets used in many places.)
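A material-signature classifier along these lines might look like the following sketch (the 4-bits-per-count layout and the piece-letter keys are my own illustration, not Symbolic's actual encoding). The point is that deciding "is there a TB file for this material?" becomes a single hash-set lookup:

```python
PIECES = "PNBRQ"   # kings are always present, so only these need counting

def material_signature(counts):
    """Fold per-side piece counts into one integer: 8 bits per piece type,
    white's count in the low nibble and black's in the high nibble."""
    sig = 0
    for i, p in enumerate(PIECES):
        sig |= counts.get('w' + p, 0) << (i * 8)
        sig |= counts.get('b' + p, 0) << (i * 8 + 4)
    return sig

# signatures of the TB files actually present on disk (here: KRK and KQKR)
available = {material_signature({'wR': 1}),
             material_signature({'wQ': 1, 'bR': 1})}

def tb_classify(counts):
    """O(1) check: does some installed TB file cover this material balance?"""
    return material_signature(counts) in available
```

Populating `available` once at startup from the files found on disk avoids wasting classifier time on positions whose tablebase is not installed.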
bob wrote:Repetition check is the cheapest operation I have, and I do that first. Hash probe is next. EGTBs are significantly more expensive and are deferred until the hash probe has failed.
Symbolic:
1) User interrupt check
This looks at a volatile boolean variable that might be set by another thread; if set, the search unwinds and exits.
2) General limit check
If the search has a time check in force, then the tick count variable, which gives microseconds since program start, is tested against the limit value before trying another node. The tick count is maintained by a separate thread driven by a ten-millisecond-resolution system interval timer, so there is no per-node system call overhead. If the search has a node count limit in force, that is checked as well.
3) Maximum ply check
Set at 256.
4) Fifty move draw
Very fast.
5) Insufficient material draw
Almost as fast due to material distribution signature.
6) Repetition draw
Need to roll through a (usually) short loop.
I do 4, 5 and 6 in one piece of code. For 6 I use a normal repetition list for each side, with the loop based on the 50-move counter.
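The bounded repetition scan can be sketched as follows (a generic scheme consistent with the description above; Symbolic's per-side list layout may differ). Only positions since the last irreversible move can repeat, which is exactly what the fifty-move counter tracks:

```python
def is_repetition(history, halfmove_clock):
    """history: position hash keys, most recent last. Captures and pawn moves
    reset the fifty-move counter and make earlier positions unreachable, so
    the scan is bounded by halfmove_clock; the same side is to move every
    two plies, so we step back in twos. The loop is usually very short."""
    cur = history[-1]
    oldest = len(history) - 1 - halfmove_clock   # last irreversible move
    for i in range(len(history) - 3, max(oldest, 0) - 1, -2):
        if history[i] == cur:
            return True
    return False
```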
7) TB check
As described earlier.
At this point, if nothing has caught, the node processor figures out which of five different sub-processor routines is to be called. These are:
1) the base (ply zero) routine
2) the check evasion routine (uses the main transposition table)
3) the full width routine (uses the main transposition table)
4) the gainers and checks only routine (uses eval/pawn tables)
5) the gainers only routine (uses eval/pawn tables)
If you want to take advantage of the fact that TB probes are smaller, why not simply store two of them in a normal entry of the transposition table? Most people use 16-byte entries and fit four of them in a cache line. If one of these four gets filled by a TB hit, you could set a flag in the entry to indicate it is a TB entry, making the other half available as a second (empty) one. This would hardly add any time to the probing (which is dominated by the time required to fetch the cache line). And it would automatically adapt the space allocated for TB entries and normal entries to the needs of the position, rather than having a fixed and usually sub-optimal allocation.
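The suggested packing might look like this sketch (the flag bit position and field widths are my own illustrative choices, not a fixed format): a 16-byte slot, once flagged as TB, holds two 63-bit records, each carrying 47 bits of key verification plus a 16-bit score:

```python
TB_FLAG = 1 << 127           # top bit of a 16-byte entry marks it as a TB slot
HALF    = (1 << 63) - 1      # mask for one 63-bit half-record

def make_record(key, score):
    # 47 bits of the position key for verification + a 16-bit score = 63 bits
    return ((key >> 17) << 16) | (score & 0xFFFF)

def store_tb(entry, key, score):
    """Drop a TB score into a slot: convert a normal entry into a flagged TB
    pair, or fill the (assumed free) upper half of an already-flagged one."""
    rec = make_record(key, score)
    if entry & TB_FLAG:
        return entry | (rec << 64)       # second record goes in the upper half
    return TB_FLAG | rec                 # first record replaces the old entry

def probe_tb(entry, key):
    if not entry & TB_FLAG:
        return None                      # a normal transposition-table entry
    for rec in (entry & HALF, (entry >> 64) & HALF):
        if rec and (rec >> 16) == (key >> 17):
            s = rec & 0xFFFF
            return s - 0x10000 if s >= 0x8000 else s  # sign-extend the score
    return None
```

Both half-records live in the same 16 bytes, so probing them costs nothing extra once the cache line is fetched, and the TB/normal split adapts per slot instead of being fixed.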