Problems with TT, sometimes makes blunder moves

Sven · Post by **Sven** » Mon Apr 17, 2017 1:24 pm

tttony wrote:sizeof(hash_entry_s) must be power of two to make the hash (key & size) work, now it's 32 bytes

That follows from your current algorithm to calculate the hash table size in bytes from the requested hash table size in MB, it currently implies a hash table size in bytes that is a power of two since it starts with the value of 2 and continues with further multiplications by 2 or similar operations. For the masking approach (key & (ttsize-1)) to work it is only required that ttsize (which is the number of hash entries) is a power of 2, the hash entry size should not have a direct influence on ttsize. So you might consider to change your algorithm to determine the hash table size in a way that also works for an entry size of, say, 24 bytes. Of course having it fit into 16 bytes would even be better since you can store more information in the same amount of memory.

tttony wrote:The other bug was that the bestmove was not in the previous pv, example:
Code: Select all
..
info depth 4 score cp 795 nodes 3935 qnodes 2637 time 0 nps 3935 hashfull 0 pv f7e5
bestmove f7h6
..
That's because in the ID I have this: (pseudo)
Code: Select all
int bestmove;
int pv&#91;64&#93;;
for &#40;depth = 0; depth < MAX_DEPTH; depth++) &#123;
    score = search&#40;..., pv&#41;; // <--- Here change
    if &#40;timeout&#41;                  // it will not print that pv if timeout
        break;

    PrintUciInfo&#40;..., pv&#41;;
&#125;
bestmove = pv&#91;0&#93;; // here is the best move

printf&#40;"bestmove %s\n", strmove&#40;bestmove&#41;);
I now just save the score in a global var only in the root (ply == 0)

The simple solution would have been to move the "bestmove = pv[0]" statement into the loop above PrintUciInfo() so that you always update bestmove immediately unless you consider your pv as invalid due to timeout.

Here it should be noted that a timeout does not necessarily invalidate the pv. It certainly depends on how you deal with updating the pv within the search (at the root node). As long as you do not update it after searching a root move that has been interrupted by timeout and returns incomplete information, everything is fine, and pv[0] is either
- the best move from the previous iteration (if the timeout came during searching the first root move), or
- the first move from the new iteration (i.e. still the previous best move - if the timeout came after completing the first root move and before finding a different best move), or
- a new best move from the new iteration that may have been found in the meantime before getting the timeout.

tttony wrote:Now I have problem with mate value, not work in all positions

Which problem is that?

Sven · Post by **Sven** » Mon Apr 17, 2017 1:56 pm

Sven Schüle wrote:So you might consider to change your algorithm to determine the hash table size in a way that also works for an entry size of, say, 24 bytes.

Here is an example code including test code:

Code: Select all

#include <cstdio>
#include <cstdint>
#include <cassert>

uint64_t get_ttsize_from_size_MB&#40;int size_MB, int entry_size&#41;
&#123;
    uint64_t ttsize = 1;
    uint64_t requested_size = uint64_t&#40;size_MB&#41; << 20;
    while &#40;2 * ttsize * entry_size <= requested_size&#41; &#123;
        ttsize *= 2;
    &#125;
    assert&#40;ttsize * entry_size <= requested_size&#41;;
    assert&#40;&#40;ttsize % 2&#41; == 0&#41;;
    return ttsize;
&#125;

void test_ttsize&#40;)
&#123;
    for &#40;int entry_size = 8; entry_size <= 40; entry_size += 8&#41; &#123;
        for &#40;int size_MB = 1; size_MB <= 1024; size_MB++) &#123;
            uint64_t ttsize = get_ttsize_from_size_MB&#40;size_MB, entry_size&#41;;
            printf&#40;"entry_size=%2d size_MB=%4d ttsize=%llu\n", entry_size, size_MB, ttsize&#41;;
        &#125;
    &#125;
&#125;

int main&#40;)
&#123;
    test_ttsize&#40;);
    return 0;
&#125;

hgm · Post by **hgm** » Mon Apr 17, 2017 7:02 pm

Indeed. It makes no sense at all to take a power of 2 as hash size in bytes. The only reason I can think of to ever restrict the size of the hash table to certain values, is to give it a number of entries that is a power of 2.

tttony · Post by **tttony** » Tue Apr 18, 2017 1:37 am

@Martin, you right, in fact

I encode the move for 16 bits
Also score it's 16 bits
And I still don't implement age (later with slots)

Code: Select all

Anyway, if your entry is 32 bytes then you can use a bucket of 2 entries &#40;assuming your tt is 64-byte &#91;cacheline&#93; aligned&#41;

I plan to use slots, but not now, I have read a little about the cacheline align, right now too advance for me

Another way but a little complicate it's using only two uint64 fields, key and data, where data it's packed

@HGM

No, it will work with any size of the hash entry. Because C pointer arithmetic works in multiples of the object size. You are not performing the AND operation on the address, but on the array index. As long as 'size' is a power of 2 (minus 1) it works.

Now I understand, I was confused, I read it CPW wrong, I thought that sizeof struct must be power of two, but it's the size that's needs to be power of two

@Sven

Well, now I know how it works, later I will let struct be 16 bytes, but first kill bugs

Now PrintUciInfo(..., pv); it's above if (timeout), I just create a globar var that set the score only for root and now it's working

About the mate value, with HGM position, showed this:

Code: Select all

info depth 31 score cp 31937 nodes 2366502823 qnodes 505449824 time 288149 nps 8212775 hashfull 320 pv e1f2 e8f8 f2f3 f8e7 f3e4 e7d6 e4f5 d6d7 f5e5 d7e7 e2e4 e7e8 e5e6 e8f8 e4e5 f8g7stop
info depth 32 score cp 31933 nodes 2998239232 qnodes 535637449 time 372249 nps 8054392 hashfull 325 pv e1f2 e8f8 f2f3 f8f7 f3e3 f7f6 e3d4 f6e6 d4e4 e6d6
bestmove e1f2

Now I modified the code and now it's working, here the relevant things I modified to make it work:

defines:

Code: Select all

#define INF					&#40;32767&#41;
#define MATE_VALUE			&#40;32000&#41;
#define IS_MATE				&#40;MATE_VALUE - MAX_DEPTH&#41;
#define MAX_EVAL			&#40;29999&#41;

In search when in mate I return:

Code: Select all

if (!legal&#41; &#123;
	return inCheck ? -MATE_VALUE + board->ply &#58; 0;
&#125;

In tt_save():

Code: Select all

if &#40;score >= +IS_MATE&#41; &#123;
	score += ply;
&#125;
else if &#40;score <= -IS_MATE&#41; &#123;
	score -= ply;
&#125;

tt_probe():

Code: Select all

if (*score >= +IS_MATE&#41; &#123;
	*score -= ply;
&#125;
else if (*score <= -IS_MATE&#41; &#123;
	*score += ply;
&#125;

And in PrintUciInfo():

Code: Select all

if &#40;score < -IS_MATE&#41;&#123;
	printf&#40;"mate %1.f ", &#40;float&#41;(-MATE_VALUE - score&#41; / 2&#41;;
&#125;	else if &#40;score > +IS_MATE&#41; &#123;
	printf&#40;"mate %1.f ", &#40;float&#41;(+MATE_VALUE - score + 1&#41; / 2&#41;;
&#125; else 
	printf&#40;"cp %d ", score&#41;;

For now it's showing mate and not the big score

Evert · Post by **Evert** » Tue Apr 18, 2017 10:27 am

tttony wrote: Another way but a little complicate it's using only two uint64 fields, key and data, where data it's packed

Note that if you derive the index from the key, then the bits that you use to derive the key are redundant, so you can even leave those out.
I usually only store the upper 32 bits of the key. Normally you wouldn't use the full 32 lower bits to derive the index, so there are some bits that go unused by the main transposition table (it ends up effectively using ~59 or so bits instead of 64). You can remedy that, of course, but I've never bothered.
They're still used to check for repetitions though.

hgm · Post by **hgm** » Tue Apr 18, 2017 12:51 pm

When data is packed to reduce the entry size below 16 bytes, to squeeze more than 4 in a cache line, the following scheme is very competitive:

Code: Select all

typedef struct &#123; // 8 bytes
  unsigned int lock;
  short int score;
  short int moveAndBoundType;
&#125; HashData;

typedef struct &#123; // 32 bytes
  unsigned char lockExt&#91;3&#93;;
  char filler1;
  HashData data&#91;3&#93;;
  unsigned int depth&#91;3&#93;;
  char filler2;
&#125; HashBucket;

This uses buckets of 32 bytes, each containing 3 entries of 10 bytes. Each entry has a 32-bit 'lock' field for the upper 32 bits of the hash key (assuming the low-order key bits are used to derive the index). But it also contains a lockExt for holding bits 24-31 of the key (some of which will not be used for the index, if the hash size is less than 128GB). Two bits from the 16-bit field containing the move are used as upper-bound and lower-bound flags.

The reason for packing the buckets like this is that the first word fetched from RAM not only contains the lock and lockExt of the first entry, but also the lockExt of the other two. This allows you to rule out there is a hit on those without having to wait for the other words of the cache line to be read from memory. This speeds up miss detection.

Code: Select all

#define H_MOVE 0x3FFF;
#define H_UPPER 0x4000;
#define H_LOWER 0x8000;


// hash probe
unsigned int key = hashKey >> 32;     // takes upper 32 bits
unsigned char keyExt = hashKey >> 24; // takes 4th byte
unsigned int index = key & mask;
HashBucket *bucket = hashTable + index;
int slot = 0;
int hashMove;

if&#40;bucket->data&#91;0&#93;.lock == key && bucket->lockExt&#91;0&#93; == keyExt ||
   bucket->lockExt&#91;slot=1&#93; == keyExt && bucket->data&#91;1&#93;.lock == key ||
   bucket->lockExt&#91;slot=2&#93; == keyExt && bucket->data&#91;2&#93;.lock == key   ) &#123; // hit
  int score = bucket->data&#91;slot&#93;.score;
  int flags = bucket->data&#91;slot&#93;.moveAndBoundType;
  if&#40;&#40;score <= alpha || flags & H_LOWER&#41; &&
     &#40;score >= beta  || flags & H_UPPER&#41;   ) &#123; // usable bounds
    int hashDepth = bucket->depth&#91;slot&#93;;
    ...
  &#125;
  hashMove = flags & H_MOVE;
&#125; else hashMove = INVALID, slot = -1;


// hash store
if&#40;slot < 0&#41; &#123; // we had miss&#58; replacement
  slot = &#40;key & 1&#41; + 1; // decide which slot we will test for being stale
  if&#40;!STALE&#40;test&#41;) &#123;     // replacing stale entries has priority
    slot = &#40;bucket->depth&#91;1&#93; > bucket->depth&#91;2&#93;) + 1; // 1 or 2
    int minDepth = bucket->depth&#91;slot&#93;;               // lowest depth
    if&#40;depth < minDepth&#41; slot = 0; // new entry even shallower&#58; use always-replace slot
  &#125;
&#125;

bucket->data&#91;slot&#93;.lock = key;
bucket->data&#91;slot&#93;.score = bestScore;
bucket->data&#91;slot&#93;.moveAndBoundType = bestMove |      // pack move and bound type
  &#40;bestScore > originalAlpha&#41;*H_LOWER + &#40;bestScore < beta&#41;*H_UPPER;
bucket->depth&#91;slot&#93; = depth;
bucket->lockExt&#91;slot&#93; = keyExt;

The first entry in the bucket (tested first on probing) will be the always-replace entry, and almost all hits will be on that one, so that the other two entries would not have to be examined at all. Hits on the deep entries will be rare (but when they occur, save us an enormous amount of work, so it is worth it to protect them). In case of a miss the 8-bit lockExt fields will establish the miss in 99.6% of the cases, so that the 32-bit lock of the depth-preferred entries will not have to be accessed.

Note that for testing for a hit in the always-replace entry, we test the 32-bit lock first, because it gives a larger chance to rule out the hit than the 8-bit lockExt of that entry (so we can skip the test on the latter). If the lock matches, the lockExt will almost always match as well, as the chances the lock matches by accident (i.e. key collision) will only be 1 in 4 billion.

flok · Post by **flok** » Tue Apr 18, 2017 3:42 pm

But what about that it can find it with one hash table but not with the other?

- 32MB ok
- 64MB fail
- 128MB fail
- 256MB fail
- 512MB ok
- 1024MB fail
- 2048MB fail
- 4096MB ok
- 8192MB fail

hgm · Post by **hgm** » Tue Apr 18, 2017 4:02 pm

Find what? It always finds the mate, right?

flok · Post by **flok** » Tue Apr 18, 2017 5:21 pm

Yes, but:

32 at 42
64 at 39
128 at 47
256 at 41
512 at 43
1024 at 43
2048 at 45
4096 at 42
8192 at 44

Shouldn't that be all 39?

hgm · Post by **hgm** » Tue Apr 18, 2017 6:06 pm

That depends on the search depth, and how much you prune. If you are sure that the mate is within the horizon, then it is indeed suspect that you don't find the fastest mate. If you are dependent on grafting of sub-trees through the hash table, then the hash size can matter. Although it is also suspect that it would matter for such a large table and such a small tree.

Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves

Re: Problems with TT, sometimes makes blunder moves