My "optimal in this respect" only referred to the strategy of sticking to the (cache-line-sized) bucket the initial hashing function maps you to, and replacing what you consider least valuable there, rather than trying to look elsewhere to see if you can find something even less valuable to replace. There can still be a huge difference in performance due to the criteria used to decide which entry is least valuable.
I don't think an age clock based on irreversible moves would be very useful. For one, the way you propose it doesn't catch every unreachable position: if the root has age clock N, moves that push one Pawn get age N+1, but moves that push another Pawn also get age N+1. If the engine decides to push the first Pawn, the root gets age N+1, and the sub-tree after pushing the other Pawn (now largely unreachable) still has the same age as the root, and is thus not recognized as obsolete. But even without this problem, not being able to distinguish positions that are no longer visited in practice (because the preceding reversible moves have taught the search how to alpha-beta prune them) would be a severe disadvantage.
I also doubt that neural nets can be of much use here, because there isn't much you could feed them to base their decision on. Unless you stored much more info in the hash entries, the only info you can go on is the depths and possibly ages. And that it is harmless to overwrite an entry when its age tells you the search no longer visits it seems a no-brainer.
Data structure choice for TT
Moderators: hgm, Rebel, chrisw
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: Data structure choice for TT
hgm wrote: ↑Fri Jul 31, 2020 4:08 pm
My "optimal in this respect" only referred to the strategy of sticking to the (cache-line-sized) bucket the initial hashing function maps you to, and replacing what you consider least valuable there, rather than trying to look elsewhere to see if you can find something even less valuable to replace. There can still be a huge difference in performance due to the criteria used to decide which entry is least valuable.

I guess you would have to do some testing to determine the penalty/benefit ratio of increasing the bucket size to a multiple of the cache-line size. So I think it's premature to conclude that the cache-line size is the optimal bucket size; the optimum could be larger. This would depend on the CPU's cache architecture, of course, and on the amount of benefit obtained with new replacement algorithms.
hgm wrote: ↑Fri Jul 31, 2020 4:08 pm
I don't think an age clock based on irreversible moves would be very useful. For one, the way you propose it doesn't catch every unreachable position: if the root has age clock N, moves that push one Pawn get age N+1, but moves that push another Pawn also get age N+1. If the engine decides to push the first Pawn, the root gets age N+1, and the sub-tree after pushing the other Pawn (now largely unreachable) still has the same age as the root, and is thus not recognized as obsolete. But even without this problem, not being able to distinguish positions that are no longer visited in practice (because the preceding reversible moves have taught the search how to alpha-beta prune them) would be a severe disadvantage.

If the age clock were used by itself I would agree, but there is other information in the TT entry that could help: namely the part of the Zobrist key used to differentiate two entries that hash to the same TT bucket. Even if no special form of Zobrist key is used, an NN could in theory be trained to use this information to distinguish between positions that are still useful and those that aren't. While this couldn't be made 100% accurate, it would likely be much better than what's commonly in use today. It's difficult to see from a human perspective how this would work, but an NN could pick out the patterns that you would miss in your wildest imaginings.
If the Zobrist key were constructed so that the part stored in the TT entry contained a pawn-constellation key and a piece-constellation key, either combined or separate, it becomes easier to see how this would work. Since this information needs to be stored somewhere in the key in any case, little if any loss of efficiency of the Zobrist key would result. I'm not sure this is necessary, as I'm pretty sure an NN would find the patterns in any case, but it's easier to think about how it would work if we mentally separate these parts of the key.
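A minimal sketch of what such a split key might look like, assuming (hypothetically) a 64-bit key whose high half comes from a pawn-constellation hash and whose low half from a piece-constellation hash. All names and the 32/32 split are illustrative, not taken from any engine:

```c
#include <stdint.h>

/* Hypothetical split Zobrist key: high 32 bits from a pawn-structure hash,
 * low 32 bits from a piece-placement hash. Whatever part of this key is
 * stored in a TT entry then still carries a recognizable pawn signature. */
uint64_t make_split_key(uint64_t pawn_hash, uint64_t piece_hash) {
    return (pawn_hash & 0xFFFFFFFF00000000ULL) | (piece_hash >> 32);
}

/* Recover the two component signatures from the combined key. */
uint32_t pawn_part(uint64_t key)  { return (uint32_t)(key >> 32); }
uint32_t piece_part(uint64_t key) { return (uint32_t)key; }
```

Whether an NN could actually exploit these bits is the open question in the post; the sketch only shows that the split costs nothing in key quality, since both halves remain uniformly random.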
hgm wrote: ↑Fri Jul 31, 2020 4:08 pm
I also doubt that neural nets can be of much use here, because there isn't much you could feed them to base their decision on. Unless you stored much more info in the hash entries, the only info you can go on is the depths and possibly ages. And that it is harmless to overwrite an entry when its age tells you the search no longer visits it seems a no-brainer.

I think you're wrong; I think there is already enough information in the TT to make this possible. You might shorten training times if the needed data were separated in the Zobrist key. Finding positions that are no longer useful even though only reversible moves have been made would be left up to the NN. This is a simple enough problem that I think it will find the patterns everyone else has missed.
Regards,
Zenmastur
Only 2 defining forces have ever offered to die for you.....Jesus Christ and the American Soldier. One died for your soul, the other for your freedom.
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Data structure choice for TT
As even with bucket sizes of 4 there would virtually always be something of very little value in the bucket, and a cache line can hold up to 7 entries if you economize on entry size, there seems to be no significant upside to using multiple cache lines, and there definitely is a downside. So the burden of proof would be on the multi-line scheme. Another point to keep in mind is that modern CPUs often use automatic prefetching of sequential memory accesses: when you access two neighboring lines, the CPU will attempt to prefetch the third, which backfires when you use two-line buckets.
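To make the one-line-bucket idea concrete, here is a sketch of a bucket that exactly fills a 64-byte cache line, assuming (hypothetically) 16-byte entries with four per bucket; the field layout is illustrative, not from any particular engine:

```c
#include <stdint.h>

/* One TT entry, packed to 16 bytes (illustrative layout). */
typedef struct {
    uint32_t key;    /* upper bits of the Zobrist key, for verification */
    int16_t  score;
    uint16_t move;
    uint8_t  depth;
    uint8_t  flags;  /* bound type + age bits */
    uint16_t pad;
    uint32_t extra;  /* room for e.g. a static eval */
} TTEntry;

/* Four entries = 64 bytes: one probe touches exactly one cache line. */
typedef struct {
    TTEntry slot[4];
} TTBucket;

_Static_assert(sizeof(TTEntry) == 16, "entry must be 16 bytes");
_Static_assert(sizeof(TTBucket) == 64, "bucket must fill one cache line");
```

Shrinking the entry (e.g. a 9-byte format) is what would allow the 7-per-line packing mentioned above, at the cost of fewer verification bits per entry.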
Even if you could reconstruct the complete position from the hash key, how could that possibly tell you how important the position is? Much more relevant is to which depth it was searched, how many nodes that took, how many nodes ago it was first encountered...
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: Data structure choice for TT
I guess this would be determined by the amount of pressure there is on the TT. If there is little or no TT pressure, I agree there will be plenty of space in the TT. But this isn't the situation anyone cares about. When the TT is massively oversubscribed, replacing the least valuable entry is much more important. Being able to determine that a vast number of entries are completely worthless, especially those associated with deep searches, regardless of the number of cutoffs they caused in the past, would seem to be of value.
hgm wrote: and a cache line can hold up to 7 entries if you economize on entry size,

Really?? Which top-tier program uses a bucket size of 7? Or even 6? I don't know of any, although I must admit I haven't looked. The last time I checked (it's been a while), SF used a bucket size of 3, with two buckets per cache line. I'm not sure this is still true, as they recently changed their TT structure to accommodate much larger TT sizes, so I suppose it's changed.
hgm wrote: Even if you could reconstruct the complete position from the hash key, how could that possibly tell you how important the position is? Much more relevant is to which depth it was searched, how many nodes that took, how many nodes ago it was first encountered...

The depth of search is completely useless information if the position can no longer occur. Further, if the aging method uses depth of search as a criterion for determining when to replace an entry, this will simply clog the TT with entries that are no longer useful but haven't aged out yet due to their depth. I run into this problem all the time. If the search is continued long enough, the search will become pathological due to the huge number of useless entries that have been searched to great depth. Clearing the TT solves this problem, but it also loses all the info in the TT, which defeats the purpose of a long search in the first place.
[Moderation] The remainder of this message was lost due to moderator error.
-
- Posts: 919
- Joined: Sat May 31, 2014 8:28 am
Re: Data structure choice for TT
Hmmm....Zenmastur wrote: ↑Sat Aug 01, 2020 1:33 amSure, replacement only becomes an issue with severely overloaded tables. I could detect virtually no effect on time-to-depth until the number of nodes gets to be more than 10 times the table size. But the statement isn't really dependent on that, but more on the distribution of nodes in the tree. This is vastly dominated by low-depth nodes. And all competitive replacement schemes always store every node. So the lowest depth in the bucket will reflect the distribution of nodes in the tree, and thus be almost always be 0 or 1, not saving you very much search effort on a cutoff hit.hgm wrote: ↑Fri Jul 31, 2020 11:32 pmI guess this would be determined by the amount of pressure there is on the TT. If there is little or no TT pressure I agree there will be plenty of space in the TT. But this isn't the situation anyone cares about. When the TT is massively oversubscribed replacing the least valuable entry is much more important. Being able to determine that a vast number of entries are completely worthless, especially those associated with deep searches and regardless of the number of cut offs they caused in the past would seem to be of value.
And even in a table with 100K buckets (extremely small by today's memory standards) an entry stored in the always-replace slot of a bucket would sit there for 100K nodes on average before that bucket is visited for a probe on another position, as these probes should be distributed randomly over the table. That is independent of tree size. During that period it will be available for probing by transpositions, and will cause several hits due to transpositions with moves 2, 4 or 6 ply earlier, as long as the sub-tree from those positions is not yet larger than 100K unique nodes.
Well, two buckets of three is 6 per cache line, right? The fact that they split the line into two buckets, while there would have been no downside in terms of memory bandwidth to use a bucket of 6, should already tell you that bucket size is far from being a problem.hgm wrote: ↑Fri Jul 31, 2020 11:32 pmReally?? Which top tier program uses a bucket size of 7? Or even 6? I don't know of any. Although I must admit I haven't looked. The last time I checked (it's been a while) SF used a bucket size of 3 with two buckets per cache line. I'm not sure this is still true as they recently changed their TT structure to accommodate much larger TT sizes, so I suppose its changed.
BTW, I am not really convinced that 'top tier' programs use good TT replacement. They are often tested in a regime where the table is far from overloaded, so that having a good replacement scheme is not on their agenda. Even with simplistic schemes like shallowest-of-4 the search speed is only a very weak function of table size.
Sure. This is why it is also relevant to know how long ago it was last used.
You are making a flawed assumption here. "Using depth as a criterion for determining when to replace an entry" is not at all the same as mindlessly hanging on to the entries of the largest depth. Always replacing the highest depth would also be a decision that uses no other information than the depth. (Not a very smart one, of course...) To prevent the problem you describe (which is all too real) we have equi-distributed-depth replacement, or under-cut replacement.Further, if the aging method uses depth of search as a criteria for determining when to replace an entry this will simple clog the TT with entries that are no longer useful but haven't aged out yet due to their depth. I run into this problem all the time. If the search is continued long enough the search will become pathological due to huge number of useless entries that have been searched to great depth. Clearing the TT solves this problem, but it also loses all the info in the TT which defeats the purpose of a long search in the first place.
Why is this post under my name? I didn't make it!
And why do I appear to be quoting you when it's actually the other way around?
Are you hacking my account?
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Data structure choice for TT
Oops, I goofed again. I am so sorry. I must have accidentally hit the Edit instead of the Quote button...
Unfortunately there is no way to recover the original posting, but since I quoted most of it in my reply, a large part of it can still be salvaged. I will try to repair as much as I can.
-
- Posts: 27809
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Data structure choice for TT
Zenmastur wrote: ↑Sat Aug 01, 2020 1:33 am
I guess this would be determined by the amount of pressure there is on the TT. If there is little or no TT pressure, I agree there will be plenty of space in the TT. But this isn't the situation anyone cares about. When the TT is massively oversubscribed, replacing the least valuable entry is much more important. Being able to determine that a vast number of entries are completely worthless, especially those associated with deep searches, regardless of the number of cutoffs they caused in the past, would seem to be of value.

Sure, replacement only becomes an issue with severely overloaded tables. I could detect virtually no effect on time-to-depth until the number of nodes gets to be more than 10 times the table size. But the statement isn't really dependent on that, but more on the distribution of nodes in the tree. This is vastly dominated by low-depth nodes. And all competitive replacement schemes always store every node. So the lowest depth in the bucket will reflect the distribution of nodes in the tree, and thus almost always be 0 or 1, not saving you very much search effort on a cutoff hit.
And even in a table with 100K buckets (extremely small by today's memory standards) an entry stored in the always-replace slot of a bucket would sit there for 100K nodes on average before that bucket is visited for a probe on another position, as these probes should be distributed randomly over the table. That is independent of tree size. During that period it will be available for probing by transpositions, and will cause several hits due to transpositions with moves 2, 4 or 6 ply earlier, as long as the sub-tree from those positions is not yet larger than 100K unique nodes.
Zenmastur wrote: ↑Fri Jul 31, 2020 11:32 pm
Really?? Which top-tier program uses a bucket size of 7? Or even 6? I don't know of any, although I must admit I haven't looked. The last time I checked (it's been a while), SF used a bucket size of 3, with two buckets per cache line. I'm not sure this is still true, as they recently changed their TT structure to accommodate much larger TT sizes, so I suppose it's changed.

Well, two buckets of three is 6 per cache line, right? The fact that they split the line into two buckets, while there would have been no downside in terms of memory bandwidth to use a bucket of 6, should already tell you that bucket size is far from being a problem.
BTW, I am not really convinced that 'top tier' programs always use good TT replacement. They are often tested in a regime where the table is far from overloaded, so that having a good replacement scheme is not on their agenda. Even with simplistic schemes like shallowest-of-4 the search speed is only a very weak function of table size.
Sure. This is why I mentioned age as one of the other criteria.
Zenmastur wrote: ↑Fri Jul 31, 2020 11:32 pm
Further, if the aging method uses depth of search as a criterion for determining when to replace an entry, this will simply clog the TT with entries that are no longer useful but haven't aged out yet due to their depth. I run into this problem all the time. If the search is continued long enough, the search will become pathological due to the huge number of useless entries that have been searched to great depth. Clearing the TT solves this problem, but it also loses all the info in the TT, which defeats the purpose of a long search in the first place.

You are making a flawed assumption here. "Using depth as a criterion for determining when to replace an entry" is not at all the same as mindlessly hanging on to the entries of the largest depth. Always replacing the highest depth in the bucket would also be a decision that uses no other information than the depth. (Not a very smart one, of course... But with completely different behavior.) To prevent the problem you describe (which is all too real) we have equi-distributed-depth replacement, or under-cut replacement.
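As a rough illustration of the under-cut idea only (equi-distributed-depth is not shown), here is a hedged sketch assuming a two-slot bucket with a depth-preferred and an always-replace slot; the entry layout and function names are hypothetical, not from any particular engine:

```c
#include <stdint.h>

typedef struct { uint64_t key; int depth; } Entry;

/* One depth-preferred slot plus one always-replace slot. */
typedef struct { Entry deep, always; } Bucket;

/* Under-cut replacement: the deep slot is replaced not only by an
 * equal-or-deeper search, but also by one of depth exactly one less.
 * Stale deep entries therefore erode gradually instead of clogging
 * the table forever; everything else goes to the always slot. */
void store(Bucket *b, uint64_t key, int depth) {
    if (depth >= b->deep.depth || depth == b->deep.depth - 1) {
        b->deep.key = key;
        b->deep.depth = depth;
    } else {
        b->always.key = key;
        b->always.depth = depth;
    }
}
```

The point of the under-cut condition is that a depth-10 entry which is never probed again can still be displaced, but only slowly (first by a depth-9 store, then depth-8, and so on), so genuinely useful deep entries survive bursts of shallow traffic.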
-
- Posts: 725
- Joined: Tue Dec 18, 2007 9:38 pm
- Location: Munich, Germany
- Full name: Dr. Oliver Brausch
Re: Data structure choice for TT
OliThink uses self-defined hash tables and has no collisions at all. Moreover, it uses its own random generator within its code, so the results are independent of the machine.
It uses:
1) A predefined array of random values, indexed by 12 bits (4096 entries). It's important that those values are perfectly random.
Code: Select all
static u64 hashxor[4096];
for (i = 0; i < 4096; i++) hashxor[i] = _rand_64();
This hash value is updated by:
Code: Select all
u64 hashb;
#define XORHASH(f, p, c) hashb ^= hashxor[(f) | (p) << 6 | (c) << 9]
With XOR you can use the same method for moving and unmoving a piece from square "f" to square "t":
Code: Select all
void move(Move m, int c, int d) {
int f = FROM(m);
int t = TO(m);
int p = PIECE(m);
XORHASH(f, p, c);
XORHASH(t, p, c);
...
PS: The 64 bit random generator:
Code: Select all
typedef unsigned long long u64;
typedef unsigned long u32;
typedef int Move;
#define LOW16(x) ((x) & 0xFFFF)
#define LOW32(x) ((x) & 0xFFFFFFFF)
static u32 r_x = 30903, r_y = 30903, r_z = 30903, r_w = 30903, r_carry = 0;
u32 _rand_32() {
r_x = LOW32(r_x * 69069 + 1);
r_y ^= LOW32(r_y << 13);
r_y ^= LOW32(r_y >> 17);
r_y ^= LOW32(r_y << 5);
u32 t = LOW32((r_w << 1) + r_z + r_carry);
r_carry = (LOW32(r_z >> 2) + LOW32(r_w >> 3) + LOW32(r_carry >> 2)) >> 30;
r_z = r_w;
r_w = t;
return LOW32(r_x + r_y + r_w);
}
u64 _rand_64() { u64 c = _rand_32(); return _rand_32() | (c << 32); }
Code: Select all
u64 _rand_64() { return _rand_32() | ((u64)_rand_32() << 32); } /* note: the order in which the two calls are evaluated is unspecified in C */
-
- Posts: 266
- Joined: Fri Jul 10, 2015 9:23 pm
- Location: Russia
Re: Data structure choice for TT
OliverBr wrote: ↑Sun Aug 02, 2020 1:11 am
PS: The 64 bit random generator:
Code: Select all
typedef unsigned long long u64;
typedef unsigned long u32;
typedef int Move;
#define LOW16(x) ((x) & 0xFFFF)
#define LOW32(x) ((x) & 0xFFFFFFFF)
static u32 r_x = 30903, r_y = 30903, r_z = 30903, r_w = 30903, r_carry = 0;
u32 _rand_32() {
    r_x = LOW32(r_x * 69069 + 1);
    r_y ^= LOW32(r_y << 13);
    r_y ^= LOW32(r_y >> 17);
    r_y ^= LOW32(r_y << 5);
    u32 t = LOW32((r_w << 1) + r_z + r_carry);
    r_carry = (LOW32(r_z >> 2) + LOW32(r_w >> 3) + LOW32(r_carry >> 2)) >> 30;
    r_z = r_w;
    r_w = t;
    return LOW32(r_x + r_y + r_w);
}
u64 _rand_64() { u64 c = _rand_32(); return _rand_32() | (c << 32); }

Why so much complexity?
P.S.
my engine's hash-key initialization function:
Code: Select all
void initialize_hash_keys()
{
for(U64 m=6364136223846793005,s=1,n=1,i=768; i--; hashkeys[i]=n=n*m+s);
}
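For readability, the packed one-liner above unrolls to the following equivalent (using stdint types; the multiplier is the one Knuth uses for the MMIX linear congruential generator, here with increment 1, and 768 = 12 piece types × 64 squares as in the post):

```c
#include <stdint.h>

uint64_t hashkeys[768]; /* 12 piece types x 64 squares */

/* Expanded equivalent of the packed one-line initializer: a 64-bit
 * LCG n' = n*m + 1, filling the table from the highest index down. */
void initialize_hash_keys(void) {
    const uint64_t m = 6364136223846793005ULL; /* Knuth's MMIX multiplier */
    uint64_t n = 1;
    for (int i = 768; i--; ) {
        n = n * m + 1;    /* advance the generator */
        hashkeys[i] = n;  /* store the new state as a key */
    }
}
```

Low-order bits of a pure power-of-two-modulus LCG have short periods, which is relevant to the "same level of randomness" question asked below: successive states are far from independent, even though each state taken alone looks random.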
Eugene Kotlov
Hedgehog 2.1 64-bit coming soon...
-
- Posts: 725
- Joined: Tue Dec 18, 2007 9:38 pm
- Location: Munich, Germany
- Full name: Dr. Oliver Brausch
Re: Data structure choice for TT
Because I thought this was a perfect random generator.
This would be very interesting, if it had the same level of randomness. Are you sure of it? Am I free to use it?
P.S.
my engine's hash-key initialization function:
Code: Select all
void initialize_hash_keys()
{
    for(U64 m=6364136223846793005,s=1,n=1,i=768; i--; hashkeys[i]=n=n*m+s);
}
EDIT: I see this magic number in https://en.wikipedia.org/wiki/Linear_co ... _generator.