Onno Garms wrote:
bob wrote:
(1) validate the move, assuming that making an illegal move will corrupt your board or other data structures;
(2) use an atomic lock to make sure that the move part of an entry goes with the corresponding signature part;
(3) use the lockless approach which is very low cost.
Any of the above will work just fine, but (1) introduces overhead for the validation procedure, while (2) introduces tons of memory/cache issues as well as stalling processes when there really is no conflict. I just took the solution with the least impact on NPS, since all work equally well.
Why does (2) stall processes when there is no conflict? Why does it introduce cache issues? Are you talking about a single lock for the whole transposition table?
You can lock however you want. You certainly do not want a lock for each hash entry. The lock access is _terribly_ slow. I have (in the past) tried a single lock for the entire table, which begins to break down beyond 4 processors as the cache traffic becomes horrendous. I have used "block" locks so that there was a lock for each N hash entries (where N was reasonably large to reduce the number of locks to something manageable). But again, the lock mechanism is a hardware nightmare and floods the cache controllers with cache-to-cache traffic out the wazoo.
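A minimal sketch of that "block" locking idea, assuming a C11 spinlock per N entries (all sizes and names here are made-up examples, not what any particular engine does):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define TABLE_ENTRIES (1u << 16)          /* entries, power of two (example size) */
#define LOCK_STRIDE   (1u << 10)          /* entries guarded by each lock */
#define NUM_LOCKS     (TABLE_ENTRIES / LOCK_STRIDE)

typedef struct { uint64_t sig; uint64_t data; } LockedEntry;

static LockedEntry hash_table[TABLE_ENTRIES];
static atomic_flag hash_locks[NUM_LOCKS]; /* static zero-init == clear on mainstream compilers */

static void locked_store(uint64_t key, uint64_t data) {
    uint32_t idx = (uint32_t)(key & (TABLE_ENTRIES - 1));
    atomic_flag *lock = &hash_locks[idx / LOCK_STRIDE];

    /* Spin until we own the lock. Every acquisition is a write, so the
       lock's cache line ping-pongs between cores even when the threads
       wanted different entries under the same lock: this is the
       coherency traffic the text complains about. */
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        ;
    hash_table[idx].sig  = key;
    hash_table[idx].data = data;
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

Even with many locks, every store (and, with a lock, every probe too) dirties a shared lock line, which is the cost the lockless scheme avoids entirely.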
You will have some conflicts for the same cache line coming from main memory when two (or more) threads access the same 64 bytes of memory, but for reads you need no protection with the XOR trick: you never modify the hash entry at the lookup point (or you might rarely modify it to update the "age" field, perhaps). But if you have a lock, you modify it _every_ time you access it, and there goes the cache coherency traffic again.
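The XOR trick itself can be sketched in a few lines, assuming a two-word entry (the struct and field names are illustrative, not Crafty's actual layout):

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint64_t word1;   /* key ^ data */
    uint64_t word2;   /* data: move, score, depth, age packed together */
} HashEntry;

static void hash_store(HashEntry *e, uint64_t key, uint64_t data) {
    /* Two plain stores, no lock. If another thread interleaves its own
       two stores, the entry ends up "torn", but the probe below rejects
       it because word1 ^ word2 no longer reproduces any probing key. */
    e->word1 = key ^ data;
    e->word2 = data;
}

static int hash_probe(const HashEntry *e, uint64_t key, uint64_t *data) {
    uint64_t w1 = e->word1;
    uint64_t w2 = e->word2;
    if ((w1 ^ w2) != key)
        return 0;          /* empty slot, other position, or torn write */
    *data = w2;
    return 1;
}
```

The point is that a mismatched pair of words fails validation exactly like a signature mismatch would, so torn writes cost nothing beyond an occasional wasted probe.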
For storing, you have a cache coherency issue on the block that contains the modified entry, but there is little conflict on any single entry when the table is large. Some, but not a large amount. But a lock is way more expensive, and by definition has way more conflicts, hence way more coherency traffic.
Sure, (3) is the fastest. But it introduces errors in the hashfull counter (which is required by the UCI protocol). Also, locking is easier to understand.
(a) Who the heck cares. Lie to the UCI protocol if you want. It couldn't care less about that number. All it does is display the value. (b) Why would it be wrong? I just run through the hash table checking the "age" field and count the number of entries where entry->age == current_age. Notice that I XOR to modify the signature that is stored; the data is never modified.
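That counting scheme can be sketched as follows, assuming (hypothetically) that the age lives in the low 8 bits of the data word; real engines often sample only a subset of the table instead of walking all of it:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t word1;   /* key ^ data -- never examined here             */
    uint64_t word2;   /* data; low 8 bits = age of the storing search  */
} AgedEntry;

static unsigned entry_age(const AgedEntry *e) {
    return (unsigned)(e->word2 & 0xff);
}

/* Per-mille of entries written during the current search, as the UCI
   "hashfull" info expects. This stays exact under lockless XOR storing,
   because only the signature word is scrambled, never the data word. */
static unsigned hashfull(const AgedEntry *t, size_t n, unsigned age) {
    size_t used = 0;
    for (size_t i = 0; i < n; i++)
        if (entry_age(&t[i]) == age)
            used++;
    return (unsigned)(used * 1000 / n);
}
```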
Btw., a purist might argue that you need to check the trans move anyway because of the possibility of two different positions with same hash. If you store all 64 bits of the hash key in the entry this might be negligible, if you store just 32 bit it seems important to me:
Sooner or later the hash table is full. So when I store an entry, I find an entry for some other position already in its place. There is a 1 in 2^32 chance that this position has the same 32-bit hash. Assuming 1 M stores/s, this problem can be expected roughly every 2^32 / 10^6 ≈ 4300 seconds (about 72 minutes). This must be scaled down by the probability of genuine hash hits (where the entry really belongs to the same position), but the magnitude is not affected by that.
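The arithmetic behind that estimate, as a tiny helper (names are mine, for illustration): a store landing on an occupied slot falsely "matches" with probability 2^-bits, so at a given store rate the expected interval between false matches is 2^bits divided by the rate. This is an upper-bound flavor of estimate, since not every store hits an occupied slot with a foreign position.

```c
#include <assert.h>
#include <stdint.h>

/* Expected seconds between false signature matches when only `bits`
   signature bits are kept per entry and `rate` stores per second land
   on occupied slots. (bits must be < 64 to avoid shift overflow.) */
static double false_match_interval(unsigned bits, double rate) {
    return (double)(1ULL << bits) / rate;
}
```

With 32 stored bits at 1e6 stores/s this gives about 4295 seconds; with the full 64 bits stored, the interval grows by a factor of 2^32 and becomes irrelevant in practice.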
I have never tested this, but because of the above reasoning I decided to check the trans move.
I check the trans/ref best move as well, and when (if) it fails, I write out a log entry. In our cluster testing, after each test is done, I extract any ERROR messages from the logs before tossing them. The only ones I have seen in the last 30M games or so were related to an illegal move to ponder, caused by a bad call to HashStore() after an EGTB probe hit at ply=2, where I unintentionally stored an invalid best move. I have not seen a single "ERROR: bad move from hash table" message since I cleared the ERROR messages about 30M games ago, and no ERROR messages at all for almost 4M games now, since the EGTB issue was fixed.
In Crafty I store the entire 64-bit signature. A few years ago, several of us on r.g.c.c ran exhaustive tests with 64-bit hashes. I burned weeks of Cray time, storing the normal hash plus a real FEN board description, and comparing the FEN whenever the signatures matched. I searched many trillions of nodes playing real games and never had one collision.
For me, pawn hashing is the real issue because I store a bunch of data, and some of it can't be wrong: if a mask says there is a passed pawn on a file but there is no pawn there at all, the passed pawn scoring goes into an infinite loop. I could either validate the data as I use it, which would slow things down, or validate the data when I look it up, which would mean re-computing the data and losing the hash benefit. The lockless hash trick is used on the pawn hash table in exactly the same way, to solve the same problem. The only difference is that it is critical for me on pawn hashing, and not that important on the regular hash.
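The kind of internal consistency that is at stake can be sketched like this, with hypothetical field names (a real pawn hash entry carries far more data):

```c
#include <assert.h>
#include <stdint.h>

/* A torn pawn-hash entry could flag a "passed pawn" on a square with
   no pawn on it, sending the passed-pawn scorer into a loop over a
   pawn that does not exist -- which is why the same XOR trick is
   applied to the pawn hash as to the main table. */
typedef struct {
    uint64_t pawns;    /* bitboard of this side's pawns               */
    uint64_t passed;   /* cached bitboard of its passed pawns         */
} PawnHashEntry;

static int pawn_entry_consistent(const PawnHashEntry *e) {
    /* every flagged passed pawn must coincide with a real pawn */
    return (e->passed & ~e->pawns) == 0;
}
```

Checking this explicitly on every probe would work, but the XOR-validated store makes a torn (and thus possibly inconsistent) entry fail the signature test before its contents are ever trusted.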