Not sure I understand what you mean--it seems like the opposite to me. The smaller the table is, the less bits you are using to index it, and the more likely the top bits are different. As an extreme example, a table with 1 entry will have virtually 100% collisions, and one with 2^64 should theoretically get exactly 0% (if your hash function is good enough). So this makes it more likely that if we're accessing the same index in the table (=same cache line) that it's the same position.Gian-Carlo Pascutto wrote:Only if you're using tiny tables because they're CPU-local...with a large table and hit rates over 90%, the typical case should be 2, no?Zach Wegner wrote: 1. We are looking up the same position that we stored last time. The other CPU that overwrote the entry read the entry that we stored, failed the hash key check, and then it stored the results for a different position once it is done. Technically the other processor could be storing the same position if we stored the entry after it probed, but this case will be pretty rare, and it hurts anyways. So the entry we needed got overwritten with a useless one.
2. We are looking up a different position. There are two sub-cases:
2a. The other CPU overwrote the entry with the position we want
2b. The other CPU overwrote the entry with a third (or possibly the first due to a race condition)
1 is obviously bad. 2b is bad too, but probably 2a happens more, and that is the only case that helps. I'd say that 1 is much more common than 2 though.
Just to make sure we're on the same page, I'll lay out exactly what's happening in each case:
Case 1:
- CPU0 probes pawn position P in the pawn table entry which maps to cache line L and misses. L is marked as exclusive.
- CPU0 evaluates the position and writes it back to memory. L is marked as modified.
- CPU1 probes pawn position P', which also maps to L, which is invalid in 1's cache. L marked as owned in 0 and shared in 1.
- CPU1 evaluates and stores the entry. L is set to modified in 1 and invalid in 0.
- CPU0 probes again with position P, L is set to owned in 1 and shared in 0.
Case 2a:
- CPU0 probes pawn position P' in the pawn table entry which maps to cache line L and misses. L is marked as exclusive.
- CPU0 evaluates the position and writes it back to memory. L is marked as modified.
- CPU1 probes pawn position P, which also maps to L. L marked as owned in 0 and shared in 1.
- CPU1 evaluates and stores the entry. L is set to modified in 1, and invalid in 0.
- CPU0 probes again but with position P, L is set to owned in 1 and shared in 0, gets a pawn table hit.
Case 2b:
- CPU0 probes pawn position P' in the pawn table entry which maps to cache line L and misses. L is marked as exclusive.
- CPU0 evaluates the position and writes it back to memory. L is marked as modified.
- CPU1 probes pawn position P'', which also maps to L. L marked as owned in 0 and shared in 1.
- CPU1 evaluates and stores the entry. L is set to modified in 1, and invalid in 0.
- CPU0 probes again but with position P, L is set to owned in 1 and modified in 0.
You or Bob might need to check the coherency states, I think that's all right though.