I don't think race conditions are undefined behavior in the C standard. They're just ignored by it, which is not the same thing.hgm wrote:Well, it was said somewhere that race conditions are mentioned in the standard as a form of undefined behavior, and multiple threads doing reads and writes to the same memory is definitely able to produce race conditions. So if the compiler would see 'volatile' as evidence that some other thread or hardware might be altering the values, and it sees non-atomic accesses, it cannot exclude race conditions would occur. And it could grab that as an excuse to completely delete the accesses from the program, or make a call to fdisk, or whatever.mvk wrote:I think you are confusing an unknown/undefined value with undefined behaviour. The latter is a technical term with a specific, well-defined, meaning. Reading a volatile variable isn't UB.
How could a compiler break the lockless hashing method?
Moderator: Ras
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: How could a compiler break the lockless hashing method?
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: How could a compiler break the lockless hashing method?
Yeah that came out wrong. What I should have said is that the mere potential for a data race is not undefined. Only an actual race is.
-
- Posts: 931
- Joined: Tue Mar 09, 2010 3:46 pm
- Location: New York
- Full name: Álvaro Begué (RuyDos)
Re: How could a compiler break the lockless hashing method?
The draft of the C11 standard, under "5.1.2.4 Multi-threaded executions and data races", saysrbarreira wrote:I don't think race conditions are undefined behavior in the C standard. They're just ignored by it, which is not the same thing.hgm wrote:Well, it was said somewhere that race conditions are mentioned in the standard as a form of undefined behavior, and multiple threads doing reads and writes to the same memory is definitely able to produce race conditions. So if the compiler would see 'volatile' as evidence that some other thread or hardware might be altering the values, and it sees non-atomic accesses, it cannot exclude race conditions would occur. And it could grab that as an excuse to completely delete the accesses from the program, or make a call to fdisk, or whatever.mvk wrote:I think you are confusing an unknown/undefined value with undefined behaviour. The latter is a technical term with a specific, well-defined, meaning. Reading a volatile variable isn't UB.
25 The execution of a program contains a data race if it contains two conflicting actions in
different threads, at least one of which is not atomic, and neither happens before the
other. Any such data race results in undefined behavior.
-
- Posts: 900
- Joined: Tue Apr 27, 2010 3:48 pm
Re: How could a compiler break the lockless hashing method?
Yeah I was totally wrong, see my other reply.AlvaroBegue wrote:The draft of the C11 standard, under "5.1.2.4 Multi-threaded executions and data races", saysrbarreira wrote:I don't think race conditions are undefined behavior in the C standard. They're just ignored by it, which is not the same thing.hgm wrote:Well, it was said somewhere that race conditions are mentioned in the standard as a form of undefined behavior, and multiple threads doing reads and writes to the same memory is definitely able to produce race conditions. So if the compiler would see 'volatile' as evidence that some other thread or hardware might be altering the values, and it sees non-atomic accesses, it cannot exclude race conditions would occur. And it could grab that as an excuse to completely delete the accesses from the program, or make a call to fdisk, or whatever.mvk wrote:I think you are confusing an unknown/undefined value with undefined behaviour. The latter is a technical term with a specific, well-defined, meaning. Reading a volatile variable isn't UB.25 The execution of a program contains a data race if it contains two conflicting actions in
different threads, at least one of which is not atomic, and neither happens before the
other. Any such data race results in undefined behavior.
In light of that, I may have to backtrack on my claim that volatile completely fixes lockless hashing (if that's what I said). But it does prevent the xor-cancellation optimization.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How could a compiler break the lockless hashing method?
It is simple, requiring 1 cycle (xor is very fast). It prevents corruption when you have two different pieces of two different entries that appear to belong together but do not.Daniel Shawul wrote:I wonder what the innovation is with the lockless hashing scheme. Accidental data corruption, be it due to concurrent data modification as is the case here ,or data transfer through a noisy network etc, can be checked and handled via cyclic redundancy checks (CRC). Ofcourse this will not prevent deliberate attacks. But it seems to me the lockless hashing scheme is pretty much the same as a simple CRC check that XORs hash signatures of the data and appends the key at the end. This is especially obvious when the hash entry has more than 128 bits of data, that requires a a chain of XORing. So what is the difference?
It is not earth-shattering. It is just a simple solution to an aggravating problem.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How could a compiler break the lockless hashing method?
I think volatile will make it worse. The problem is the time span between fetching one of the words and the other. Each time you fetch the second again, the time span has increased, allowing another thread to store over that value. It doesn't cause any significant errors if that does happen, but lockless hashing was developed to eliminate most of those.hgm wrote:So what is the bottom line here? Is declaring the hash table entry as 'volatile' sufficient guarantee against the compiler subverting the code?
(Of course I would write the code as making a one-time copy of the entry, testing that one for validity, and using it when it passed:)Code: Select all
HashEntry entry = hashTable[index^i]; for(i=0; i<4; i++) { if((entry.signature ^ entry.data) == hashKey) break; } ... draft = entry.data.draft; ... // etc.
If the compiler is stupid, anything can happen. None I have checked in the past two days do anything other than what my code tells them to, load both words back-to-back and then not reload them from the shared hash table again, period.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How could a compiler break the lockless hashing method?
Someone posted an excerpt from a recent version of the standard that specifically called races "undefined behavior".rbarreira wrote:I don't think race conditions are undefined behavior in the C standard. They're just ignored by it, which is not the same thing.hgm wrote:Well, it was said somewhere that race conditions are mentioned in the standard as a form of undefined behavior, and multiple threads doing reads and writes to the same memory is definitely able to produce race conditions. So if the compiler would see 'volatile' as evidence that some other thread or hardware might be altering the values, and it sees non-atomic accesses, it cannot exclude race conditions would occur. And it could grab that as an excuse to completely delete the accesses from the program, or make a call to fdisk, or whatever.mvk wrote:I think you are confusing an unknown/undefined value with undefined behaviour. The latter is a technical term with a specific, well-defined, meaning. Reading a volatile variable isn't UB.
Which I find amusing because no compiler on the planet can even recognize a race condition at compile time...
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How could a compiler break the lockless hashing method?
The compiler can't do the xor-cancellation anyway. This is a global variable. The compiler can not see all the other places that access the variable. Removing an xor in one place would break all the others. The compiler can only do such optimizations if it can see the variable for its entire life-time, which in C it can't do due to separate files.rbarreira wrote:Yeah I was totally wrong, see my other reply.AlvaroBegue wrote:The draft of the C11 standard, under "5.1.2.4 Multi-threaded executions and data races", saysrbarreira wrote:I don't think race conditions are undefined behavior in the C standard. They're just ignored by it, which is not the same thing.hgm wrote:Well, it was said somewhere that race conditions are mentioned in the standard as a form of undefined behavior, and multiple threads doing reads and writes to the same memory is definitely able to produce race conditions. So if the compiler would see 'volatile' as evidence that some other thread or hardware might be altering the values, and it sees non-atomic accesses, it cannot exclude race conditions would occur. And it could grab that as an excuse to completely delete the accesses from the program, or make a call to fdisk, or whatever.mvk wrote:I think you are confusing an unknown/undefined value with undefined behaviour. The latter is a technical term with a specific, well-defined, meaning. Reading a volatile variable isn't UB.25 The execution of a program contains a data race if it contains two conflicting actions in
different threads, at least one of which is not atomic, and neither happens before the
other. Any such data race results in undefined behavior.
In light of that, I may have to backtrack on my claim that volatile completely fixes lockless hashing (if that's what I said). But it does prevent the xor-cancellation optimization.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: How could a compiler break the lockless hashing method?
Not sure what you mean. I use it also for my pawn hash table which is far longer than 128 bytes per entry...Daniel Shawul wrote:I also belive that the lockless hashing scheme as described in the paper was meant for 128bit hash entry, not something claimed by mouth. If indeed the latter was true, the connection with CRC is unmissable. With two 64-bits it is easy to miss the connection, otherwise it is mandatory to have a reference to an exisitng CRC literature that it is virtually identical to. 'You can't have your cake and eat it too' describes what is going on here well.I used to think of it as something different until I needed it for bigger data structures in nebiyu
-
- Posts: 292
- Joined: Tue Jul 07, 2009 4:56 am
Re: How could a compiler break the lockless hashing method?
Well, what would you call them if not "undefined behavior"? The standard has to dictate what will happen, but the actions aren't ordered with respect to each other and we can't guarantee that the outcome will be the same as if one preceded the other, because it's not true.bob wrote:Someone posted an excerpt from a recent version of the standard that specifically called races "undefined behavior".rbarreira wrote:I don't think race conditions are undefined behavior in the C standard. They're just ignored by it, which is not the same thing.hgm wrote:Well, it was said somewhere that race conditions are mentioned in the standard as a form of undefined behavior, and multiple threads doing reads and writes to the same memory is definitely able to produce race conditions. So if the compiler would see 'volatile' as evidence that some other thread or hardware might be altering the values, and it sees non-atomic accesses, it cannot exclude race conditions would occur. And it could grab that as an excuse to completely delete the accesses from the program, or make a call to fdisk, or whatever.mvk wrote:I think you are confusing an unknown/undefined value with undefined behaviour. The latter is a technical term with a specific, well-defined, meaning. Reading a volatile variable isn't UB.
Which I find amusing because no compiler on the planet can even recognize a race condition at compile time...
It's not that the compiler detects undefined behavior and then goes off and does something insane on purpose. It's a warning that the results of unordered writes to memory by different threads aren't guaranteed to have any particular result and your ideas about what's correct or reasonable in those cases might not correspond to what actually happens.
Last edited by Aaron Becker on Tue Dec 10, 2013 8:38 pm, edited 1 time in total.