A note for C programmers

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

Rein Halbersma wrote:
bob wrote:
Rein Halbersma wrote:
bob wrote:Food for thought. The English language is well-defined.
Think again: http://english.stackexchange.com/questi ... nt-regions

Human languages are anything but well-defined, they are extremely context-sensitive (time, place, person).
I'll bet you won't find one native English-speaker that would say "undefined" and "unspecified" don't mean the same basic concept unless you want to get down to a minute semantic war based on what is meant by "is" for example.
Unfortunately, computer programs are not (yet) written in English. You write in C, and C happens to define those terms differently (and, as Miguel explained, more precisely according to their etymological roots) than colloquial speech does. The Standard is what is relevant here, not your dictionary.
The man page is written in English. Not C. As is the C standard. Do you believe it is necessary for ALL programmers to read the complete C/C++ standard and understand it before they are qualified to write programs? That's an enormous stretch.

The fact that the C standard redefines standard English words to mean something somewhat different than their normal interpretation does not make it correct to do so.

I'm still reading all the bug reports from software that is broken on Mavericks. It has affected THOUSANDS of applications.
bob

Post by bob »

Rein Halbersma wrote:
bob wrote:
Rein Halbersma wrote:
bob wrote:Food for thought. The English language is well-defined.
Think again: http://english.stackexchange.com/questi ... nt-regions

Human languages are anything but well-defined, they are extremely context-sensitive (time, place, person).
BTW, here is another "undefined" activity. A race condition in a parallel program. Shouldn't happen, correct? But EVERYBODY is allowing race conditions when they store in their hash table. I suppose it would be OK for Apple to just crash any program where that happens? Even though we KNOW what we are doing?

That's pretty malicious compiler behavior, IMHO. Just because you shouldn't do something when you don't know what you are doing, since it MIGHT cause a problem, does not mean you shouldn't do it when you do know what to expect.
Glad you brought that up. Yes, your classic hash XOR paper, for all its beauty, exploits undefined behavior. And no, you DON'T know what you are doing unless you write in assembly and bypass the C compiler for that piece of code. Expect a crash report any time soon.
Sorry, but wrong. There are only N possible behaviors. The XOR trick works for ANY of them. Hence it will not fail, ever. Which was the intent. The undefined behavior is solely based on the order writes are done in by different cores. No order can break this and make it crash. No order can break this and cause a false hash match.

Won't ever be a crash report for this piece of code. Just because YOU can't interpret the possible outcomes doesn't mean I can't. Just because you can't deal with the different possible outcomes doesn't mean I can't.
AlvaroBegue
Posts: 931
Joined: Tue Mar 09, 2010 3:46 pm
Location: New York
Full name: Álvaro Begué (RuyDos)

Post by AlvaroBegue »

bob wrote:The man page is written in English. Not C. As is the C standard. Do you believe it is necessary for ALL programmers to read the complete C/C++ standard and understand it before they are qualified to write programs? That's an enormous stretch.
Agreed. However, my man page on strcpy says this:
man -s 3 strcpy wrote: DESCRIPTION
The strcpy() function copies the string pointed to by src, including the terminating null byte ('\0'), to the buffer
pointed to by dest. The strings may not overlap, and the destination string dest must be large enough to receive the
copy.
There it is, written in plain English, without any redefinitions.
I'm still reading all the bug reports from software that is broken on Mavericks. It has affected THOUSANDS of applications.
Is anyone bitching and moaning as much as you are, or are they just admitting their code has a problem they need to fix?
bob

Post by bob »

AlvaroBegue wrote:
bob wrote:The man page is written in English. Not C. As is the C standard. Do you believe it is necessary for ALL programmers to read the complete C/C++ standard and understand it before they are qualified to write programs? That's an enormous stretch.
Agreed. However, my man page on strcpy says this:
man -s 3 strcpy wrote: DESCRIPTION
The strcpy() function copies the string pointed to by src, including the terminating null byte ('\0'), to the buffer
pointed to by dest. The strings may not overlap, and the destination string dest must be large enough to receive the
copy.
There it is, written in plain English, without any redefinitions.
I'm still reading all the bug reports from software that is broken on Mavericks. It has affected THOUSANDS of applications.
Is anyone bitching and moaning as much as you are, or are they just admitting their code has a problem they need to fix?
I'll leave that as an exercise to the reader + a google search. I don't think many are amused at something being capriciously broken by one specific vendor....

BTW I fixed the code when the problem was found, thank you. Doesn't make Apple's behavior reasonable, however. "Let's see how many software development projects we can screw up by taking a draconian stance on something that is anything but important. Let's not actually FIX strcpy() so that it works for overlapping source/destination strings, that would be too easy. Let's just break it and see who yells..."
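
For readers who hit the same overlap problem: a minimal sketch of a copy that is defined for overlapping source and destination, built on memmove(), which the C standard specifies for overlapping regions. The name overlap_safe_strcpy is hypothetical and is not part of any library; this is one possible fix, not necessarily the one applied in the code discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a library function): copies the NUL-terminated
 * string at src to dest and stays well defined when the regions overlap.
 * The caller must still guarantee room for strlen(src) + 1 bytes at dest. */
static char *overlap_safe_strcpy(char *dest, const char *src)
{
    assert(dest != NULL && src != NULL);
    size_t n = strlen(src) + 1;   /* measure first, include the '\0' */
    return memmove(dest, src, n); /* memmove handles overlapping regions */
}
```

Typical use is dropping a prefix in place, e.g. overlap_safe_strcpy(buf, buf + 2), which is exactly the pattern that is undefined with plain strcpy().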
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Post by Rein Halbersma »

bob wrote: The man page is written in English. Not C. As is the C standard. Do you believe it is necessary for ALL programmers to read the complete C/C++ standard and understand it before they are qualified to write programs? That's an enormous stretch.
No, I do not expect every X programmer to read the X Standard. However, I do expect professionals to at least be able to read it and look up details when they are in doubt. For more casual programmers, the C-FAQ will do nicely. And there is always Stackoverflow for concrete questions.
petero2
Posts: 684
Joined: Mon Apr 19, 2010 7:07 pm
Location: Sweden
Full name: Peter Osterlund

Post by petero2 »

Rein Halbersma wrote:
bob wrote:
Rein Halbersma wrote:
bob wrote:Food for thought. The English language is well-defined.
Think again: http://english.stackexchange.com/questi ... nt-regions

Human languages are anything but well-defined, they are extremely context-sensitive (time, place, person).
BTW, here is another "undefined" activity. A race condition in a parallel program. Shouldn't happen, correct? But EVERYBODY is allowing race conditions when they store in their hash table. I suppose it would be OK for Apple to just crash any program where that happens? Even though we KNOW what we are doing?

That's pretty malicious compiler behavior, IMHO. Just because you shouldn't do something when you don't know what you are doing, since it MIGHT cause a problem, does not mean you shouldn't do it when you do know what to expect.
Glad you brought that up. Yes, your classic hash XOR paper, for all its beauty, exploits undefined behavior. And no, you DON'T know what you are doing unless you write in assembly and bypass the C compiler for that piece of code. Expect a crash report any time soon.
If you program in C++11, you can use std::atomic and std::memory_order_relaxed to avoid the undefined behavior without any runtime cost, at least on x86-64 hardware.

This will of course not avoid the non-determinism caused by potential memory reordering issues, but it does avoid the undefined behavior as defined in the C++11 standard.
Rein Halbersma

Post by Rein Halbersma »

bob wrote:
Rein Halbersma wrote: Glad you brought that up. Yes, your classic hash XOR paper, for all its beauty, exploits undefined behavior. And no, you DON'T know what you are doing unless you write in assembly and bypass the C compiler for that piece of code. Expect a crash report any time soon.
Sorry, but wrong. There are only N possible behaviors. The XOR trick works for ANY of them. Hence it will not fail, ever. Which was the intent. The undefined behavior is solely based on the order writes are done in by different cores. No order can break this and make it crash. No order can break this and cause a false hash match.

Won't ever be a crash report for this piece of code. Just because YOU can't interpret the possible outcomes doesn't mean I can't. Just because you can't deal with the different possible outcomes doesn't mean I can't.
No, the current C11 and C++11 Standards both declare this as undefined behavior. Your machine might provide atomic instructions that will get the right result for all possible combinations of reads and writes. But the compiler is not required to actually translate a raw XOR operator to those instructions. It requires an atomic type or an explicit lock to avoid undefined behavior.
Rein Halbersma

Post by Rein Halbersma »

petero2 wrote:
Rein Halbersma wrote:
Glad you brought that up. Yes, your classic hash XOR paper, for all its beauty, exploits undefined behavior. And no, you DON'T know what you are doing unless you write in assembly and bypass the C compiler for that piece of code. Expect a crash report any time soon.
If you program in C++11, you can use std::atomic and std::memory_order_relaxed to avoid the undefined behavior without any runtime cost, at least on x86-64 hardware.

This will of course not avoid the non-determinism caused by potential memory reordering issues, but it does avoid the undefined behavior as defined in the C++11 standard.
Thanks for mentioning that. The C++11 Standard quote I wanted to paste was from 1.10/21
21 The execution of a program contains a data race if it contains two conflicting actions in different threads,
at least one of which is not atomic, and neither happens before the other. Any such data race results in
undefined behavior.
However, properly explaining what that means requires introducing the Standard's redefinition of the plain English term "happens before". (1.10 points 9 through 12).

In any case, "neither happens before the other" could of course cover different orders, and even if each order in itself would be OK, the undefined behavior gives the compiler leeway to do anything and does not force it to choose any of those orders.
bob

Post by bob »

Rein Halbersma wrote:
bob wrote:
Rein Halbersma wrote: Glad you brought that up. Yes, your classic hash XOR paper, for all its beauty, exploits undefined behavior. And no, you DON'T know what you are doing unless you write in assembly and bypass the C compiler for that piece of code. Expect a crash report any time soon.
Sorry, but wrong. There are only N possible behaviors. The XOR trick works for ANY of them. Hence it will not fail, ever. Which was the intent. The undefined behavior is solely based on the order writes are done in by different cores. No order can break this and make it crash. No order can break this and cause a false hash match.

Won't ever be a crash report for this piece of code. Just because YOU can't interpret the possible outcomes doesn't mean I can't. Just because you can't deal with the different possible outcomes doesn't mean I can't.
No, the current C11 and C++11 Standards both declare this as undefined behavior. Your machine might provide atomic instructions that will get the right result for all possible combinations of reads and writes. But the compiler is not required to actually translate a raw XOR operator to those instructions. It requires an atomic type or an explicit lock to avoid undefined behavior.
Please.

I use NO "atomic instructions" in the XOR trick. I depend on NO specific hardware behavior. It can do in-order writes as the PC does, it can do out-of-order writes as the DEC Alpha does. NOTHING will break this unless the compiler guys suddenly decide that XOR means something besides "exclusive OR." I don't care whether you do the XOR with an XOR instruction, or you use adds and subtracts. Irrelevant. Same result. Works every time.

I am DETECTING the problem the undefined behavior MUST cause, on those rare occasions where it happens. I am not DEPENDING on any specific behavior whatsoever.

You should look at the lockless hashing paper. It explains it. No atomic locks, no tricks, works perfectly.
bob

Post by bob »

petero2 wrote:
Rein Halbersma wrote:
bob wrote:
Rein Halbersma wrote:
bob wrote:Food for thought. The English language is well-defined.
Think again: http://english.stackexchange.com/questi ... nt-regions

Human languages are anything but well-defined, they are extremely context-sensitive (time, place, person).
BTW, here is another "undefined" activity. A race condition in a parallel program. Shouldn't happen, correct? But EVERYBODY is allowing race conditions when they store in their hash table. I suppose it would be OK for Apple to just crash any program where that happens? Even though we KNOW what we are doing?

That's pretty malicious compiler behavior, IMHO. Just because you shouldn't do something when you don't know what you are doing, since it MIGHT cause a problem, does not mean you shouldn't do it when you do know what to expect.
Glad you brought that up. Yes, your classic hash XOR paper, for all its beauty, exploits undefined behavior. And no, you DON'T know what you are doing unless you write in assembly and bypass the C compiler for that piece of code. Expect a crash report any time soon.
If you program in C++11, you can use std::atomic and std::memory_order_relaxed to avoid the undefined behavior without any runtime cost, at least on x86-64 hardware.

This will of course not avoid the non-determinism caused by potential memory reordering issues, but it does avoid the undefined behavior as defined in the C++11 standard.
Does the C standard address SMP race conditions? I don't see why, as it is not a language issue. It is exactly the same for all languages, including ASM. If you use atomic locks, by the way, there is a HUGE runtime cost. Absolutely HUGE. The lockless hash approach avoids the cost entirely, at the very rare loss of a successful hash match when an entry is broken by the race.