A note for C programmers

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

Rebel wrote:Glad to be an ASM programmer :lol:
There is still undefined behavior. :)

bsf eax, ebx

If ebx != 0, eax = index of the first (lowest) set bit.

If ebx == 0, eax is undefined, ZF set.

For the record, in ALL versions of Intel/AMD CPUs released to date, undefined == unchanged.

which made this old hack work:

mov eax,64
bsf eax,ebx

if no bits are set, the result is 64...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

rbarreira wrote:
wgarvin wrote:Why do you say that one compiler "fails" and another one "works". The program invokes undefined behavior. The language standard defines no semantics for it at all. All of those compilers are working as intended.
It would be nice to see Bob acknowledging this. I certainly hope he's not telling his students that there's no such thing as undefined behavior.

I myself (in very rare cases) do rely on undefined behavior in C when it comes to implementing lockless algorithms, but I'm fully aware of the risk I'm taking and I wouldn't blame the compiler if things went south at some point. I also wouldn't use that in software that needs to be reliable and portable...
Note I am NOT "blaming the compiler". I am most definitely "blaming the library maintainer". Exactly as Torvalds with the memcpy() debacle.

Changing something to fix a bug is good. Changing something to make it faster is good. Changing something just to break code is NOT good, and this is exactly what Apple did (and only Apple to date).

I'm not a proponent of undefined behavior, but I do consider myself quite capable of using it when it is safe, as in lockless hashing.

Nothing the compiler can do to break that.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

rbarreira wrote:
I clearly DO understand what "undefined behavior" means.
I, unlike yourself, am perfectly capable of avoiding that particular pitfall, which lets me use strcpy() in a way that will absolutely NOT fail.
It's sad that Bob does not see the contradiction in these two statements. If I were interviewing him for a programming job I would definitely not hire him after seeing his stubbornness and lack of understanding regarding undefined behavior.

Bob, it does not matter how YOU think an API should work internally. Maybe you think it should copy left to right, but that does not entitle you to make assumptions.
Still missing the point. It has copied left to right since Ritchie wrote the first version of C. It STILL copies left to right today. But Apple inserted a check for overlap and aborts if one is detected. I don't see the point of THAT, which broke a TON of existing software packages...

Strings are a left-to-right entity since they are precisely defined as a series of bytes terminated by a NUL ('\0') byte at the right-most end.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

syzygy wrote:
bob wrote:
syzygy wrote:So what is it now. Are uint64 writes still guaranteed to be atomic by the C standard?
Do I care? No. My code works whether it is bit-wise or qword-wise, exactly as I have repeatedly stated. But please cite ONE reasonable explanation for why a compiler would take a uint64_t variable and do anything other than a single move instruction to write to it.
You keep confusing things. What the standard guarantees is one thing. What one would naively expect knowing the architecture of a particular machine is another thing. What a particular compiler actually will do is yet something else.

On the x86 architecture, unlike what you seem to think, the 64-bit write will certainly not be atomic. On the x86-64 architecture a compiler will probably produce an atomic write, but it does not need to. Certainly the standard does not require it; how could it, when the standard does not even require a 64-bit architecture?

Please cite ONE reasonable explanation for why a compiler would implement a strcpy() from right to left. But we know it happens...
And again, there is nothing in the C99 standard that prescribes that different threads see the same global variables. This is simple enough: the C99 standard is not even aware of threads.
Are you kidding now? Memory addresses ARE addressed in the C standard. Better check up on structures. I CAN define what appears where, and all threads are guaranteed to see exactly the same data since there is only one address space managed by the operating system, NOT the compiler.
That it works is thanks to a cooperation between the OS, the linker, the C library and the specific implementation of the C compiler. You are relying very heavily on things outside the C99 standard. Not a big deal, C99 just didn't address this area and there are other standards such as pthreads.

A C99 compiler is allowed to store global variables in scratchpad memory local to the processor. Even if you can take the address of these variables, that does not mean that other threads can access those locations. Not even if the OS has full support for threads. The compiler needs to cooperate.
You need to pull out an Intel manual. If you write 8 bytes to a memory address that is a multiple of 8, it is GUARANTEED to be atomic by the architecture. Otherwise we can't even depend on exchange (xchg) to work correctly.

If a compiler does what you suggest, I/O is impossible. How can I write to a buffer, and signal another thread or process to write that out to disk, so that only one process does I/O? Yet the operating system, the libraries, and everyone else depends on that happening. There is a BIG difference between writing to a local variable, and writing to a global (shared) variable. The compiler is NOT free to optimize away stores/writes to global data.
simon
Posts: 50
Joined: Sun Mar 20, 2011 6:42 pm

Re: A note for C programmers

Post by simon »

bob wrote:
rbarreira wrote:
wgarvin wrote:Why do you say that one compiler "fails" and another one "works". The program invokes undefined behavior. The language standard defines no semantics for it at all. All of those compilers are working as intended.
It would be nice to see Bob acknowledging this. I certainly hope he's not telling his students that there's no such thing as undefined behavior.

I myself (in very rare cases) do rely on undefined behavior in C when it comes to implementing lockless algorithms, but I'm fully aware of the risk I'm taking and I wouldn't blame the compiler if things went south at some point. I also wouldn't use that in software that needs to be reliable and portable...
Note I am NOT "blaming the compiler". I am most definitely "blaming the library maintainer". Exactly as Torvalds with the memcpy() debacle.

Changing something to fix a bug is good. Changing something to make it faster is good. Changing something just to break code is NOT good, and this is exactly what Apple did (and only Apple to date).

I'm not a proponent of undefined behavior, but I do consider myself quite capable of using it when it is safe, as in lockless hashing.

Nothing the compiler can do to break that.
Wylie Garvin's comment which you quote is from a discussion of integer overflow, not strcpy.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

wgarvin wrote:
bob wrote:
syzygy wrote:
bob wrote:
Rein Halbersma wrote:
bob wrote:

Code: Select all

#include <stdio.h> 

int main()
{
  int i, k = 0;

  for (i = 1; i > 0; i += i)
    k++;
  printf("k = %d\n", k);
  return 0;
}
Running gcc -O0 will produce 31 on architectures where an int is 32 bits. However, with gcc -O2 and higher, the compiler will recognize that "i += i" yields signed-overflow UB. It will then eliminate the entire expression, and further optimize this code into an infinite loop. Such a scenario can also happen with your XOR trick on multicore machines. Compilers routinely optimize away UB code. You need extra compiler flags or code modifications to get the compiler to do what you intended.
Don't know what planet you compile on. Here's that run on my macbook:

scrappy% cc -O -o tst tst.c
scrappy% ./tst
k = 31
scrappy% cc -O2 -o tst tst.c
scrappy% ./tst
k = 31
scrappy% cc -O3 -o tst tst.c
scrappy% ./tst
k = 31

gcc version 4.7.3 (MacPorts gcc47 4.7.3_3)
Online example using gcc -O3 (gcc 4.6.4 on Linux)
So? 4.7.3 is newer than 4.6.4... perhaps they realized such an optimization is a mistake.
On my machine both 4.7.2 and 4.8.1 generate an infinite loop when compiling with -O3.
I gave the compile commands for 4.7.3 (macports) on my mac. And for the other versions on various departmental machines. One version of Intel C++ produced an infinite loop. Two others worked fine. So all I can confirm for failing is one specific version of intel C++. There may be several that fail. But there are some that work. I seem to be lucky enough to have more of those that work.
Why do you say that one compiler "fails" and another one "works". The program invokes undefined behavior. The language standard defines no semantics for it at all. All of those compilers are working as intended.

You seem to think you are "lucky" because you have a compiler that produces code from this snippet that matches your intuition about what the hardware would do if the compiler generated the code you imagine it would generate. But none of these compilers are obligated to generate that code (not even the ones that apparently do anyway). A future version of them might not generate that code. They might not generate that code next week, or when you compile a version of the function that has something else next to this snippet (e.g. different control flow, or whatever). In short, there's precious little you can rely on once you invoke undefined behavior, and the way these compilers treat this one code snippet doesn't really give you any guarantees about how they will treat other pieces of code. For that you have to rely on compiler-specific options, such as -fwrapv.

(And yes, I love being able to disable strict aliasing on all of the compilers I use... I hope that all major compilers continue to support that effectively forever, because there are billions of lines of code out there that don't respect the rule, and millions of programmers who don't know it well enough to follow it 100% of the time, and I am probably one of them).
Since one of the ongoing quibbles here is about the definition of "undefined behavior", here is the usual definition, one that has been used since the 1950's in fact: "behavior that cannot be predicted." This has ALWAYS been an issue of unexpected results caused NOT by the compiler intentionally doing something off-the-wall, but by the code the compiler produces, which might not always do the expected thing.

The strcpy() function is an example. Since the first version of strcpy() it has copied left to right, because strings are terminated on the right with a NUL byte. The behavior has been undefined when the source/dest overlap because it is impossible to predict exactly what will happen if this is done inappropriately.

For example, on a Cray, one can't copy a byte. Memory is 64-bit word addressable, only. If you want to implement strcpy() as a left-to-right byte copy, you can: read a word, mask out one byte, insert from source, write the word back to memory. Read the same word again, mask out the next byte, insert, rewrite. Eight writes per word. Or you can do as the Cray library did, and move 8-byte chunks left to right, carefully checking for the zero terminator each time. This behaves differently than a byte-by-byte copy, but if you overlap in the right direction, both work; if you overlap in the wrong direction, both fail. But they fail differently.

If someone wants to safely rely on that specific undefined behavior, look at the lib source. See what it does. If it fits your requirements and will produce what you expect, fine. It won't be portable. But it can be safely used.

This "new definition" is nonsense: that a compiler can do ANYTHING it wants. Try to execute it, throw it out, generate a random number, or whatever else it chooses. I do not consider that acceptable compiler behavior. Yes, it might discourage the use of undefined behavior. So will a shot to the head discourage speeding. Might be a bit draconian however.

This implies that a compiler writer might change this just for the hell of it. Ask yourself a real question, "Have I EVER changed something just to change it? Or did I change it to make it faster, more accurate, easier to read, more portable, or some other actual reason for changing the thing?" I know what my answer is.

Now let's take strcpy(). Overlapping strings in one direction is clearly not going to work; well-documented for 40+ years now. The other direction has also been well-documented to work for 40+ years. One might imagine that quite a few folks either intentionally or unintentionally used the "good direction" and got away with it. Anyone using the "bad direction" will have frequent failures and have to fix it.

Suddenly Apple decides to become the "undefined operation police", and they decide to focus specifically on strcpy(). They decide to abort the program whenever an overlap occurs, whether it is in the good direction or the bad direction.

What is the gain?

New programmers will be forced to not have overlapping strings, whether the strcpy() would actually work correctly or not. OK, not a bad result since there is inherent danger, particularly if one is not careful.

What is the loss?

Thousands of software packages suddenly fail. Software that has been purchased. Software that has been locally written. Software that has been downloaded. Tens of thousands of hours of development time are spent finding this unexpected and unexplained failure.

Which is more important? I don't think you break working code to prevent future misuse.

One could just fix strcpy() so that it works for overlapping strings. I mean, the standard does NOT dictate that the function must use an implementation that is undefined for overlapping strings. This way you get your cake AND eat it too. You've eliminated the bug in future programs. You have fixed all existing software that is recompiled (or it might just link to a shared library containing the new version depending on the system). You have broken absolutely nothing.

So how is their actual action better than just fixing the thing? Overhead, you say? Remember, there is overhead to detect the strcpy overlap in the first place since you have to find the length of the string to see if there is an overlap.

IMHO this was done poorly, when it could have been done in a way that breaks nothing and fixes any possible use, AND it would eliminate that silly "undefined behavior" crutch for that specific function. They took the worst possible action, when even leaving it alone was better and fixing it was best.

I've written my share of compilers over the years, starting with one for Basic in the 70's, and it was NEVER my intent to break something to teach those damned presumptive programmers a lesson. My intent was always to make it as reliable and correct as humanly possible, and when I could not be certain, I would at least try to do the right thing if the hardware was capable (int overflow, which all modern hardware handles just fine, but which some C compilers seem to think they know better about and decide to just throw out, breaking things instead of doing the right thing and letting the hardware deal with it, which it can with perfect consistency).
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: A note for C programmers

Post by wgarvin »

simon wrote:Wylie Garvin's comment which you quote is from a discussion of integer overflow, not strcpy.
We've bounced back and forth between a couple of different issues in this thread; what they all have in common is the "this is undefined behavior" and "compiler is allowed to make demons fly out of your nose if you do that" angle.



FWIW, I read through all of the comments on the original Bugzilla report in Fedora (638477) and the upstream glibc bug report (12518) (this was the memcpy saga). I'll try to briefly summarize it here. I'm not a Linux user and not too familiar with the bug-resolving processes of these projects, and the bugs had some political overtones, so I probably missed some nuances.

In September 2010 some Fedora 14 users started reporting that sound in Flash and at least one other mp3-playing app (gstreamer?) was garbled. Investigation showed that the unsupported 64-bit binary Flash plugin from Adobe contained memcpy calls with overlapping source and dest ranges, i.e. undefined behavior. It showed up because some Intel engineers had contributed new code to glibc to optimize memcpy and memmove to copy backwards in some cases, making them faster for "small" x86 chips (Atom, etc.). In some cases the new versions were 2x to 3x faster. Glibc 2.13 shipped with this new code, which uses some runtime info (e.g. CPU feature bits, but also other transient info such as the alignment of the source and dest addresses) to choose the fastest way of copying any particular block. However, it wasn't a major version bump, so existing binary apps linking with the glibc shared library started showing odd behaviors they hadn't shown before. The bugs are in those binary apps, not glibc (the apps invoke undefined behavior by passing overlapping buffers to memcpy). Nonetheless, Linus argued that breaking binary compatibility in the shared library was a bad thing to do.

During the end of 2010 up to perhaps February 2011, some drama played out. Irate users complained, and Fedora suggested they direct the complaints to Adobe and the other authors of the broken programs. Fedora said "this is not our problem, those programs are buggy, complain to their authors." Linus argued that the change hurt users, and made a sort-of-convincing argument that the memcpy implementation could be changed to be identical to memmove without decreasing its performance in any notable way. The Fedora maintainers argued that working around bugs in 3rd-party proprietary apps was not something they cared about. Then it was argued that a lot of other apps, even nice open-source apps that were part of the core Fedora package, might be broken in silent ways by this memcpy change. And even that there might be security vulnerabilities existing purely because of this undefined behavior case. Some counter-arguments were also made (such as this one by Siarhei Siamashka). The main Fedora response became "complain to glibc then, it was their change, whatever they decide to do upstream is fine with us". In February 2011, Linus reported it as a bug against glibc and offered them a patch to turn their memcpy into something equivalent to memmove. Adding the test-and-abort stuff was also discussed. The glibc guys then implemented a patch and resolved their bug as "fixed". I'm not totally clear what their patch did; it sounds like it arranged things so that when existing binaries linked to the shared object, their memcpy symbol was redirected to memmove. But new binaries compiled from source get memcpy when they ask for it (the new mega-featured one).

Anyway, all of this played out more than 2 years ago. It's surprising to see the same kind of thing happening again with strcpy. :lol:
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: A note for C programmers

Post by Rein Halbersma »

bob wrote: Since one of the ongoing quibbles here is about the definition of "undefined behavior", here is the usual definition, one that has been used since the 1950's in fact: "behavior that cannot be predicted." This has ALWAYS been an issue of unexpected results caused NOT by the compiler intentionally doing something off-the-wall, but by the code the compiler produces, which might not always do the expected thing.
You are the only one quibbling with the Standard's definition of UB. Everyone else is on the same page here. Don't drag whatever colloquial definition you are using into the precise and exact one used by every compiler writer.
UB is about the freedom of a compiler to translate your C code to machine instructions.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

simon wrote:
bob wrote:
rbarreira wrote:
wgarvin wrote:Why do you say that one compiler "fails" and another one "works". The program invokes undefined behavior. The language standard defines no semantics for it at all. All of those compilers are working as intended.
It would be nice to see Bob acknowledging this. I certainly hope he's not telling his students that there's no such thing as undefined behavior.

I myself (in very rare cases) do rely on undefined behavior in C when it comes to implementing lockless algorithms, but I'm fully aware of the risk I'm taking and I wouldn't blame the compiler if things went south at some point. I also wouldn't use that in software that needs to be reliable and portable...
Note I am NOT "blaming the compiler". I am most definitely "blaming the library maintainer". Exactly as Torvalds with the memcpy() debacle.

Changing something to fix a bug is good. Changing something to make it faster is good. Changing something just to break code is NOT good, and this is exactly what Apple did (and only Apple to date).

I'm not a proponent of undefined behavior, but I do consider myself quite capable of using it when it is safe, as in lockless hashing.

Nothing the compiler can do to break that.
Wylie Garvin's comment which you quote is from a discussion of integer overflow, not strcpy.
The THREAD is about strcpy(). There are multiple flavors of "undefined behavior" mentioned herein. From overlapping operands in strcpy, to integer overflow, to race conditions. My comments apply to all equally.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: A note for C programmers

Post by bob »

Rein Halbersma wrote:
bob wrote: Since one of the ongoing quibbles here is about the definition of "undefined behavior", here is the usual definition, one that has been used since the 1950's in fact: "behavior that cannot be predicted." This has ALWAYS been an issue of unexpected results caused NOT by the compiler intentionally doing something off-the-wall, but by the code the compiler produces, which might not always do the expected thing.
You are the only one quibbling with the Standard's definition of UB. Everyone else is on the same page here. Don't drag whatever colloquial definition you are using into the precise and exact one used by every compiler writer.
UB is about the freedom of a compiler to translate your C code to machine instructions.
The standard does not adequately define "undefined behavior". To wit:

3.4.3
1 undefined behavior
behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements
NOTE Possible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message), to terminating a translation or execution (with the issuance of a diagnostic message).
EXAMPLE An example of undefined behavior is the behavior on integer overflow.

So it can ignore the situation, behave in a documented way (both of which make sense), or terminate the translation or execution with a message. No mention of demons flying out one's nose or anything else.

So how about we get back to planet earth.

In reality, using strcpy() with overlapping operands is not undefined, because I can take the source and enumerate the possible outcomes for any pair of arguments. That's as it should be. Adding outcomes not possible from the actual code is senseless...

For example, in the 2005 draft of the C standard, there is no mention of a "race condition". Yet one can easily infer that the result is undefined because the "winner" of the race is not predictable.