bob wrote:
wgarvin wrote:
rbarreira wrote:
Bob, I'm starting to wonder if your account has been taken over by someone. Now you're displaying ignorance about basic C and integer numbers in one single post.
I know. It's like talking to a wall. He just thinks he knows better than everyone else, even when he's repeating the same arguments that have been refuted multiple times already in the same thread.
Sorry you feel that way. I simply have my opinion, developed over several years of compiler development. Answer me this:
If you run an ASM program on an original Intel Pentium, and then you run the same code on the Pentium Pro, which supported out-of-order execution (OOE), and you get DIFFERENT answers, would you consider that acceptable? Why or why not?
I'd be disappointed, because one of the selling points Intel used to get people to buy Pentiums was backward compatibility with 8086, 80286, 80386 and 80486 code. Nonetheless, there was the FDIV bug, leading to a limited recall of the affected parts. And there are minor differences and errata between processor generations, though Intel and AMD have mostly done a great job keeping them backward-compatible.
bob wrote:
If you compile the same source with -O0 and then with -O3 and you get different answers, would you consider that acceptable? Why or why not?
Well, if my source were a correct, standard-conforming program, then it might indicate a compiler bug. But 99 times out of 100 when this happens, it's because my code has a bug in it, and in that situation it's completely acceptable. If I'm accessing out of bounds in an array, that's undefined behavior. If I've overflowed some pointer arithmetic, that's undefined behavior. If I've shifted an N-bit int by N or more bits, that's undefined behavior. I don't expect my compiler to magically translate my broken source code into working programs in such cases, so I'm not surprised when it fails to do so.
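To make those three cases concrete, here is a deliberately broken little sketch of my own (not from anyone's real code), assuming a 32-bit int; the compiler is entitled to do anything at all with the marked lines:

    #include <stdio.h>

    int main(void)
    {
        int a[4] = {0, 1, 2, 3};

        int oob   = a[4];       /* out-of-bounds array access: undefined behavior */
        int *past = a + 100;    /* pointer arithmetic far beyond one-past-the-end: undefined behavior */
        int shl   = oob << 32;  /* shifting a 32-bit int by 32 or more bits: undefined behavior */

        printf("%d %p %d\n", oob, (void *)past, shl);
        return 0;
    }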
bob wrote:
Finally, do your two answers match? If not, why? If both are yes, we probably can't communicate at all. Semantics are semantics. I don't want the computer hardware changing them on the fly just so it can execute things faster. If both are "no" then we seem to be in perfect agreement.
So I answered No to the first question and Yes to the second, as would any programmer who actually knows both x86 and C.
bob wrote:
Because today, OOE is guaranteed to produce the exact same results as if the program were executed in order. But compilers provably do produce different results with -O0 and -O3, as already discussed. I consider that ridiculous. Even more, I consider it poor compiler development.
If your program gives different results when compiled at -O0 and -O3, then either it is not a correct program (which is extremely common) or it is depending on extensions to the language: extra semantics the compiler vendor has provided, or an extra spec/set of library semantics such as pthreads (this is also extremely common). In any case, as I have said several times already, only the programmer has the power to produce correct programs. If you write an incorrect program, there is no way your compiler can magically fix it for you. If you think you "know" the language better than the compiler does, and refuse to believe your program is broken, then neither I nor your compiler can help you.
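Here's a hedged sketch of my own showing the kind of "incorrect program" I mean, assuming a typical gcc or clang: the programmer intends an overflow check, but signed overflow is undefined behavior, so the optimizer may assume it never happens and fold the test away.

    #include <stdio.h>
    #include <limits.h>

    /* Intended as "did x + 1 wrap around?" -- but signed overflow is
       undefined behavior, so the compiler may assume it cannot happen. */
    int no_wrap(int x)
    {
        return x + 1 > x;   /* UB when x == INT_MAX */
    }

    int main(void)
    {
        /* Typically prints 0 at -O0 (the hardware wraps) and 1 at -O3
           (the comparison is folded to "true"); neither result is guaranteed. */
        printf("%d\n", no_wrap(INT_MAX));
        return 0;
    }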
bob wrote:
He gives examples from x86 code of "how compilers ought to do it" that haven't been current for almost 20 years, like his ridiculous imul/idiv example. Come on, bob! There's more than one form of the imul instruction in x86 assembly, and there has been for a long time.
RTFM. Why do you think they give us the one-operand imul instruction? Because of the possibility of overflow: it gives us a way to prevent it completely across a successive multiply/divide pair. No, you don't have to use 'em. And if you don't, you will get the expected 32-bit wrap whether you do imul or mul.
Right. And your C compiler doesn't have to use 'em, because of the way the * operator is specified in C. And when it evaluates "X * 2 / 2" and that "expected 32 bit wrap" occurs, the result is not the original value of X. But because of the way the language spec is written, compilers are allowed to optimize it back down to "X", and they do.
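To spell that out with a minimal sketch of my own (the function name is made up): because a signed overflow in x * 2 would be undefined behavior, the compiler may assume it never happens and simplify the whole expression.

    int halve_double(int x)
    {
        /* If x * 2 overflows, behavior is undefined, so the optimizer may
           assume it doesn't and reduce the expression to plain x.
           gcc and clang at -O2 typically emit just "mov eax, edi; ret". */
        return x * 2 / 2;
    }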
And I recall either you or hgm objected that nobody would write "X * 2 / 2" on purpose anyway... but that point was also addressed by Chris Lattner in those blog posts I have quoted and linked here multiple times. Stupid expressions like that often result from language abstractions (macro expansion, inlining and, with C++, templates). Also, give some thought to loop induction variables. They might not overflow at the same points as the original variables would, so the expression might have different values in those cases. But since any overflow of the original variables is undefined behavior, the compiler can safely substitute induction variables anyway (see the sketch below).
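A hedged sketch of that induction-variable point (my own example, with a made-up function name): the scaled index i * 4 could in principle wrap at a different iteration than a 64-bit pointer would, but since signed overflow of i * 4 is undefined, the compiler may strength-reduce it to a pointer anyway.

    long sum_every_4th(const int *a, int n)
    {
        long s = 0;
        for (int i = 0; i < n; i++)
            s += a[i * 4];   /* i * 4 overflowing would be UB, so the compiler
                                may replace the scaled index with a pointer
                                advanced by 16 bytes per iteration */
        return s;
    }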
I don't believe the compiler writers would bother with this stuff if it didn't actually speed up real programs out there. Their job is to generate the best possible code while complying with the spec. Our job is to write programs, but if we want those programs to work reliably, we also have to comply with the spec.
bob wrote:
You can use the one you quoted, but it has severe restrictions on which registers it can use. The one compilers usually use is the 32 * 32 -> 32-bit form, because it can be used with any of the general-purpose registers, and also because they know they only need 32 bits of result, since that's how the C language has been specified, since forever. Compilers also sometimes implement the multiply using shifts and adds, and they definitely implement division by a constant using other instructions besides idiv, because idiv can be slow. All of this has been true for decades.
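To make the divide-by-a-constant point concrete, here is a small sketch of my own: compilers replace the division with a multiply by a precomputed reciprocal plus shifts, because idiv has far higher latency than imul. The exact instruction sequence varies by compiler and target.

    unsigned div10(unsigned x)
    {
        /* gcc and clang typically compile this to a multiply-high by the
           "magic" constant 0xCCCCCCCD and a shift, with no div instruction. */
        return x / 10;
    }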
You make my point. They don't do it the "right" way according to the hardware; they do it the way that is compatible with hardware that doesn't have that option, even though most existing hardware today is capable of it. On most non-x86 boxes that have real registers, if you multiply x by an even-numbered register, you get an even+odd register pair for the product. If you multiply x by an odd-numbered register, you get a single register value that will overflow much sooner than the pair of registers.
But ignoring that, let's take a simple example:
x = 0x40000000 on a 32-bit box, but it is passed into a function so that the compiler can't see the value.
x * 2 / 2 produces 0x40000000 as expected.
But if you use 0x40000000 * 2 / 2, broken constant folding turns that into 0xc0000000, which is not quite the same. Why does it do them differently? Shouldn't it at LEAST be consistent in mathematical operations? Does "*" imply some different operation when applied to constants vs. variables? If so, why?
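Here is a hedged sketch of that inconsistency, assuming 32-bit int and a typical gcc/clang (the function name is made up): both forms overflow in exactly the same way, yet the variable form is usually simplified to x while the constant form is folded with wraparound at compile time.

    #include <stdio.h>

    int via_variable(int x)
    {
        return x * 2 / 2;   /* usually optimized to just x at -O2 */
    }

    int main(void)
    {
        printf("%x\n", (unsigned)via_variable(0x40000000));  /* usually prints 40000000 at -O2 */
        printf("%x\n", (unsigned)(0x40000000 * 2 / 2));      /* constant-folded with wraparound:
                                                                usually prints c0000000 */
        return 0;
    }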
Bob's stubbornness in a debate is pretty amazing. I can only imagine the cognitive dissonance it must cause to argue, and believe, such a mix of contradictory and nonsensical positions.
Feel free to cite my "nonsensical position". If you think consistency is nonsensical, I can see why we don't agree. But that doesn't mean my belief that consistency is reasonable to expect is a flawed concept.
I admit I was tempted to go through the whole thread and make a list of all the things I considered nonsensical, but it would be a long and tedious task, and of little value in the end.
I agree that consistency is a nice property where it's possible to have it, and that compiler writers should maybe do more than they have been to protect us from the dark corners of undefined behavior in C and C++. But you also have to acknowledge that locking down some of those undefined behaviors has costs: performance will suffer a bit. I think there is growing recognition lately that other things besides performance are pretty important too (safety, reliability, the ability to reason about a program's behavior by reading its source code, etc.), so hopefully over the next few years the overall situation will get better.