A note for C programmers

rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: A note for C programmers

Post by rbarreira »

wgarvin wrote: Why do you say that one compiler "fails" and another one "works"? The program invokes undefined behavior. The language standard defines no semantics for it at all. All of those compilers are working as intended.
It would be nice to see Bob acknowledge this. I certainly hope he's not telling his students that there's no such thing as undefined behavior.

I myself (in very rare cases) do rely on undefined behavior in C when it comes to implementing lockless algorithms, but I'm fully aware of the risk I'm taking and I wouldn't blame the compiler if things went south at some point. I also wouldn't use that in software that needs to be reliable and portable...
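For comparison, C11 atomics allow some lockless patterns to be written without any data race. What follows is only a minimal sketch, assuming a hypothetical two-word table slot validated by XOR; the names slot_t, slot_store and slot_probe are illustrative and do not come from any post in this thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical two-word table slot. Both words are atomic objects, so
   concurrent reads and writes are not data races in the C11 sense. */
typedef struct {
    _Atomic uint64_t check; /* key XOR data */
    _Atomic uint64_t data;
} slot_t;

/* Store an entry with two relaxed atomic stores. Relaxed ordering gives
   no ordering guarantee between the two words; it only keeps each
   individual access well defined. */
static void slot_store(slot_t *s, uint64_t key, uint64_t data)
{
    assert(s != NULL);
    atomic_store_explicit(&s->check, key ^ data, memory_order_relaxed);
    atomic_store_explicit(&s->data, data, memory_order_relaxed);
}

/* Probe an entry. Returns true and fills *out only if the two words are
   consistent with the key. A mixed pair left behind by two different
   writers will normally fail the XOR check and is treated as a miss. */
static bool slot_probe(slot_t *s, uint64_t key, uint64_t *out)
{
    uint64_t data, check;
    assert(s != NULL && out != NULL);
    data = atomic_load_explicit(&s->data, memory_order_relaxed);
    check = atomic_load_explicit(&s->check, memory_order_relaxed);
    if ((check ^ data) != key)
        return false;
    *out = data;
    return true;
}
```

On targets without native 64-bit atomics the implementation may fall back to a lock internally, so "lockless" here is a property of the source code, not a guarantee about the generated instructions.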
Last edited by rbarreira on Sat Dec 07, 2013 11:37 am, edited 1 time in total.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: A note for C programmers

Post by mar »

syzygy wrote: Some race conditions can be detected trivially. First of all, we need to be talking about C11, or talking about threads does not even make sense. In C11, just let one thread create another thread and let both threads at some point access the same variable. If there are no synchronisation primitives on the paths towards these accesses, you have a data race. C11 defines the synchronisation primitives, so the compiler can recognise their absence.

Will compilers actively look for data races? Probably not. But they can assume that they won't happen and optimise your code in ways that you certainly are not expecting.
The problem is that the only way that might work for detecting races at compile time would be static analysis. But it's very expensive and unreliable.
I very much doubt any compiler will do that.
Another possibility would be to detect races at runtime and choose appropriate code paths: this would be too expensive and wouldn't work either.
What remains is hardware that would trap on races (maybe something like that already exists, I'm no hardware expert), but this has nothing to do with the compiler.
I think this undefined behavior made it into the standard for a reason other than optimizations (is there a real-world example where such optimizations work/help?).
I think it's because of a focus on future advancements in hardware and, most importantly, because of security, but I may be wrong.
To be honest, being 100% standard compliant will become a black art, and
existing very large code bases will have to be rewritten completely (which will cost time and money and introduce new bugs).
I'm not sure this does any good. We'll see.
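On the question of a real-world example: one commonly cited case is a load being hoisted out of a loop. The sketch below is only an illustration under that assumption; wait_plain and wait_atomic are hypothetical names, and whether a given compiler actually performs the transformation depends on the compiler and the optimization level.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Plain flag. If another thread writes *done while this loop runs, that
   is a data race. Because the compiler may assume no race occurs, it is
   allowed to load *done once before the loop and never reread memory. */
static unsigned long wait_plain(const int *done)
{
    unsigned long spins = 0;
    assert(done != NULL);
    while (!*done)
        spins++;
    return spins;
}

/* Atomic flag. Every iteration performs a real atomic load, and a
   concurrent atomic store from another thread is well defined. */
static unsigned long wait_atomic(atomic_int *done)
{
    unsigned long spins = 0;
    assert(done != NULL);
    while (!atomic_load_explicit(done, memory_order_acquire))
        spins++;
    return spins;
}
```

For single-threaded code the first version is exactly what one wants, since the load can stay in a register. The same assumption is what breaks it when the flag is shared without synchronisation.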
syzygy
Posts: 5555
Joined: Tue Feb 28, 2012 11:56 pm

Re: A note for C programmers

Post by syzygy »

bob wrote:
syzygy wrote:So what is it now. Are uint64 writes still guaranteed to be atomic by the C standard?
Do I care? No. My code works whether it is bit-wise or qword-wise. Exactly as I have repeatedly stated. But, please cite ONE reasonable explanation for why a compiler would take a uint64_t variable and do anything other than a single move instruction to write to it.
You keep confusing things. What the standard guarantees is one thing. What one would naively expect knowing the architecture of a particular machine is another thing. What a particular compiler actually will do is yet something else.

On the 32-bit x86 architecture, unlike what you seem to think, the 64-bit write will certainly not be atomic. On the x86-64 architecture a compiler will probably produce an atomic write, but it does not need to. Certainly the standard does not require it. How could it, if the standard does not require a 64-bit architecture?
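If an atomic 64-bit write is what the program needs, C11 lets it say so explicitly instead of relying on what a particular compiler happens to emit. A minimal sketch, with publish, snapshot and word_is_lock_free as hypothetical helper names:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Write a 64-bit value so that no reader can observe a half-written word. */
static void publish(_Atomic uint64_t *w, uint64_t v)
{
    assert(w != NULL);
    atomic_store_explicit(w, v, memory_order_relaxed);
}

/* Read the whole 64-bit value in one indivisible access. */
static uint64_t snapshot(_Atomic uint64_t *w)
{
    assert(w != NULL);
    return atomic_load_explicit(w, memory_order_relaxed);
}

/* Ask the implementation whether it can do this without a hidden lock.
   On x86-64 this is normally true; on a 32-bit target it may not be. */
static bool word_is_lock_free(_Atomic uint64_t *w)
{
    assert(w != NULL);
    return atomic_is_lock_free(w);
}
```

The standard guarantees atomicity of these operations on every conforming implementation; it only leaves open whether they are lock-free, which is what word_is_lock_free reports.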

Please cite ONE reasonable explanation for why a compiler would implement a strcpy() from right to left. But we know it happens...
And again, there is nothing in the C99 standard that prescribes that different threads see the same global variables. This is simple enough: the C99 standard is not even aware of threads.
Are you kidding now? Memory addresses ARE addressed in the C standard. Better check up on structures. I CAN define what appears where, and all threads are guaranteed to see exactly the same data since there is only one address space managed by the operating system, NOT the compiler.
That it works is thanks to cooperation between the OS, the linker, the C library and the specific implementation of the C compiler. You are relying very heavily on things outside the C99 standard. Not a big deal, C99 just didn't address this area and there are other standards such as pthreads.

A C99 compiler is allowed to store global variables in scratchpad memory local to the processor. Even if you can take the address of these variables, that does not mean that other threads can access those locations. Not even if the OS has full support for threads. The compiler needs to cooperate.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: A note for C programmers

Post by Rein Halbersma »

mar wrote: The problem is that the only way that might work for detecting races at compile time would be static analysis. But it's very expensive and unreliable.
I very much doubt any compiler will do that.
Such technology is becoming mainstream
http://clang.llvm.org/docs/ThreadSanitizer.html
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: A note for C programmers

Post by mar »

Rein Halbersma wrote:Such technology is becoming mainstream
http://clang.llvm.org/docs/ThreadSanitizer.html
While this is an interesting tool, I was talking about compile-time detection.
It's nice to have such a tool.
It would also be nice to be able to enable notifications about optimizations due to undefined behavior (unless clang can already do that).
EDIT: it seems clang has -fcatch-undefined-behavior, I will try it to see if it works
Last edited by mar on Sat Dec 07, 2013 12:18 pm, edited 1 time in total.
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: A note for C programmers

Post by Rein Halbersma »

mar wrote: To be honest, being 100% standard compliant will become a black art,
existing very large code bases will have to be rewritten completely (which will cost time and money and introduce new bugs).
I'm not sure if this is of any good. We'll see.
I don't buy that. If you stay away from the dusty corners of the language (both in C and C++) you can and usually do write compliant code. That means don't get involved in tricky bit-twiddling (shifting through sign-bits, signed overflow), manual pointer stuff (null dereference, bounds violations) or cute expressions (incrementing and assigning in one go to the same variable like i = i++;). Could you show me an example of UB that should not already have been rewritten on grounds of readability alone?
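To make two of those dusty corners concrete, here is a rough sketch of how signed overflow and shifting into the sign bit can be avoided. The helper names add_checked and shl32 are made up for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Signed addition that reports overflow instead of invoking it. The
   range test runs before the addition, so a + b is only evaluated
   when its result is representable. */
static bool add_checked(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* Left shift on an unsigned 32-bit value, where shifting a bit into the
   top position is well defined. The count must still be below 32. */
static uint32_t shl32(uint32_t x, unsigned n)
{
    assert(n < 32);
    return x << n;
}
```

The third case needs no helper at all: writing i = i + 1; or just i++; says the same thing as i = i++; was presumably meant to say, without the undefined behavior.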
syzygy
Posts: 5555
Joined: Tue Feb 28, 2012 11:56 pm

Re: A note for C programmers

Post by syzygy »

Only just now I became aware of this parallel thread, containing passages such as:
I clearly DO understand what "undefined behavior" means. I, unlike yourself, apparently, ALSO understand WHY the warning about overlapping source/destination is discouraged. I, unlike yourself, am perfectly capable of avoiding that particular pitfall, which lets me use strcpy() in a way that will absolutely NOT fail.
Ouch.
mar
Posts: 2554
Joined: Fri Nov 26, 2010 2:00 pm
Location: Czech Republic
Full name: Martin Sedlak

Re: A note for C programmers

Post by mar »

Rein Halbersma wrote: I don't buy that. If you stay away from the dusty corners of the language (both in C and C++) you can and usually do write compliant code. That means don't get involved in tricky bit-twiddling (shifting through sign-bits, signed overflow), manual pointer stuff (null dereference, bounds violations) or cute expressions (incrementing and assigning in one go to the same variable like i = i++;). Could you show me an example of UB that should not already have been rewritten on grounds of readability alone?
Of course, what about data races? pthreads vs C11.
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: A note for C programmers

Post by rbarreira »

I clearly DO understand what "undefined behavior" means.
I, unlike yourself, am perfectly capable of avoiding that particular pitfall, which lets me use strcpy() in a way that will absolutely NOT fail.
It's sad that Bob does not see the contradiction in these two statements. If I were interviewing him for a programming job I would definitely not hire him after seeing his stubbornness and lack of understanding regarding undefined behavior.

Bob, it does not matter how YOU think an API should work internally. Maybe you think it should copy left to right, but that does not entitle you to make assumptions.
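For the overlapping-copy case, the standard library already provides a function whose behavior is defined: memmove. A minimal sketch, where drop_prefix is a hypothetical helper and not code from anyone's engine:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Remove the first n characters of s in place. Source and destination
   overlap, so strcpy(s, s + n) would be undefined behavior; memmove is
   specified to handle overlapping regions. */
static void drop_prefix(char *s, size_t n)
{
    size_t len;
    assert(s != NULL);
    len = strlen(s);
    if (n > len)
        n = len; /* dropping more than the whole string leaves "" */
    memmove(s, s + n, len - n + 1); /* +1 copies the terminating '\0' */
}
```

This works regardless of the direction in which the library copies, which is the whole point: the caller no longer depends on an implementation detail.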
Rein Halbersma
Posts: 741
Joined: Tue May 22, 2007 11:13 am

Re: A note for C programmers

Post by Rein Halbersma »

mar wrote:
Rein Halbersma wrote: I don't buy that. If you stay away from the dusty corners of the language (both in C and C++) you can and usually do write compliant code. That means don't get involved in tricky bit-twiddling (shifting through sign-bits, signed overflow), manual pointer stuff (null dereference, bounds violations) or cute expressions (incrementing and assigning in one go to the same variable like i = i++;). Could you show me an example of UB that should not already have been rewritten on grounds of readability alone?
Of course, what about data races? pthreads vs C11.
I don't know about C11, but in C++11 a shared hash table can be protected with a reader-writer lock such as boost::shared_mutex (a plain class, not a template; the C++11 standard library itself does not provide one). This allows multiple readers and a single writer to safely (i.e. without UB) access a global hash table from multiple threads. In this C++ book on concurrency, such an example is worked out in detail.