How could a compiler break the lockless hashing method?

Discussion of chess software programming and technical issues.

Moderator: Ras

syzygy
Posts: 5743
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

bob wrote:It is only fixed because of the way I wrote my code.
Yes and I happen to be able to read and to think, so I knew exactly what I had to change.
Just replace all occurrences of word1 with htable->word1 and you have not fixed anything at all.
Sure, or just insert an exit(0) and it will stop working!!
syzygy
Posts: 5743
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

bob wrote:The problem I keep coming back to is that they optimize as if no UB happens, but they do not know.
Exactly, they optimise as if no UB happens.
There is no need to detect UB in order to optimise as if no UB happens.
So what does it matter whether they know there is UB or not?

It matters in the sense that they can't warn you at compile-time that your buggy code might break. But you have been warned by us.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

rbarreira wrote:
bob wrote:
hgm wrote:
wgarvin wrote:They don't really want to break your code just to be malicious.
Then they must be incredibly incompetent. Because exiting with an error message when they detect a strcpy that would otherwise perfectly work, and for which they could see that it would perfectly work, does actually count as malicious sabotage. Like making the result of signed integer additions that can be seen at compile time to overflow anything else than the what the adder hardware does counts as malicious sabotage. Nothing in the standard forces them to do that; undefined behavior could mean anything. Also the obviously intended thing.

That they pester you with compile-time warnings when they detect such things, OK, I can live with that. But when I want to use it as a programmer, that should be my decision.
And the obvious follow-up. Once they have recognized overlapping buffers, why not fix it by directly calling memmove() rather than just aborting? Now everyone is happy, the UB is gone, yet nothing was broken or harmed.
On the other hand, the compiler made you fix your code, so you did get some benefit out of it. Sure, it could have done better by printing a better error message.
If the compiler had produced a better diagnostic, on the right output stream (stderr) the original bug report would have been very specific, and it would have taken no more than a couple of minutes to fix it. As it was, it took over a week, and a lot of testing before I even realized it was a Mavericks-only issue. That was my complaint. They could have actually fixed the problem themselves, within libc, by simply calling memmove() as opposed to abort(). I suppose that would be too easy and logical for some of these compiler/library folks to give it any thought, so "let's just break it and make 'em fix it, once they figure out what is broken."
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

wgarvin wrote:
hgm wrote:
wgarvin wrote:They don't really want to break your code just to be malicious.
Then they must be incredibly incompetent. Because exiting with an error message when they detect a strcpy that would otherwise perfectly work, and for which they could see that it would perfectly work, does actually count as malicious sabotage. Like making the result of signed integer additions that can be seen at compile time to overflow anything else than the what the adder hardware does counts as malicious sabotage. Nothing in the standard forces them to do that; undefined behavior could mean anything. Also the obviously intended thing.

That they pester you with compile-time warnings when they detect such things, OK, I can live with that. But when I want to use it as a programmer, that should be my decision.
I think you guys persistently assign the blame to the wrong source. You don't blame the programmers who wrote code that had undefined behavior and to which the language specs assign no semantics whatsoever. Instead, you blame the compilers? Just because something happens to compile and run on your current compiler does not make it a "working" program. Any program affected by the strcpy() change was by definition not a legal C or C++ program, at most it was a program in some similar-but-not-standardized language. You can say its a lousy state of affairs and I'll agree with you, but the fact of the matter is that all of those broken programs were already broken except for the complete fluke that the implementations they were previously compiled against happened not to trash their heap or overwrite their call stack or the like. Apple did those programmers a favor by pointing out their bug, and the change that made strcpy abort is just the last event in a series of events that made the program fail. Apple may have thrown the last pitch but the program was set up to fail by its lazy-or-harried-or-incompetent programmer. They didn't respect the API requirements of one of the most widely-known library functions on earth. Requirements that have existed in the exact same form for decades.

Those failed programs are the direct result of programmers not knowing or caring about undefined behavior. The glibc strcpy (like the memcpy before it) completely meets its obligations under the spec, and any programs aborting because of that change are doing that only because of their short-sighted and lazy programmers not knowing and following the API. If the users don't like it when their programs stop working, maybe they should give their custom to more reliable programmers in the future! :lol:

I mean, Bob can claim until he's blue in the face that the compiler and library vendors should protect him from the consequences of UB. But they're not going to do it, partly because its impossible in general and partly because its not actually their responsibility according to the spec. Its our job as programmers to write a safe and correct program--one that follows the rules of the language. Not some "common sense" version of the rules that a programmer has rattling around in their head, but the actual rules in the actual spec.
Problem is, they added a NEW manifestation of undefined behavior, that of simply aborting with no hint as to why. They slowed the code down so that they could check for overlap and abort, which costs performance. Why wouldn't they just fix the UB by ELIMINATING it... just call memmove() rather than abort() and that would be considered as a good fix, breaks nothing, fixes everything, who could complain? As it is, it is slower, and breaks on overlap with little warning.
syzygy
Posts: 5743
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

hgm wrote:
wgarvin wrote:They don't really want to break your code just to be malicious.
Then they must be incredibly incompetent. Because exiting with an error message when they detect a strcpy that would otherwise perfectly work, and for which they could see that it would perfectly work, does actually count as malicious sabotage. Like making the result of signed integer additions that can be seen at compile time to overflow anything else than the what the adder hardware does counts as malicious sabotage. Nothing in the standard forces them to do that; undefined behavior could mean anything. Also the obviously intended thing.

That they pester you with compile-time warnings when they detect such things, OK, I can live with that. But when I want to use it as a programmer, that should be my decision.
You want to use undefined behavior? You get undefined behavior...

Apple most likely did what it did to force people to fix bugs that might break programs later in a much more serious way, including security leaks.

Btw, your compiler does similar things:
http://talkchess.com/forum/viewtopic.php?t=47897
I think my conclusion in that thread was that you did not invoke UB, but I'm not really sure whether that is correct.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

wgarvin wrote:
hgm wrote:.. now some standards seem to be emerging that think they know better, basically renders what used to be a useful language into something completely useless. Basically these people are hijacking the language, to destroy it. They are nothing but terrorists!
....
More accurate would be to say that it was redefined to be no longer legal pseudo-C.
Its not like this is entirely a recent phenomenon.. I've pointed out repeatedly in these threads that the same exact kind of discussions were happening in comp.std.c back in 1992 when that "demons may fly out of your nose" phrase first caught on. The only significant change in the last 10 years is the compilers are a bit smarter now. They're still hopelessly dumb in many situations of course, and most undefined behavior will sail right by because they don't know how to recognize it (in some cases its very-difficult-to-impossible). But once in a while they might. (Nasal demons!) :lol:

Maybe its easier to not worry about it. If your faithful compiler (gcc or whichever) is doing OK by you, then keep using it. Just keep upgrading it until one day it does something completely reprehensible, then downgrade and stick with that one. That strategy might work well enough for years, and by then perhaps the situation will have improved and some new "safe dialect" stuff might be available: new compiler options to disable funky/dangerous stuff, better-specified refinements of the standards, etc.

I mean, a lot of programmers use all of these optimizing compilers daily and many correct or nearly-correct programs are written and compiled with them, and mostly we get by, right? You might end up with latent UB bugs but you could just as easily have other kinds of latent bugs too. Debugging tools exist (such as the sanitizer stuff clang and now gcc have been adding) that can help find some of the UB bugs, just like valgrind can help find memory access bugs. UB can occasionally have nasty results, but if it happens infrequently enough then maybe a rational strategy is to just carry on like always until it bites you. If it compiles and it works, the risk of current problems is minimal and its probably only future problems you might need to worry about.

The next time you catch a compiler generating something totally weird from your code, asking yourself "am I relying on undefined behavior in this code?" early in the debugging process might save a bit of wasted time.
That question is pointless. I had no idea I had any overlapping strcpy() calls in Crafty. That dates back so far I can't even remember writing the code. If you look at it, apparently I (or someone) fixed a couple of places where it was obvious in ReadPGN() but not the final one. Ditto for overflow. Who can overflow a 32 bit int at 3-4K nodes per second? Who would predict 10,000x speed improvement in 10 years?

The problem is not that it was done intentionally. It was done unintentionally in both places, and had I thought "do I have any overlapping strings in strcpy() calls I would have most likely said "no" and been willing to swear to that based on my memory.

I've been programming since 1968. I've used integer overflow so many times (ever seen a linear congruential PRNG as just one example) and they have worked flawlessly. Now some compiler guys have decided that since it is undefined, even though it is an established and working programming paradigm, they can break it in any way they choose; throw the code out, make it abort, write an error warning out, or make demons fly out of your nose. All with something that has been used for so long it is not funny, and something that would work PERFECTLY still if the compiler guys had just left it alone.

This sounds like a doctor that says, "hmm. splinter in finger, that's undefined in my medical books, so I'm removing your arm. or worse." When a band-aid and a little topical antibiotic would do just fine. Now I'm walking down the street thinking "holy crap, where did my arm go? Last time I looked I just had a splinter in one finger."

I STILL call that malicious compiler/library development.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

syzygy wrote:
bob wrote:It is only fixed because of the way I wrote my code.
Yes and I happen to be able to read and to think, so I knew exactly what I had to change.
Just replace all occurrences of word1 with htable->word1 and you have not fixed anything at all.
Sure, or just insert an exit(0) and it will stop working!!
That's a real intelligent form of discussion. Which simply shows it is pointless to try. Carry on...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

syzygy wrote:
bob wrote:The problem I keep coming back to is that they optimize as if no UB happens, but they do not know.
Exactly, they optimise as if no UB happens.
There is no need to detect UB in order to optimise as if no UB happens.
So what does it matter whether they know there is UB or not?

It matters in the sense that they can't warn you at compile-time that your buggy code might break. But you have been warned by us.
If they optimize as if no UB happens, they might well just delete that block of code. That breaks things unnecessarily. One CAN optimize without making lots of invalid assumptions. IE I can get you to the moon in 10 seconds. You might not survive the initial acceleration, but I assume you MIGHT. You only need to average about 24,000 miles per second. just under 1/7th the speed of light. To get to 60mph in 1 sec, you need a little over 2G's. Good luck surviving that trip...
syzygy
Posts: 5743
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

hgm wrote:Note that is is debatable what "the language" actually is. There used to be a time where the operator + working on ints did actually mean "perform an addition".
What kind of addition gives 20000 + 20000 = -25536 ? (Obviously a 16-bit example.)
That now some standards seem to be emerging that think they know better, basically renders what used to be a useful language into something completely useless. Basically these people are hijacking the language, to destroy it. They are nothing but terrorists!
I'm guessing that overflowing signed addition was already UB in K&R, but I don't have a copy to check. In any event, it has been UB since at least 25 years or so.
In the case of a compiler, I can use one that would allow me do non-atomic stores without using locks when I don't care about the undefinedness the hardware would produce from this, and add integers that I want to overflow.
You can do all of that with gcc, but you have to know what you are doing. You have to set a few options and you'll get what you want. You can implement lockfree synchronisation in gcc, but you have to be aware of the pitfalls and of some compiler-specific tricks to tell the optimiser not to be too smart where it matters. The resulting code will not be standard C though, so won't be very portable. It might also completely break with gcc on other architectures, just because other architectures have different memory models that you may not have thought of.
syzygy
Posts: 5743
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

bob wrote:
syzygy wrote:
bob wrote:The problem I keep coming back to is that they optimize as if no UB happens, but they do not know.
Exactly, they optimise as if no UB happens.
There is no need to detect UB in order to optimise as if no UB happens.
So what does it matter whether they know there is UB or not?

It matters in the sense that they can't warn you at compile-time that your buggy code might break. But you have been warned by us.
If they optimize as if no UB happens, they might well just delete that block of code. That breaks things unnecessarily. One CAN optimize without making lots of invalid assumptions. IE I can get you to the moon in 10 seconds. You might not survive the initial acceleration, but I assume you MIGHT. You only need to average about 24,000 miles per second. just under 1/7th the speed of light. To get to 60mph in 1 sec, you need a little over 2G's. Good luck surviving that trip...
Ok you're turning and twisting again, so I guess you have now gotten the point that they don't need to know that UB happens in order to optimise as if no UB happens? You deserve a round of applause.

I hope you now stop "keep coming back" to this same problem (but of course you will do just that or your name wouldn't be Bob).