How could a compiler break the lockless hashing method?

Discussion of chess software programming and technical issues.

Moderator: Ras

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

Rein Halbersma wrote:
bob wrote:The optimization about a + 1 > a being optimized into "1" is an unsafe one.
No. The use of a + 1 > a for signed int a is unsafe. The optimization for sure will not help this single line very much, but the optimizer rules in a compiler are written for an entire universe of correct code, not for the hopefully soon extinct subset of non-conforming and unmaintained code.

The whole "the compiler is malicious" line of reasoning is based on a fundamental misunderstanding. It seems that you and HG regard C as a mapping of the hardware to C programming constructs, instead of the other way around. The reason that some program constructs in C lead to undefined behavior is that the mapping of C to hardware is not one-to-one. Some things cannot be defined in a portable way because of the enormous diversity in hardware (memory models, bus-width, cache coherence etc.).
Let me cross-examine the witness. And I am going to "lead" the witness just a bit by asking very specific questions.

"are we compiling for a computer, yes or no?"

<I assume you answer yes>

"does the computer have a finite word length, which means that at some point, when you add 1 to a value, the value wraps around to a negative value for signed ints, just like it wraps around to zero for unsigned?"

<I assume you answer yes>

"why do we treat signed and unsigned overflow differently, since the hardware of today, any platform, treats them exactly the same?"

<you do get to answer this one>

"What is the difference between x++ producing a zero on unsigned ints, and x++ producing a large - number on signed ints?"

<your turn again>

"can you justify treating them differently?"

"does formal mathematics make such a distinction between the two cases?"

Then you will see where I am coming from here.

This difference between signed and unsigned is arbitrary. It serves no good purpose. It ADDS confusion to the programming process. And it is diametrically opposed to what the hardware does for both cases, which is to handle them exactly the same.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

hgm wrote:
Rein Halbersma wrote:The reason that some program constructs in C lead to undefined behavior is that the mapping of C to hardware is not one-to-one. Some things cannot be defined in a portable way because of the enormous diversity in hardware (memory models, bus-width, cache coherence etc.).
That's a stretch. What does the C compiler know about any of those? Absolutely nothing. What does the programmer know? Whatever he knows. So why would the compiler inject some oddball optimization that breaks (somehow, although how I can't imagine) cache coherency on any machine available today?

The only difference the compiler sees from platform to platform is the instruction set. And it doesn't even see that until it gets ready to produce a machine-language translation and then optimize.

Yes, I understand that. And that is OK with me. But what is not OK with me is that it should be allowed to use this as an excuse to dream up other responses than any existing or conceivable hardware would ever exhibit. "I could not know if you start counting the drawers of your filing cabinet at 0 or 1. So now that you asked me to file something, I must put a bullet through your brain, and you have no right to complain about it, because you asked for it." I think anyone (or at least any judge) would agree that this is socially unacceptable behavior, and would put anyone using that as a defence away for murder, rather than for "assistance in a suicide". Well, this is exactly the same, so I don't see why it should be acceptable here. UB is supposed to stand for undefined behavior, not for undesired behavior!

I can see perfectly legitimate use for expressions like if(a+6 > a). Like testing whether a is close enough to not overflowing for calculating what I want to calculate next, e.g. to prevent a sqrt() error on a negative operand. Assuming that it would be helpful in general to replace this by 6 > 0, and then 1, borders on moronic. If a programmer had meant if(1), he would not have written the if() in the first place. It wrecks lots of useful applications of the construct, for no upside.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

rbarreira wrote:
bob wrote:
Sorry, but do you know x86 asm?

movw $20000, %ax
imulw 2
idivw 2

result = 20000

Exactly what I expected. No idea what machine you are running on, but on x86, multiplying two 16 bit ints gives you a 32 bit int; then the divide hopefully reduces it back to a 16 bit int, otherwise the overflow flag is set. Either way, you get the right answer, or a wrapped answer, not an undefined answer.

I have no intention of writing x = a * b / c and hope that a*b wraps before the divide. I actually do normal math.

But if I do something like a++ and a wraps, I expect that to happen if I do it enough times, or if a starts off large.

I DO expect that (a * b) / c == a * (b / c) which x86 is capable of doing quite correctly. I don't want the operator functionality changed. I just want the compiler to stay out of it.
Bob, I'm starting to wonder if your account has been taken over by someone. Now you're displaying ignorance about basic C and integer numbers in one single post.
Eh? What, exactly, does C know about integer numbers, other than how to translate C into machine language instructions that manipulate them as the operators demand? You DO realize that in the above case, the compiler does EXACTLY what it should. UNLESS you insert constants that will overflow the basic word size. Then it will completely whack the code for no good reason.

Is (a * b) / c REALLY any different, in any possible way, from (0x40000000 * 2) / 2, where a, b and c are 0x40000000, 2 and 2? If so, please explain exactly why/how they are different. Yet the compiler produces different results. For the first code, I get roughly 2.1 billion, the equivalent of 0x40000000; for the second I get roughly -1 billion (it internally replaces 0x40000000 * 2 by -2.1 billion, then divides by 2 and gets -1.05 billion). Same code, different results. Bogus compilation IMHO. Constants and variables should be able to be used interchangeably according to every programming book I have ever used. You seem to think they are different, somehow.
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: How could a compiler break the lockless hashing method?

Post by wgarvin »

rbarreira wrote:Bob, I'm starting to wonder if your account has been taken over by someone. Now you're displaying ignorance about basic C and integer numbers in one single post.
I know. It's like talking to a wall. He just thinks he knows better than everyone else, even when he's repeating the same arguments that have been refuted multiple times already in the same thread.

He gives examples from x86 code of "how compilers ought to do it" that haven't been current for almost 20 years, like his ridiculous imul / idiv example. Come on, bob! There's more than one form of imul instruction in x86 assembly, and there has been for a long time. You can use the one you quoted, but it has severe restrictions on which registers it can use. The one compilers usually use is the 32 * 32 -> 32 bit form, because it can be used with any of the general-purpose registers. And also because they know they only need 32 bits of result, since that's how the C language has been specified, since forever. Compilers also sometimes implement the multiply using shifts and adds, and they definitely implement divide by a constant using other instructions besides idiv, because idiv can be slow. All of this has been true for decades.

Bob's stubbornness in a debate is pretty amazing. I can only imagine the cognitive dissonance it must cause to argue, and believe, such a mix of contradictory and nonsensical positions.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

Henk wrote:I think that the described lockless hashing method won't scale well as the number of threads increases, since the chance of an error per hash table entry increases.
Depends on what you mean by scale. It has been run for hours on a 64 cpu Itanium without a single reported error. I have not run on more than 64 cores however. In any case, the more cores you have, the greater the probability of a normal hash collision also. We are only using 64 bits for the hash signature, which is about 100 bits less than the best actual unique position encoding requires.
syzygy
Posts: 5713
Joined: Tue Feb 28, 2012 11:56 pm

Re: How could a compiler break the lockless hashing method?

Post by syzygy »

bob wrote:
syzygy wrote:You miss the point completely.

You have been arguing all the time: signed integer overflow should wrap.

So what happens if signed integer overflow wraps? Let's take 16-bit ints to keep the numbers smaller.
x = 20000
x * 2 = -25536
x * 2 / 2 = -12768

So if you want signed int overflow to wrap, then clearly x * 2 / 2 cannot be optimised to x.

If you want signed int overflow to wrap, then preventing overflow, as you are now suggesting the compiler should do, is simply very clearly dead wrong.
Sorry, but do you know x86 asm?

movw $20000, %ax
imulw 2
idivw 2

result = 20000

Exactly what I expected. No idea what machine you are running on, but on x86, multiplying two 16 bit ints gives you a 32 bit int; then the divide hopefully reduces it back to a 16 bit int, otherwise the overflow flag is set. Either way, you get the right answer, or a wrapped answer, not an undefined answer.

I have no intention of writing x = a * b / c and hope that a*b wraps before the divide. I actually do normal math.

But if I do something like a++ and a wraps, I expect that to happen if I do it enough times, or if a starts off large.

I DO expect that (a * b) / c == a * (b / c) which x86 is capable of doing quite correctly. I don't want the operator functionality changed. I just want the compiler to stay out of it.
Do you have any clue what you are asking from the compiler when you write x * 2 / 2?

If x is a signed int, and you want signed overflow to wrap, which so far you have always wanted, then what you are asking is to get -12768 when x = 20000 and ints are 16-bit.

If you do not understand this, you should not touch a compiler.
rbarreira
Posts: 900
Joined: Tue Apr 27, 2010 3:48 pm

Re: How could a compiler break the lockless hashing method?

Post by rbarreira »

bob wrote:
rbarreira wrote:
bob wrote:For example, the x * 2 / 2 example does NOT overflow on x86 if done right, and it produces X every time, exactly as it should, no clever optimizer needed. So why not do that.
Because that would mean changing the standard to f*** up the whole type system. If "x" is a 32-bit int then "x * 2" is also an int, not a 64-bit number.

On every machine I have programmed on in the last 20 years, a 32 bit multiply produces a 64 bit result using two registers. You pick the machine. Vax. Sparc. X86. Etc. EVERYONE is aware of the potential problem dealing with the mathematical equality

(a * b) / c == a * (b / c)

so long as you ignore integer truncation.

Feel free to explain how the simple multiply fix everyone uses "f***'s up the whole type system". It has been used for 30 years or longer. The compiler does it RIGHT so long as it can't see the actual values to (apparently) say to itself "Aha, integer overflow is possible, I'm going to screw this up on purpose." If the compiler does the instructions normally, it works perfectly. You seem to think it fails most of the time. It does NOT.

In fact, it only fails when the compiler starts mucking around and "recognizes" an overflow that it can intentionally screw up.
I pick the machine? OK, how about I pick ARM CPUs which are extremely popular these days?

http://infocenter.arm.com/help/index.js ... IHGGJ.html

"The MUL instruction multiplies the values from Rm and Rs, and places the least significant 32 bits of the result in Rd."

Or, as Wylie said, how about the extremely common shr / shl / sar / sal (unsigned / signed shift) instructions on x86 which can also perform some multiplications and divisions?
mvk
Posts: 589
Joined: Tue Jun 04, 2013 10:15 pm

Re: How could a compiler break the lockless hashing method?

Post by mvk »

wgarvin wrote:I know. It's like talking to a wall. He just thinks he knows better than everyone else, even when he's repeating the same arguments that have been refuted multiple times already in the same thread.
These discussions always remind me of this.
"The Black Knight always triumphs!"
[Account deleted]
Rein Halbersma
Posts: 751
Joined: Tue May 22, 2007 11:13 am

Re: How could a compiler break the lockless hashing method?

Post by Rein Halbersma »

bob wrote:
This difference between signed and unsigned is arbitrary. It serves no good purpose. It ADDS confusion to the programming process. And it is diametrically opposed to what the hardware does for both cases, which is to handle them exactly the same.
It might be a better world if signed integer overflow were implementation-defined rather than undefined; you'd hear no complaints from me if that were the case. But it isn't. So given that state of affairs, can you answer me why

Code: Select all

if (a + 1 > a)
is any better than

Code: Select all

if (a > INT_MAX - 1)

The former is undefined behavior; the latter is well-defined (its result depends only on the implementation's value of INT_MAX). Given that piece of knowledge (free of charge), why not use that?
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: How could a compiler break the lockless hashing method?

Post by bob »

rbarreira wrote:
bob wrote:
rbarreira wrote:
bob wrote:For example, the x * 2 / 2 example does NOT overflow on x86 if done right, and it produces X every time, exactly as it should, no clever optimizer needed. So why not do that.
Because that would mean changing the standard to f*** up the whole type system. If "x" is a 32-bit int then "x * 2" is also an int, not a 64-bit number.

On every machine I have programmed on in the last 20 years, a 32 bit multiply produces a 64 bit result using two registers. You pick the machine. Vax. Sparc. X86. Etc. EVERYONE is aware of the potential problem dealing with the mathematical equality

(a * b) / c == a * (b / c)

so long as you ignore integer truncation.

Feel free to explain how the simple multiply fix everyone uses "f***'s up the whole type system". It has been used for 30 years or longer. The compiler does it RIGHT so long as it can't see the actual values to (apparently) say to itself "Aha, integer overflow is possible, I'm going to screw this up on purpose." If the compiler does the instructions normally, it works perfectly. You seem to think it fails most of the time. It does NOT.

In fact, it only fails when the compiler starts mucking around and "recognizes" an overflow that it can intentionally screw up.
I pick the machine? OK, how about I pick ARM CPUs which are extremely popular these days?

http://infocenter.arm.com/help/index.js ... IHGGJ.html

"The MUL instruction multiplies the values from Rm and Rs, and places the least significant 32 bits of the result in Rd."

Or, as Wylie said, how about the extremely common shr / shl / sar / sal (unsigned / signed shift) instructions on x86 which can also perform some multiplications and divisions?
I thought someone had already pointed out that shifting a negative value left is "undefined behavior"?

But if a shift is dangerous, don't do it. Do the imul which works correctly. Or, of course, you can do a shld. BTW what do you know about (say) SAL and SHL? :)

As to your question, if I do that on the arm, I will get a 32 bit value. So what? If I do it on Intel, I get the right value. How many ARMs are there compared to X86 in terms of numbers? Close enough to zero for a statistician to say "zero"? :)

You want to penalize me on x86 because the ARM is a poor processor (the double-register thing has been used since the early 70's, it is not new). I prefer to get it right on hardware that can do so, and let the rest fend for themselves. You prefer to get it wrong on EVERYTHING. That reasoning simply escapes me. I can get it right way over 99% of the time, or I can get it wrong 100% of the time. Let's be consistent (and always get it wrong).