bug in : hash = malloc(hash_size + 63) & ~63

Discussion of chess software programming and technical issues.

Moderator: Ras

Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

bug in : hash = malloc(hash_size + 63) & ~63

Post by Chan Rasjid »

Code: Select all

// 1) bug
hash = malloc(hash_size + 63) & ~63;
// 2) should be -
hash = (malloc(hash_size + 64) & ~63) + 64;
I retrieved the line by Bob Hyatt from an old post. I think it has a bug. If 'hash' from 1) is assigned to the transposition table, it means it overwrites memory beyond the original starting address from malloc().

I have some sort of segmentation fault, etc when the original pointer from malloc() was free()'ed - main() did not exit with a clear status 0; 2) solves the problem.

Rasjid
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by bob »

Chan Rasjid wrote:

Code: Select all

// 1) bug
hash = malloc(hash_size + 63) & ~63;
// 2) should be -
hash = (malloc(hash_size + 64) & ~63) + 64;
I retrieved the line by Bob Hyatt from an old post. I think it has a bug. If 'hash' from 1) is assigned to the transposition table, it means it overwrites memory beyond the original starting address from malloc().

I have some sort of segmentation fault, etc when the original pointer from malloc() was free()'ed - main() did not exit with a clear status 0; 2) solves the problem.

Rasjid
How would this fail? You malloc N+63 bytes, then force the rightmost 6 bits to zero with the and. Suppose the original hash address is aligned at address 1024 (x"00000400') You get hash = 0x400 + 0x3f, or 0x43f, then you and with 0xffffffc0, binary:

1111 1111 1100 0000
0000 0100 0011 1111

which is

0000 0100 0000 0000

which looks perfect (I did the above using rightmost 16 bits only.

Suppose you get hash = 0x418. Add the 3f to it and you get 0x457

1111 1111 1100 0000
0000 0100 0101 0111

which is
0000 0100 0100 0000

which looks right to me.

Note that when I did the original malloc (assume I asked for 64 bytes total) and got 400, I grabbed addresses 0x400 - 0x400 + 64 + 63 or 0x400 - 0x47f.

Any address between start and start+63 (I asked for 64 bytes) certainly fits in that range and gives 0x400 - 0x43f which is inside the 63-byte larger block.

For the second example, which returned address 0x418. I asked for 64 + 63 extra bytes, which is 0x7f. So I got addresses 0x418 - 0x497 and I am going to use addresses 0x440 - 0x47f.

I am not sure what you think is wrong, or what I am overlooking if something really is wrong...
abulmo
Posts: 151
Joined: Thu Nov 12, 2009 6:31 pm

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by abulmo »

Chan Rasjid wrote:

Code: Select all

// 1) bug
hash = malloc(hash_size + 63) & ~63;
// 2) should be -
hash = (malloc(hash_size + 64) & ~63) + 64;
I have some sort of segmentation fault, etc when the original pointer from malloc() was free()'ed - main() did not exit with a clear status 0; 2) solves the problem.
Of course you can only free what malloc returns, without masking it.

Code: Select all

memory = malloc(hash_size + 63);
hash = memory & ~63;
...
free(memory); // ok
free(hash); // wrong
Richard
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by Chan Rasjid »

abulmo wrote:
Chan Rasjid wrote:

Code: Select all

// 1) bug
hash = malloc(hash_size + 63) & ~63;
// 2) should be -
hash = (malloc(hash_size + 64) & ~63) + 64;
I have some sort of segmentation fault, etc when the original pointer from malloc() was free()'ed - main() did not exit with a clear status 0; 2) solves the problem.
Of course you can only free what malloc returns, without masking it.

Code: Select all

memory = malloc(hash_size + 63);
hash = memory & ~63;
...
free(memory); // ok
free(hash); // wrong
my free() is ok - free(memory), the original saved pointer returned from malloc().

I don't know much about hex/binary arithmetic or about address alignment. I only do what is recommended, try to align my TT start address to 64/128 bytes so that when I do multi-probe of hash at TT, TT+1,TT+2,TT+3, all 4 would be in the same cache-line.

I have to think a little about Bob's reply. I don't get it yet.

What I mean in my original post is this:
with your code above as example, the address of the block we can write to starts from 'memory'. But the address 'hash' is before the address 'memory' or hash < memory. So if we store a hash record with :
hash[0] = ...something;
that slot is outside (before) the block allocated by malloc(). This means memory corruption of some 64 bytes of data just before 'memory'.

I am not sure as I still am thinking about Bob's reply.

Rasjid
User avatar
hgm
Posts: 28386
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by hgm »

bob wrote:How would this fail?

....

Suppose you get hash = 0x418. Add the 3f to it and you get 0x457

1111 1111 1100 0000
0000 0100 0101 0111

which is
0000 0100 0100 0000

which looks right to me.
The _size_ looks right. But the start address, assigned to hash, will be 0x418 & ~63 = 0x400, which is _outside_ the allocated area (running from 0x418 to N+0x457).

So actually it is very wrong (probably overwriting some pointers used by malloc for later freeing the allocated area, in addition to other stuff before it).
Chan Rasjid
Posts: 588
Joined: Thu Mar 09, 2006 4:47 pm
Location: Singapore

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by Chan Rasjid »

hgm wrote:
bob wrote:How would this fail?

....

Suppose you get hash = 0x418. Add the 3f to it and you get 0x457

1111 1111 1100 0000
0000 0100 0101 0111

which is
0000 0100 0100 0000

which looks right to me.
The _size_ looks right. But the start address, assigned to hash, will be 0x418 & ~63 = 0x400, which is _outside_ the allocated area (running from 0x418 to N+0x457).

So actually it is very wrong (probably overwriting some pointers used by malloc for later freeing the allocated area, in addition to other stuff before it).
So if it is a bug, the hash implementation would corrupt about 64 bytes of memory space. Most chess programs would unlikely be hurt in anyway as the heap memory is only use by just one call to malloc() for the TT. The corrupted bytes are unlikely to have anything (???). But i am not sure about SMP programs.

Rasjid
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by bob »

How would that happen:

t = malloc(n + 63);

allocates n bytes.

I then add 63 to the start address and then AND off the rightmost 6 bits. Did you miss the fact that I malloc 63 more than I need, then add 63 to the address before scrubbing the rightmost 6 bits? If I start off with an address with the rightmost 6 bits zero, adding 63 and then scrubbing them does nothing. If any of the rightmost 6 bits are set, this rounds the address up to the next 64 byte boundary before scrubbing the rightmost 6 bits...

In looking at your comments, you said "0x418 & ~63, but that isn't what I do, it is 0x418 + 63, and then I and the total with ~63...
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by Sven »

Chan Rasjid wrote:
abulmo wrote:
Chan Rasjid wrote:

Code: Select all

// 1) bug
hash = malloc(hash_size + 63) & ~63;
// 2) should be -
hash = (malloc(hash_size + 64) & ~63) + 64;
I have some sort of segmentation fault, etc when the original pointer from malloc() was free()'ed - main() did not exit with a clear status 0; 2) solves the problem.
Of course you can only free what malloc returns, without masking it.

Code: Select all

memory = malloc(hash_size + 63);
hash = memory & ~63;
...
free(memory); // ok
free(hash); // wrong
my free() is ok - free(memory), the original saved pointer returned from malloc().

I don't know much about hex/binary arithmetic or about address alignment. I only do what is recommended, try to align my TT start address to 64/128 bytes so that when I do multi-probe of hash at TT, TT+1,TT+2,TT+3, all 4 would be in the same cache-line.

I have to think a little about Bob's reply. I don't get it yet.

What I mean in my original post is this:
with your code above as example, the address of the block we can write to starts from 'memory'. But the address 'hash' is before the address 'memory' or hash < memory. So if we store a hash record with :
hash[0] = ...something;
that slot is outside (before) the block allocated by malloc(). This means memory corruption of some 64 bytes of data just before 'memory'.

I am not sure as I still am thinking about Bob's reply.

Rasjid
The Crafty 23.1 code shows how this works, look at utility.c and search for AlignedMalloc, for instance. Basically that translates into something like this, with "alignment=64":

Code: Select all

memory = size + 63;
aligned_memory = (memory + 63) & ~63;
// use 'aligned_memory'
// finally, free 'memory'
'memory' points to a block which is 63 bytes bigger than actually needed. 'aligned_memory' points to an address between 'memory' and 'memory + 63' since zeroing the lower 6 bits can at most reduce the address by 63. So there is no bug involved IMO.

Sven
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by bob »

Chan Rasjid wrote:
abulmo wrote:
Chan Rasjid wrote:

Code: Select all

// 1) bug
hash = malloc(hash_size + 63) & ~63;
// 2) should be -
hash = (malloc(hash_size + 64) & ~63) + 64;
I have some sort of segmentation fault, etc when the original pointer from malloc() was free()'ed - main() did not exit with a clear status 0; 2) solves the problem.
Of course you can only free what malloc returns, without masking it.

Code: Select all

memory = malloc(hash_size + 63);
hash = memory & ~63;
...
free(memory); // ok
free(hash); // wrong
my free() is ok - free(memory), the original saved pointer returned from malloc().

I don't know much about hex/binary arithmetic or about address alignment. I only do what is recommended, try to align my TT start address to 64/128 bytes so that when I do multi-probe of hash at TT, TT+1,TT+2,TT+3, all 4 would be in the same cache-line.

I have to think a little about Bob's reply. I don't get it yet.

What I mean in my original post is this:
with your code above as example, the address of the block we can write to starts from 'memory'. But the address 'hash' is before the address 'memory' or hash < memory. So if we store a hash record with :
hash[0] = ...something;
that slot is outside (before) the block allocated by malloc(). This means memory corruption of some 64 bytes of data just before 'memory'.

I am not sure as I still am thinking about Bob's reply.

Rasjid
I do not see how you can get an address "before" the malloc'ed memory.

Again,

Code: Select all

mem = malloc(N+63);
aligned_mem = (mem + 63) & ~63;
Unless I am overlooking something due to a "senior moment" I don't see how that can fail. My current code can malloc() and free() things over and over as you change sides, with no problems at all...

I'm going to write a bit of C code to check the original and aligned addresses... More in a bit...
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: bug in : hash = malloc(hash_size + 63) & ~63

Post by Sven »

Bob, your code is o.k., I think Chan may have missed that both the malloc() expression and the expression to get the aligned address contain a "+ 63". I derive this from his original post where he has combined both into one expression which is incorrect.

Sven