cache alignment of tt

Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: cache alignment of tt

Post by Rémi Coulom »

bob wrote:
Cardoso wrote:Hi Bob, sorry, to go back to the AlignedMalloc.
As you might remember I posted here a question about your AlignedMalloc not working on Windows 7, x64 using the MS visual C++ 2010.
I compiled crafty 23.4 and couldn't allocate more than 1 GB of hash.
I wonder if the bug is in the "(long)" cast in your code.
Shouldn't it be an _int64 on x64 Windows?

best regards,
Alvaro
I believe you are correct. I think that "long" on 64 bit windows is STILL 32 bits for reasons I can't fathom...

I will fix this in 23.5 since I use the int64_t type anyway... Will break things for 32 bit machines it would appear, but those machines are pretty much going away anyway...
I am not sure exactly what this discussion is about, but it seems to me that using size_t would be the proper way.

Rémi
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: cache alignment of tt

Post by bob »

Rémi Coulom wrote:
bob wrote:
Cardoso wrote:Hi Bob, sorry, to go back to the AlignedMalloc.
As you might remember I posted here a question about your AlignedMalloc not working on Windows 7, x64 using the MS visual C++ 2010.
I compiled crafty 23.4 and couldn't allocate more than 1 GB of hash.
I wonder if the bug is in the "(long)" cast in your code.
Shouldn't it be an _int64 on x64 Windows?

best regards,
Alvaro
I believe you are correct. I think that "long" on 64 bit windows is STILL 32 bits for reasons I can't fathom...

I will fix this in 23.5 since I use the int64_t type anyway... Will break things for 32 bit machines it would appear, but those machines are pretty much going away anyway...
I am not sure exactly what this discussion is about, but it seems to me that using size_t would be the proper way.

Rémi
I took a quick look at stdint.h, and it looks like "uintptr_t" is the correct type for portably holding a pointer value as an integer. I saw examples of 32 bit processors with > 32 bit address spaces, and even a 64 bit processor with a < 32 bit address space (Cray 1 series through the T90 in fact).

size_t isn't guaranteed to be as big as a pointer, just big enough to hold the largest sizeof() value that can be returned... There is also ptrdiff_t, which might work as well since it is meant to hold the difference between two pointers, but it is signed and might cause an issue...
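
For readers following along, here is a minimal sketch of what goes wrong when the address is rounded up through a 32-bit integer, which is what a "(long)" cast amounts to on 64-bit Windows. It may explain the problem Cardoso reported, but that is not confirmed in this thread. The function names are hypothetical, and a uint64_t stands in for the pointer value so the demonstration behaves the same on any host:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round addr up to a power-of-two alignment, but pass
   the address through a 32-bit integer first, as a (long) cast does on
   64-bit Windows. */
uint64_t align_up_via_32bit(uint64_t addr, uint32_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    uint32_t narrow = (uint32_t) addr;   /* upper 32 bits are dropped here */
    return (uint64_t) ((narrow + (alignment - 1)) & ~(alignment - 1));
}

/* Hypothetical helper: the same rounding done with a full-width integer,
   which is the role uintptr_t plays for a real pointer. */
uint64_t align_up_full_width(uint64_t addr, uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + (alignment - 1)) & ~(alignment - 1);
}
```

For an address below 4 GB the two functions agree. For an address above 4 GB the 32-bit version returns a value that no longer points into the allocated block.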

I tried uintptr_t and it worked on my linux box...

Actually after looking at this, I am not sure what I am currently doing is actually correct for all platforms. Here's my "AlignedMalloc()" function:

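/* segments[][] and nsegments are globals declared elsewhere:
   [0] holds the pointer malloc() returned, [1] the aligned pointer. */
void AlignedMalloc(void **pointer, int alignment, size_t size) {
  /* Over-allocate so an aligned address always fits inside the block. */
  segments[nsegments][0] = malloc(size + alignment - 1);
  /* Round up to the next multiple of alignment (must be a power of two). */
  segments[nsegments][1] =
      (void *) (((uintptr_t) segments[nsegments][0] + alignment - 1) &
                ~(uintptr_t) (alignment - 1));
  *pointer = segments[nsegments][1];
  nsegments++;
}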

You pass it 3 values: a pointer to a pointer where it should store the address of the malloc'ed memory, an alignment value (64 is currently used), and a size (how much memory to malloc).
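
A self-contained variant of the same idea might look like the sketch below. It is not Crafty's code: the AlignedBlock struct and both function names are hypothetical, and it assumes the alignment is a power of two. It keeps the raw pointer next to the aligned one because only the pointer that malloc() returned may be passed to free():

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical record pairing the pointer malloc() returned with the
   aligned pointer handed to the caller. */
typedef struct {
    void *raw;
    void *aligned;
} AlignedBlock;

/* Allocate size bytes aligned to a power-of-two boundary.
   Returns 0 on success, -1 on failure. */
int aligned_block_alloc(AlignedBlock *block, size_t alignment, size_t size) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    block->raw = NULL;
    block->aligned = NULL;
    if (size > SIZE_MAX - (alignment - 1))
        return -1;                     /* size + alignment - 1 would overflow */
    block->raw = malloc(size + alignment - 1);
    if (block->raw == NULL)
        return -1;
    uintptr_t addr = (uintptr_t) block->raw;
    uintptr_t mask = (uintptr_t) alignment - 1;
    block->aligned = (void *) ((addr + mask) & ~mask);  /* round up */
    return 0;
}

/* Release the block through the raw pointer, never the aligned one. */
void aligned_block_free(AlignedBlock *block) {
    free(block->raw);
    block->raw = NULL;
    block->aligned = NULL;
}
```

Taking the alignment as size_t rather than int avoids mixing signed and unsigned arithmetic in the mask, and the overflow check rejects sizes that cannot be padded.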

I am not sure size_t is correct for the third argument after reading a bit. It almost seems that this should be uintptr_t as well, or at least ptrdiff_t. But a negative value makes no sense...

Will have to do some research on this (again)... but for the moment, the above works on a 64 bit machine with 12 gigs of memory and I tested with hash=8192M with no problems...
Last edited by bob on Thu Mar 15, 2012 6:08 pm, edited 1 time in total.
Daniel Shawul
Posts: 4185
Joined: Tue Mar 14, 2006 11:34 am
Location: Ethiopia

Re: cache alignment of tt

Post by Daniel Shawul »

Yes uintptr_t is what I used too.
diep
Posts: 1822
Joined: Thu Mar 09, 2006 11:54 pm
Location: The Netherlands

Re: cache alignment of tt

Post by diep »

jdart wrote:I do align hash tables and other large structures.

I think it helps but you should not expect a large gain. Most times your threads are not accessing overlapping cache lines in the hash table, anyway.

--Jon
It might help on old hardware.

On newer hardware I measure no difference with Diep, which aligns itself.
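
To make the cache-line point concrete, here is a small sketch that counts how many cache lines a hash probe touches, depending on where the bucket starts. The function name is hypothetical, and the 64-byte line and 64-byte bucket used in the note after it are assumptions that match the alignment value mentioned earlier in the thread:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of cache lines touched by an object of `size` bytes starting at
   address `addr`, for a power-of-two `line_size`. */
size_t cache_lines_touched(uint64_t addr, size_t size, size_t line_size) {
    assert(size > 0);
    assert(line_size != 0 && (line_size & (line_size - 1)) == 0);
    uint64_t first = addr / line_size;              /* line of first byte */
    uint64_t last = (addr + size - 1) / line_size;  /* line of last byte */
    return (size_t) (last - first + 1);
}
```

A 64-byte bucket that starts on a 64-byte boundary sits in one line, while the same bucket starting 16 bytes later spans two. Whether that extra line shows up as a measurable slowdown depends on the hardware, as the posts above suggest.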