Passing int64 (bitboard) by value or by reference?

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Passing int64 (bitboard) by value or by reference?

Post by mcostalba »

On a 32-bit system, is it faster to pass a bitboard (an unsigned 64-bit integer) by value or by reference?

void do_something(Bitboard b);

or

void do_something(const Bitboard& b);


I know I can test this myself (and I'm going to), but my tests are limited to my platform and compiler. Perhaps someone has more general and proven results.

Thanks in advance
Marco


P.S.: As a subcase, consider inline functions as well:

inline void do_something(Bitboard b);

or

inline void do_something(const Bitboard& b);
Arash

Re: Passing int64 (bitboard) by value or by reference?

Post by Arash »

Hi,

If your sole purpose is passing a 64-bit value, then on a 32-bit system passing by reference will be faster, because only a 32-bit address is copied. But using a reference inside a function has limitations: changing the value through a non-const reference inside the callee has side effects in the caller. If you work around this by copying the value inside the callee, passing by reference will be slower.

For inline functions, passing by reference means there is no copying. If you do not change the value it will be faster, and if you copy the value in order to change it the speed will be the same.
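
To make the trade-off above concrete, here is a minimal sketch of a callee that has to modify its argument, written both ways. The `Bitboard` typedef and the names `count_bits_by_value` and `count_bits_by_ref` are assumptions for illustration only, not code from any engine mentioned in this thread.

```cpp
#include <cstdint>

typedef std::uint64_t Bitboard;  // assumed definition of Bitboard

// By value: the callee owns its copy and may modify it freely.
int count_bits_by_value(Bitboard b) {
    int n = 0;
    while (b) {
        b &= b - 1;  // clear the least significant set bit
        ++n;
    }
    return n;
}

// By const reference: the caller's object cannot be modified,
// so the callee makes a local copy before it starts clearing bits.
int count_bits_by_ref(const Bitboard& b) {
    Bitboard copy = b;  // explicit copy inside the callee
    int n = 0;
    while (copy) {
        copy &= copy - 1;
        ++n;
    }
    return n;
}
```

Both functions return the same count and leave the caller's bitboard untouched; which one is faster on a given 32-bit target is exactly what would have to be measured.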


Arash
Aleks Peshkov
Posts: 977
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia
Full name: Aleks Peshkov

Re: Passing int64 (bitboard) by value or by reference?

Post by Aleks Peshkov »

On an x86 system the difference would be minimal, random, and dependent on context.

You will want to port your program to 64-bit sooner or later, so passing by value is better because it avoids rewriting code. If you plan to port your program to small devices, then you may have to use references, because the compiler there might not support passing 64-bit values.

I have a similar problem: the Microsoft C++ compiler does not support passing 128-bit SSE variables by value, but Intel and GCC do. It is not possible to write the best code without conditional preprocessor tricks.
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Passing int64 (bitboard) by value or by reference?

Post by Gerd Isenberg »

mcostalba wrote:On a 32bit system it is faster to pass a bitboard (unsigned int 64bit) by value or by reference?

void do_something(Bitboard b);

or

void do_something(const Bitboard& b);
At some point the bitboards need to be fetched into a register or register pair anyway, either inside the callee or, if passing by fastcall, already inside the caller.

One may argue that pushing 64-bit parameters takes more space on the stack (and time) than pushing a 32-bit pointer (reference), but a single "fastcall" parameter passed by value in x86 edx:ecx is still preferable in 32-bit mode, assuming enough scratch registers are available inside the callee. Passing by const ref has some issues with expressions or constant bitboards, since the compiler has to store the result like an automatic variable on the stack in order to pass a pointer to this memory location as the reference to the callee.

I would always pass supported basic data types (long long or __int64) by value instead of by const ref, assuming only one or up to two 64-bit arguments, even if it might be a tad slower now and then on 32-bit platforms. For (small) inlined functions the compiler will likely emit the same code anyway.
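
The point about expressions and constant bitboards can be observed directly in C++: when an expression is passed by const reference, the reference is bound to a temporary object rather than to any named variable. The sketch below is only an illustration; the `Bitboard` typedef and the helpers `refers_to` and `expression_binds_to_temporary` are hypothetical names. An optimizing compiler may well remove the actual stack store once the call is inlined.

```cpp
#include <cstdint>

typedef std::uint64_t Bitboard;  // assumed definition of Bitboard

// True if the reference `arg` is bound to the object `obj` itself.
bool refers_to(const Bitboard& arg, const Bitboard& obj) {
    return &arg == &obj;
}

// Hypothetical demonstration: a named variable is passed as itself,
// while the result of an expression is passed via a temporary.
bool expression_binds_to_temporary() {
    Bitboard a = 0xFF00ULL;
    Bitboard c = 0x0FF0ULL;
    bool named_is_same = refers_to(a, a);       // bound directly to a
    bool temp_is_a     = refers_to(a & c, a);   // a & c lives in a temporary
    bool temp_is_c     = refers_to(a & c, c);
    return named_is_same && !temp_is_a && !temp_is_c;
}
```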
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Passing int64 (bitboard) by value or by reference?

Post by Gerd Isenberg »

Aleks Peshkov wrote: I have similar problem, Microsoft C++ compiler does not support passing 128-bit SSE variables by value, but Intel and GCC do. It is not possible to write best code without conditional preprocessor tricks.
Really? I just compiled the following code with VC2008 on a 32-bit Windows system with a P4 and it worked:

Code: Select all

#include <emmintrin.h>

__m128i westOne(__m128i b) {
   b = _mm_srli_epi64 (b, 1);
   b = _mm_add_epi8   (b, b);
   b = _mm_srli_epi64 (b, 1);
   return b;
}

Code: Select all

_TEXT	SEGMENT
?westOne@@YA?AT__m128i@@T1@@Z PROC			; westOne, COMDAT
; _b$ = xmm0
  00000	66 0f 73 d0 01	 psrlq	 xmm0, 1
  00005	66 0f fc c0	 paddb	 xmm0, xmm0
  00009	66 0f 73 d0 01	 psrlq	 xmm0, 1
  0000e	c3		 ret	 0
?westOne@@YA?AT__m128i@@T1@@Z ENDP			; westOne
_TEXT	ENDS
The only issue I had here was that these two intrinsics

Code: Select all

__m128i _mm_cvtsi64_si128(__int64);
__int64 _mm_cvtsi128_si64(__m128i);
were not available, because <emmintrin.h> declares them only for 64-bit targets:

Code: Select all

#if defined(_M_AMD64)
extern __int64 _mm_cvtsd_si64(__m128d);
extern __int64 _mm_cvttsd_si64(__m128d);
extern __m128d _mm_cvtsi64_sd(__m128d, __int64);
extern __m128i _mm_cvtsi64_si128(__int64);
extern __int64 _mm_cvtsi128_si64(__m128i);
/* Alternate intrinsic name definitions */
#define _mm_stream_si64 _mm_stream_si64x
#endif
Therefore I used this test caller:

Code: Select all

unsigned long long foo(unsigned long long x) {
	__m128i a;
	a = _mm_loadl_epi64((__m128i *)&x);
	a = westOne(a);
	_mm_storel_epi64((__m128i *)&x, a);
	return x;
}
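
For readers who do not use SSE2, the following scalar sketch shows what the `westOne` routine above computes for each 64-bit lane: shifting right by one, doubling every byte, and shifting right by one again amounts to `(b >> 1)` with the top bit of every byte cleared. The name `westOneScalar` and the `Bitboard` typedef are assumptions, and the reading as a "west" shift assumes the usual mapping where bit 0 is a1 and bit 7 of each byte is the h-file.

```cpp
#include <cstdint>

typedef std::uint64_t Bitboard;  // assumed definition of Bitboard

// Scalar equivalent of the SSE2 sequence psrlq 1, paddb, psrlq 1:
// shift every square one file west and drop bits that would wrap
// around from the a-file into the h-file of the rank below.
Bitboard westOneScalar(Bitboard b) {
    const Bitboard notHFile = 0x7f7f7f7f7f7f7f7fULL;
    return (b >> 1) & notHFile;
}
```

With this mapping, a piece on b1 moves to a1, while pieces on the a-file disappear instead of wrapping around.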
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Passing int64 (bitboard) by value or by reference?

Post by mcostalba »

Aleks Peshkov wrote: You will want to port your program to 64-bit sooner or later, so passing by value is better to avoid code rewriting. If you plan to port your program to small devices, then you may have to use references, because it may be a case when compiler could not support passing 64-bit values.
My program is already 64-bit compatible, but it also works on 32-bit systems.

Actually, what I plan to do is:

Code: Select all

#if defined is_64BIT
typedef Bitboard BitboardArg;
#else
typedef const Bitboard& BitboardArg;
#endif
and then the functions that use bitboards (without modifying them) become
void do_something(BitboardArg b);
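
A self-contained sketch of this plan could look as follows. The `Bitboard` typedef, the way `is_64BIT` is derived from predefined compiler macros, and the function `more_than_one` are assumptions added for illustration; the original post only shows the typedef switch itself.

```cpp
#include <cstdint>

typedef std::uint64_t Bitboard;  // assumed definition of Bitboard

// Assumption: derive is_64BIT from common predefined macros.
// A real project would more likely set it from its build system.
#if defined(_WIN64) || defined(__x86_64__) || defined(__aarch64__) || defined(__LP64__)
#define is_64BIT
#endif

#if defined is_64BIT
typedef Bitboard BitboardArg;         // 64-bit target: pass by value
#else
typedef const Bitboard& BitboardArg;  // 32-bit target: pass by const reference
#endif

// Hypothetical read-only user of a bitboard; the body is identical
// for both definitions of BitboardArg.
bool more_than_one(BitboardArg b) {
    return (b & (b - 1)) != 0;
}
```

Because the function never modifies its argument, call sites such as `more_than_one(occupied)` compile unchanged on both kinds of target.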
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: Passing int64 (bitboard) by value or by reference?

Post by mcostalba »

Gerd Isenberg wrote: I would always pass supported basic data types (long long or __int64) by value instead of by const ref, assuming only one or up to two 64-bit arguments, even if it might be a tad slower now and then on 32-bit platforms. For (small) inlined functions the compiler will likely emit the same code anyway.
Thanks for your answer.

Sorry if I ask, but have you already tested this in the past?

Thanks
Marco
Aleks Peshkov
Posts: 977
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia
Full name: Aleks Peshkov

Re: Passing int64 (bitboard) by value or by reference?

Post by Aleks Peshkov »

Gerd Isenberg wrote:
Aleks Peshkov wrote: I have similar problem, Microsoft C++ compiler does not support passing 128-bit SSE variables by value, but Intel and GCC do. It is not possible to write best code without conditional preprocessor tricks.
Really? I just compiled following code with VC2008 on a 32-bit windows system with a P4 and it worked:
I tested in VC2008 Express (32-bit mode).

Code: Select all

error C2719: 'unnamed-parameter': formal parameter with __declspec(align('16')) won't be aligned
It is documented that VC does not use SSE registers for passing SSE values in 64-bit mode. It seems to have problems converting values to references.

I tried to use SSE arguments in class methods with several other parameters; maybe that is the reason.
Aleks Peshkov
Posts: 977
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia
Full name: Aleks Peshkov

Re: Passing int64 (bitboard) by value or by reference?

Post by Aleks Peshkov »

Well, it does understand __m128i arguments, but it fails to compile this:

Code: Select all

class Test {
    __m128i n;
};

void test(Test) {}
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: Passing int64 (bitboard) by value or by reference?

Post by Gerd Isenberg »

mcostalba wrote: Sorry if I ask, but have you already tested this in the past?
Not with native bitboards, at least not recently. With 64-bit CPUs in mind it doesn't make much sense to me.

With quad-bitboard routines, using my own wrapper class with <SSE> as a class template parameter, I pass const refs as input parameters to inlined operators, and the generated assembly looks quite optimal for, let's say, Kogge-Stone fills in several directions. Values are kept in xmm registers across inlined function call boundaries; only the prologue and epilogue work on memory. The debug version works as well, but is about 100 times slower ;-)
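
The pattern described above can be sketched in scalar form. This is not the actual wrapper class: `BB` is a hypothetical stand-in that wraps a plain 64-bit integer instead of SSE registers, and `northOccluded` is a textbook Kogge-Stone occluded fill, assuming bit 0 is a1 and each rank is 8 bits higher. The point is only that small inline operators taking const references give the compiler the chance to keep everything in registers once inlined.

```cpp
#include <cstdint>

// Hypothetical scalar stand-in for an SSE-backed bitboard wrapper.
struct BB {
    std::uint64_t v;
};

// Small inline operators that take their inputs by const reference.
inline BB operator&(const BB& a, const BB& b) { return BB{a.v & b.v}; }
inline BB operator|(const BB& a, const BB& b) { return BB{a.v | b.v}; }
inline BB operator<<(const BB& a, int s) { return BB{a.v << s}; }

// Kogge-Stone occluded fill towards north: generators slide north
// through squares set in propagators, in three doubling steps.
inline BB northOccluded(const BB& generators, const BB& propagators) {
    BB gen = generators;
    BB pro = propagators;
    gen = gen | (pro & (gen << 8));
    pro = pro & (pro << 8);
    gen = gen | (pro & (gen << 16));
    pro = pro & (pro << 16);
    gen = gen | (pro & (gen << 32));
    return gen;
}
```

For example, a generator on a1 with all squares as propagators fills the whole a-file, and removing a4 from the propagators stops the fill at a3.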