gcc miscompilation in Stockfish 1.6

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

Bo Persson wrote: Trust your compiler!
Ok, I have rewritten Score as this:

Code: Select all

/// Score struct keeps a midgame and an endgame value in a single
/// integer, first LSB 16 bits are used to store endgame
/// value, while upper bits are used for midgame value.

struct Score {

    Score() {}
    Score(int val) : v(val) {}

    int v;
};

inline Value eg_value(Score s) { return Value(int16_t(s.v & 0xffff)); }
inline Value mg_value(Score s) { return Value((s.v + 32768) >> 16); }

inline Score make_score&#40;int mg, int eg&#41; &#123; return Score&#40;&#40;mg << 16&#41; + eg&#41;; &#125;

inline bool operator!=&#40;Score s1, Score s2&#41; &#123; return s1.v != s2.v; &#125;
inline Score operator-&#40;Score s&#41; &#123; return Score&#40;-s.v&#41;; &#125;
inline Score operator+&#40;Score s1, Score s2&#41; &#123; return Score&#40;s1.v + s2.v&#41;; &#125;
inline Score operator-&#40;Score s1, Score s2&#41; &#123; return Score&#40;s1.v - s2.v&#41;; &#125;
inline void operator+=&#40;Score& s1, Score s2&#41; &#123; s1.v += s2.v; &#125;
inline void operator-=&#40;Score& s1, Score s2&#41; &#123; s1.v -= s2.v; &#125;
inline Score operator*&#40;int i, Score s&#41; &#123; return Score&#40;i * s.v&#41;; &#125;

// Division must be handled separately for each term
inline Score operator/&#40;Score s, int i&#41; &#123; return make_score&#40;mg_value&#40;s&#41; / i, eg_value&#40;s&#41; / i&#41;; &#125;

// Only declared but not defined. We don't want to multiply two scores due to
// a very high risk of overflow. So user should explicitly convert to integer.
inline Score operator*&#40;Score s1, Score s2&#41;;
But I get the same result of forcing enum to an integer with

Code: Select all

enum Score &#123; a = INT_MIN, b = INT_MAX &#125;;
So there has to be something else somewhere .....also in this case with -O1 it works :shock:

I didn't verified the code is optimized.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: gcc miscompilation in Stockfish 1.6

Post by bob »

mcostalba wrote:I spent a good amount of today trying to hunt down a nasty optimization bug of gcc that miscompiles value.h with -O3 and with some version even with -O2, instead -O1 seems to work.

In particular one of the miscompiled functions (but not the only one) is:

Code: Select all

inline void operator+=&#40;Score& s1, Score s2&#41; &#123; s1 = Score&#40;int&#40;s1&#41; + int&#40;s2&#41;); &#125;

I really don't have any clue why on Intel and MSVC it works why gcc miscompiles. I have found it because adding debug info around the assignment instruction changes something and gcc starts to compile it correctly !!! :shock:

Unfortunately this was not the only one, so that instead of trying to find a workaround I opted for setting optimization under gcc to -O1 (the official builds are not affected because are done with Intel and MSVC).

This is of course a quick fix, but I really would like to understand what makes that (simple) assignment to be miscompiled.

Anyone has some ideas ?


Thanks
Marco
I think you have actually wasted some time. I don't see how this is any faster than just using two separate values, whether they are 32 bits or 64 bits is a matter for testing however. But walking over the edge in terms of syntactical tricks invites compilers to do different things since the code is ambiguous to the optimizer. The goal is to write code that everybody gets right. GCC is a very robust compiler, breaking it is almost always a user issue rather than a compiler issue. As it is here.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

bob wrote: GCC is a very robust compiler, breaking it is almost always a user issue rather than a compiler issue. As it is here.
If with -O1 and -O3 the functionality is different it is a compiler bug by definition. BTW rewriting Score as a struct instead of an enum (as posted above) does not change anything.

We already know it is not faster, but it is cleaner in this way because you avoid a lot of double assignments / updates scattered through the code (-70 lines of code if I remember correctly).
User avatar
Bo Persson
Posts: 243
Joined: Sat Mar 11, 2006 8:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Re: gcc miscompilation in Stockfish 1.6

Post by Bo Persson »

mcostalba wrote:
bob wrote: GCC is a very robust compiler, breaking it is almost always a user issue rather than a compiler issue. As it is here.
If with -O1 and -O3 the functionality is different it is a compiler bug by definition.
No, it is almost always a bug in the user code. The -O3 level pushes the envelope a bit further, and exposes more problems in the code. The 4.4 and 4.5 versions of gcc are more agressive in code transformations than the previous versions.

In my experience you find a genuine code generation bug about once every 10 years. This might very well be your turn, but I wouldn't bet on it.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

Bo Persson wrote: No, it is almost always a bug in the user code. The -O3 level pushes the envelope a bit further, and exposes more problems in the code. The 4.4 and 4.5 versions of gcc are more agressive in code transformations than the previous versions.
Yes, I have to agree.

I am really going mad to find the reason of the miscompile, changing the enum in a struct doesn't fixed the problem and I am sure it is the new 'Score' code that introduced the issue because until that revision gcc and icc binaries had same functionality.

So now my biggest suspect are theese:

Code: Select all

inline Value eg_value&#40;Score s&#41; &#123; return Value&#40;int16_t&#40;s.v & 0xffff&#41;); &#125;
inline Value mg_value&#40;Score s&#41; &#123; return Value&#40;&#40;s.v + 32768&#41; >> 16&#41;; &#125; 
The struct and enum versions of these functions are almost identical, that's the reason I suspect something wrong here.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc miscompilation in Stockfish 1.6

Post by Sven »

Bo Persson wrote:Come on, you are copying integers - how expensive is that?

Score s1 = Score(4);

is exactly the same as

int32_t s1 = 4;

The problem we have seen is that the compiler optimzes the code too much, not that it cannot see these simple cases!

Trust your compiler!
Fully agreed. Furthermore, C++ allows writing it the easier way:

Code: Select all

Score s1&#40;4&#41;;
Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc miscompilation in Stockfish 1.6

Post by Sven »

mcostalba wrote:Ok, I have rewritten Score as this:
[...]
But I get the same result of forcing enum to an integer with
[...]
So there has to be something else somewhere .....also in this case with -O1 it works :shock:
Marco, please check your PM.

Sven
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc miscompilation in Stockfish 1.6

Post by Sven »

gcc -O2/-O3 automatically enables -fstrict-aliasing. Could this be related to your trouble somehow? Maybe there are more issues like the one with b_union that has been mentioned here?

Sven
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

Bo Persson wrote: In my experience you find a genuine code generation bug about once every 10 years. This might very well be your turn, but I wouldn't bet on it.
May be it is my turn ;-)

I think I have found the miscompiled line, with following version it works for me:

Code: Select all

/// Score enum keeps a midgame and an endgame value in a single
/// integer &#40;enum&#41;, first LSB 16 bits are used to store endgame
/// value, while upper bits are used for midgame value.

enum Score &#123; ENSURE_32_BIT_SIZE = 1 << 31 &#125;;

inline Value eg_value&#40;Score s&#41; &#123; return Value&#40;(&#40;s & 0xffff&#41; << 16&#41; >> 16&#41;; &#125;
The first thing is to ensure gcc uses an integer for the enum and not a smaller type (and this is allowed by standard so is not a gcc bug), but the crucial point is the rewrite of the eg_value() function that now gives the correct result also with gcc -O3 (and I have verified both Intel and MSVC compile also this version correctly).

So the problem was this one:

Code: Select all

inline Value eg_value&#40;Score s&#41; &#123; return Value&#40;int16_t&#40;s & 0xffff&#41;); &#125;
Function eg_value() extracts the signed lower 16 bits from the 32 bit enum 's'. This function got miscompiled under gcc -O3 with some recent gcc version.


If someone could test with a 64 bit compile (I don't have a 64 bit machine), I would be very glad so that I can at last release 1.6.2 with all the stuff fixed, included a rewrite of pop_1st_bit() to avoid the reported aliasing issue.
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: gcc miscompilation in Stockfish 1.6

Post by Gian-Carlo Pascutto »

mcostalba wrote:
Bo Persson wrote: In my experience you find a genuine code generation bug about once every 10 years. This might very well be your turn, but I wouldn't bet on it.
May be it is my turn ;-)

I think I have found the miscompiled line, with following version it works for me:

Code: Select all

/// Score enum keeps a midgame and an endgame value in a single
/// integer &#40;enum&#41;, first LSB 16 bits are used to store endgame
/// value, while upper bits are used for midgame value.

enum Score &#123; ENSURE_32_BIT_SIZE = 1 << 31 &#125;;

inline Value eg_value&#40;Score s&#41; &#123; return Value&#40;(&#40;s & 0xffff&#41; << 16&#41; >> 16&#41;; &#125;
The first thing is to ensure gcc uses an integer for the enum and not a smaller type (and this is allowed by standard so is not a gcc bug), but the crucial point is the rewrite of the eg_value() function that now gives the correct result also with gcc -O3 (and I have verified both Intel and MSVC compile also this version correctly).
Note that if enum is (a signed) int, and the Scores can be negative (which I think they can for you), then the result of that >> 16 is entirely implementation defined. So the above code is still buggy, not the compiler.