gcc miscompilation in Stockfish 1.6

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

gcc miscompilation in Stockfish 1.6

Post by mcostalba »

I spent a good amount of today trying to hunt down a nasty optimization bug of gcc that miscompiles value.h with -O3 and with some version even with -O2, instead -O1 seems to work.

In particular one of the miscompiled functions (but not the only one) is:

Code: Select all

inline void operator+=(Score& s1, Score s2) { s1 = Score(int(s1) + int(s2)); }

I really don't have any clue why on Intel and MSVC it works why gcc miscompiles. I have found it because adding debug info around the assignment instruction changes something and gcc starts to compile it correctly !!! :shock:

Unfortunately this was not the only one, so that instead of trying to find a workaround I opted for setting optimization under gcc to -O1 (the official builds are not affected because are done with Intel and MSVC).

This is of course a quick fix, but I really would like to understand what makes that (simple) assignment to be miscompiled.

Anyone has some ideas ?


Thanks
Marco
bnemias
Posts: 373
Joined: Thu Aug 14, 2008 3:21 am
Location: Albuquerque, NM

Re: gcc miscompilation in Stockfish 1.6

Post by bnemias »

I don't have any ideas since I just noticed the problem today-- I had been using builds on MIPS without issue even at -O3 with gcc 4.3.3.

But when I tested x86 builds using GCC 4.4.1, I noticed the problem. In my case, it was a segfault after it hits depth 11. Reducing the optimizations to O1 did solve it, so I assume what I saw as a segfault was related to valuation issues.

Another problem:

Code: Select all

g++ -O3 -msse -DNDEBUG -g -Wall -fno-exceptions -fno-rtti   -c -o bitboard.o bitboard.cpp
bitboard.cpp&#58; In function ‘void<unnamed>&#58;&#58;init_sliding_attacks&#40;Bitboard*, int*, Bitboard*, const int*, const Bitboard*, int (*)&#91;2&#93;)’&#58;
bitboard.cpp&#58;359&#58; warning&#58; dereferencing pointer ‘u’ does break strict-aliasing rules
bitboard.cpp&#58;363&#58; warning&#58; dereferencing pointer ‘u’ does break strict-aliasing rules
bitboard.cpp&#58;354&#58; note&#58; initialized from here
That's in Square pop_1st_bit(Bitboard* bb) that we discussed via email, trying to get rid of the -fno-strict-aliasing.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

bnemias wrote:I don't have any ideas since I just noticed the problem today-- I had been using builds on MIPS without issue even at -O3 with gcc 4.3.3.
I also know that 4.2.3 works fine too, it is the 4.4.X series that shows the problem, but I don't think it is related to -fno-strict-aliasing becasue I have already tested with this option and result is wrong the same.

The problem is the miscompilation of the operators of the enum Score in value.h I am quite sure of it because the revision before the conversion to enum Score works.
User avatar
Bo Persson
Posts: 243
Joined: Sat Mar 11, 2006 8:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Re: gcc miscompilation in Stockfish 1.6

Post by Bo Persson »

mcostalba wrote:I spent a good amount of today trying to hunt down a nasty optimization bug of gcc that miscompiles value.h with -O3 and with some version even with -O2, instead -O1 seems to work.

In particular one of the miscompiled functions (but not the only one) is:

Code: Select all

inline void operator+=&#40;Score& s1, Score s2&#41; &#123; s1 = Score&#40;int&#40;s1&#41; + int&#40;s2&#41;); &#125;

I really don't have any clue why on Intel and MSVC it works why gcc miscompiles. I have found it because adding debug info around the assignment instruction changes something and gcc starts to compile it correctly !!! :shock:

Unfortunately this was not the only one, so that instead of trying to find a workaround I opted for setting optimization under gcc to -O1 (the official builds are not affected because are done with Intel and MSVC).

This is of course a quick fix, but I really would like to understand what makes that (simple) assignment to be miscompiled.

Anyone has some ideas ?


Thanks
Marco
I don't think this is legal code:

enum Score {};

The compiler is allowed to chose the size of an enum variable based on the values it is expected to store. Here you have given it no values!

For example, this type

enum x { zero, one };

requires a minimum of one bit to store the values. Some compilers will use a single byte for the storage.

On the other hand, this type will use at least 16 bits

enum y { value = 32767 };
User avatar
Bo Persson
Posts: 243
Joined: Sat Mar 11, 2006 8:31 am
Location: Malmö, Sweden
Full name: Bo Persson

Re: gcc miscompilation in Stockfish 1.6

Post by Bo Persson »

bnemias wrote:I don't have any ideas since I just noticed the problem today-- I had been using builds on MIPS without issue even at -O3 with gcc 4.3.3.

But when I tested x86 builds using GCC 4.4.1, I noticed the problem. In my case, it was a segfault after it hits depth 11. Reducing the optimizations to O1 did solve it, so I assume what I saw as a segfault was related to valuation issues.

Another problem:

Code: Select all

g++ -O3 -msse -DNDEBUG -g -Wall -fno-exceptions -fno-rtti   -c -o bitboard.o bitboard.cpp
bitboard.cpp&#58; In function ‘void<unnamed>&#58;&#58;init_sliding_attacks&#40;Bitboard*, int*, Bitboard*, const int*, const Bitboard*, int (*)&#91;2&#93;)’&#58;
bitboard.cpp&#58;359&#58; warning&#58; dereferencing pointer ‘u’ does break strict-aliasing rules
bitboard.cpp&#58;363&#58; warning&#58; dereferencing pointer ‘u’ does break strict-aliasing rules
bitboard.cpp&#58;354&#58; note&#58; initialized from here
That's in Square pop_1st_bit(Bitboard* bb) that we discussed via email, trying to get rid of the -fno-strict-aliasing.
The latest versions of g++ is flagging violations of the language!

Code: Select all

Square pop_1st_bit&#40;Bitboard* bb&#41; &#123;

   b_union* u;
   Square ret;

   u = &#40;b_union*&#41;bb;   <--- Not allowed!

Casting a pointer type to an unrelated pointer type IS a violation of the strict aliasing rules. The compiler is allowed to assume that a variable is never modified through a pointer to another type.

The gcc documentation recommends storing the value in the union and accessing it as the other type there. While still strictly against the language rules, this is a documented extension supported by most compilers.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc miscompilation in Stockfish 1.6

Post by Sven »

Bo Persson wrote:
mcostalba wrote:In particular one of the miscompiled functions (but not the only one) is:

Code: Select all

inline void operator+=&#40;Score& s1, Score s2&#41; &#123; s1 = Score&#40;int&#40;s1&#41; + int&#40;s2&#41;); &#125;
I don't think this is legal code:

enum Score {};

The compiler is allowed to chose the size of an enum variable based on the values it is expected to store. Here you have given it no values!
I can only second that, also from the viewpoint of "clean programming". An enumeration type should have distinct values only, e.g. White, Black, Empty or NoPiece, Pawn, Knight, ..., King. A "score" is something very different unless you enumerate *all* possible values, so it should have a simple definition like "typedef int Score". Attempts to be more clever than that tend to fail at some point, according to my experience at least.

EDIT: With "typedef int Score", user-defined operators for that type like "operator+=" would be redundant.

Sven
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

Sven Schüle wrote:
Bo Persson wrote:
mcostalba wrote:In particular one of the miscompiled functions (but not the only one) is:

Code: Select all

inline void operator+=&#40;Score& s1, Score s2&#41; &#123; s1 = Score&#40;int&#40;s1&#41; + int&#40;s2&#41;); &#125;
I don't think this is legal code:

enum Score {};

The compiler is allowed to chose the size of an enum variable based on the values it is expected to store. Here you have given it no values!
I can only second that, also from the viewpoint of "clean programming". An enumeration type should have distinct values only, e.g. White, Black, Empty or NoPiece, Pawn, Knight, ..., King. A "score" is something very different unless you enumerate *all* possible values, so it should have a simple definition like "typedef int Score". Attempts to be more clever than that tend to fail at some point, according to my experience at least.

EDIT: With "typedef int Score", user-defined operators for that type like "operator+=" would be redundant.

Sven
yes typedef of int could be a very good idea, but I would like to intercept divsion by an integer value:

Code: Select all

// Division must be handled separately for each term
inline Score operator/&#40;Score s, int i&#41; &#123; return make_score&#40;mg_value&#40;s&#41; / i, eg_value&#40;s&#41; / i&#41;; &#125;
I don't think I can do with a typedef .....
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

Bo Persson wrote: The gcc documentation recommends storing the value in the union and accessing it as the other type there. While still strictly against the language rules, this is a documented extension supported by most compilers.
Thanks !!!

what about this version ?

Code: Select all

// Use type-punning
union b_union &#123;

    Bitboard b;
    struct &#123;
#if defined &#40;BIGENDIAN&#41;
        uint32_t h;
        uint32_t l;
#else
        uint32_t l;
        uint32_t h;
#endif
    &#125; dw;
&#125;;

Square pop_1st_bit&#40;Bitboard* bb&#41; &#123;

   b_union u;
   Square ret;

   u.b = *bb;

   if &#40;u.dw.l&#41;
   &#123;
       ret = Square&#40;BitTable&#91;(&#40;u.dw.l ^ &#40;u.dw.l - 1&#41;) * 0x783a9b23&#41; >> 26&#93;);
       u.dw.l &= &#40;u.dw.l - 1&#41;;
       *bb = u.b;
       return ret;
   &#125;
   ret = Square&#40;BitTable&#91;((~&#40;u.dw.h ^ &#40;u.dw.h - 1&#41;)) * 0x783a9b23&#41; >> 26&#93;);
   u.dw.h &= &#40;u.dw.h - 1&#41;;
   *bb = u.b;
   return ret;
&#125;

I have already verified is functional identical.
mcostalba
Posts: 2684
Joined: Sat Jun 14, 2008 9:17 pm

Re: gcc miscompilation in Stockfish 1.6

Post by mcostalba »

Bo Persson wrote: On the other hand, this type will use at least 16 bits

enum y { value = 32767 };
I have tried with this

Code: Select all

/// Score enum keeps a midgame and an endgame value in a single
/// integer &#40;enum&#41;, first LSB 16 bits are used to store endgame
/// value, while upper bits are used for midgame value.

enum Score &#123; ENSURE_32_BITS = 1 << 31 &#125;;
And indeed the result is different, but still not the same as the intel compiled reference one :shock:
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: gcc miscompilation in Stockfish 1.6

Post by Sven »

mcostalba wrote:yes typedef of int could be a very good idea, but I would like to intercept divsion by an integer value:

Code: Select all

// Division must be handled separately for each term
inline Score operator/&#40;Score s, int i&#41; &#123; return make_score&#40;mg_value&#40;s&#41; / i, eg_value&#40;s&#41; / i&#41;; &#125;
I don't think I can do with a typedef .....
Don't bury this kind of special handling within a standard division operator. Use a normal function for it instead. Otherwise you are creating code which can be hard to understand for others. Special semantics should be visible to the reader if possible.

And you are also about to create weird effects somewhere else in the program. Suppose you have an expression "int s = some_value / 2;" and later on you decide to change the type of "some_value" to "Score", what do you expect to happen then? You probably won't notice it (maybe you get a compiler warning which annoys you and you add a cast to int) but the result will be surprising. Using operators can be nice sometimes but this is not a good example IMO.

Sven