Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 6:40 pm
by Ras
Dann Corbit wrote:My recommendation is to use both GCC and CLANG with warnings turned up to crazy maximum and examine each and every one. (Expect thousands).
A quick check with CppCheck didn't show much besides scoping, but I'm not sure whether reduced scopes actually help speed. Maybe.

GCC under Linux offers tons of "sanitizer" options with checks during runtime. Slow of course, but good for spotting errors.

Coverity Scan is a nice option because it's free for Open Source projects; I'm also using it.

And btw., the makefile already has "-Wall -Wextra -Wshadow", so the warnings are already cranked up.

However, there is one thing I'm not too sure about. There is no clear compiler directive for how to deal with pointer aliasing, and GCC changed that with 4.9 towards (ab)using it for optimisation. It's very easy to run into this by accident, and unless benchmarking shows a clear gain, I'd always use "-fno-strict-aliasing" for release builds, and while we're at it, also "-fno-strict-overflow".
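
To illustrate the aliasing concern, here is a minimal sketch of type punning that stays well-defined regardless of "-fstrict-aliasing". The function name float_bits is hypothetical and not taken from the engine under discussion; the sketch assumes a 32-bit IEEE 754 float.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The pattern that strict aliasing makes undefined looks like this:
 *
 *     uint32_t u = *(uint32_t *)&f;   // reads a float through a uint32_t lvalue
 *
 * The compiler may assume a float object and a uint32_t lvalue never
 * refer to the same memory, so the read can be reordered or dropped.
 */

/* Well-defined alternative: copy the object representation with memcpy.
 * Compilers typically turn this into a single register move. */
uint32_t float_bits(float f) {
    uint32_t u;
    assert(sizeof u == sizeof f);   /* assumption: float is 32 bits wide */
    memcpy(&u, &f, sizeof u);
    return u;
}
```

Code written this way should not need "-fno-strict-aliasing" for correctness, although the flag is still a reasonable safety net for a code base that has not been audited.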

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 6:48 pm
by abulmo2
AndrewGrant wrote:Final question. GCC or G++?

I tried doing a PGO build, and actually got a different bench... which really confuses me.
I tried playing with your code and made the same observation. Since gcc with the -Ofast flag and the icc compiler give different bench numbers from gcc -O3, I wonder whether there are rounding errors after some floating-point computations and conversions to integer values.

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 6:52 pm
by AndrewGrant
I do not believe I make use of any floating point values, aside from when the Tuner is run.

I use some doubles when dealing with time on the clock, but those values are ignored when the benchmark is run.

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 6:58 pm
by Ras
abulmo2 wrote:gcc -O3
That's another thing: O3 is not advised as it can well make the program slower. O2 is the usual optimisation level that doesn't backfire.

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 6:58 pm
by abulmo2
icc emits the following warnings for example:

Code:

search.c(135): warning #2259: non-pointer conversion from "double" to "int" may lose significant bits
          margin =             1.6 * (abs(values[depth - 1] - values[depth - 2]));
                 ^

search.c(136): warning #2259: non-pointer conversion from "double" to "int" may lose significant bits
          margin = MAX(margin, 2.0 * (abs(values[depth - 2] - values[depth - 3])));
                 ^

search.c(137): warning #2259: non-pointer conversion from "double" to "int" may lose significant bits
          margin = MAX(margin, 0.8 * (abs(values[depth - 3] - values[depth - 4])));
                 ^

search.c(274): warning #2259: non-pointer conversion from "double" to "int" may lose significant bits
          futilityMargin = eval + depth * 0.95 * PieceValues[PAWN][EG];
                         ^

search.c(302): warning #2259: non-pointer conversion from "double" to "int" may lose significant bits
          value = eval - depth * 0.95 * PieceValues[PAWN][EG];
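
A minimal sketch of two ways such a warning could be addressed, assuming each fractional factor can be written as an integer ratio (1.6 as 16/10, 0.95 as 95/100). The helper names scale_int and scale_double are hypothetical and do not appear in search.c.

```c
#include <assert.h>

/* Scale value by num/den using integer arithmetic only. The result
 * truncates toward zero and does not depend on floating-point mode or
 * on flags such as -Ofast (which enables -ffast-math).
 * Caveat: value * num must not overflow int. */
int scale_int(int value, int num, int den) {
    assert(den != 0);
    return value * num / den;
}

/* Keep the floating-point factor but make the narrowing conversion
 * explicit. This silences the warning; the fractional part is still
 * discarded, exactly as in the implicit conversion. */
int scale_double(int value, double factor) {
    return (int)(factor * value);
}
```

With these helpers, the first flagged line could read `margin = scale_int(abs(values[depth - 1] - values[depth - 2]), 16, 10);`. Whether rounding is actually behind the differing bench numbers is not established here.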

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 7:01 pm
by AndrewGrant
Thank you for this. I'll clean those up and see what happens.

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 7:03 pm
by abulmo2
Ras wrote:That's another thing: O3 is not advised as it can well make the program slower. O2 is the usual optimisation level that doesn't backfire.
I have always seen the opposite, i.e. a faster program with -O3. It seems to be the case with Ethereal too, at least on my computer (an old Sandy Bridge, with gcc 6.3 and 7.2 under Linux Fedora 26 / 27).

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 7:21 pm
by syzygy
abulmo2 wrote:
Ras wrote:That's another thing: O3 is not advised as it can well make the program slower. O2 is the usual optimisation level that doesn't backfire.
I have always seen the opposite, i.e. a faster program with -O3. It seems to be the case with Ethereal too, at least on my computer (an old Sandy Bridge, with gcc 6.3 and 7.2 under Linux Fedora 26 / 27).
Yes, and it would be rather strange if the gcc developers let -O3 enable known "de-optimisations".

Of course any particular optimisation can backfire for a particular program. Experimenting with different optimisation options will not hurt.

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 7:25 pm
by AndrewGrant
Upon cleaning those up...

I get the regular bench with the final PGO build, but during the profile-generate stage I get a different bench.

Currently installing gcc7.2 from source on this machine so I can use -fsanitize=undefined...

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Posted: Tue Nov 28, 2017 7:29 pm
by syzygy
AndrewGrant wrote:I tried doing a PGO build, and actually got a different bench... which really confuses me.
Compile with -fsanitize=undefined:

Code:

search.c:386:47: runtime error: index -1 out of bounds for type 'int [9]'
search.c:386:

Code:

            &&  quiets > LateMovePruningCounts[depth]
Apparently depth can be -1 in this line.
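
One possible guard, as a sketch only: check the index before the table lookup so that an out-of-range depth never reaches the array. The table size of 9 comes from the sanitizer message; the table contents, the names lmp_counts and late_move_prune, and the choice to simply not prune for out-of-range depths are assumptions made for illustration, not the engine's actual fix.

```c
#include <assert.h>

#define LMP_TABLE_SIZE 9

/* Placeholder values, not the engine's real pruning counts. */
static const int lmp_counts[LMP_TABLE_SIZE] = {0, 3, 5, 8, 12, 17, 23, 30, 38};

/* Returns 1 if the late-move-pruning condition holds, 0 otherwise.
 * The range checks come first, so short-circuit evaluation ensures
 * lmp_counts[depth] is only read for 0 <= depth < LMP_TABLE_SIZE. */
int late_move_prune(int depth, int quiets) {
    assert(quiets >= 0);
    return depth >= 0
        && depth < LMP_TABLE_SIZE
        && quiets > lmp_counts[depth];
}
```

An out-of-bounds read like the one reported returns whatever happens to sit next to the array, which depends on how the compiler lays out memory. That could plausibly explain a bench that changes between compilers, optimisation levels, and the profile-generate build.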