gcc4.8 outperforming gcc5, gcc6, gcc7

syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by syzygy »

D Sceviour wrote:
syzygy wrote:That higher optimisation levels can turn "minor bugs" into crashes is not a reason not to use optimisation. Such "minor bugs" simply need to be fixed. If use of an uninitialised variable does not crash the program, it will likely make it produce incorrect results (which can be almost impossible to notice in a chess engine, except for a measurable decrease in playing strength).
The gcc <variable> "may be used uninitialized" gives unpredictable bogus results. I visually inspect each element and then ignore the warnings if there is nothing wrong. Of course, something else may be triggering the warning.
Yes, gcc sometimes produces bogus warnings (the most annoying being array out of bounds where no array is accessed out of bounds), but where it is right, the proper solution is to fix it and not to accept it as a "minor bug" and compile without optimisation hoping there is no harm ;-)

If there is a bug, a crash is the best thing that can happen, since it ensures that the bug is detected. The problem with some of these bugs is that the crash (not the bug) goes away when compiling in debug mode, which makes them harder to locate. But nowadays it is pretty easy to locate them using the -fsanitize options (for example -fsanitize=address or -fsanitize=undefined).
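To make the kind of "minor bug" under discussion concrete, here is a minimal sketch in C. The function name best_score and its arguments are hypothetical and are not taken from any engine in this thread. The comment shows the buggy variant; the code itself is the fixed version.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: returns the largest value in scores[0..n-1]. */
int best_score(const int *scores, size_t n)
{
    assert(scores != NULL && n > 0);

    /* Buggy variant: "int best;" with no initialiser. Reading it in the
       comparison below is undefined behaviour. It may appear to work in
       a debug build and return garbage (or crash) at -O2/-O3. */
    int best = INT_MIN;

    for (size_t i = 0; i < n; i++) {
        if (scores[i] > best)
            best = scores[i];
    }
    return best;
}
```

A build such as gcc -O1 -g -fsanitize=address,undefined catches many memory and undefined-behaviour errors at run time. As far as I know, reads of uninitialised memory specifically are not reported by those two sanitizers; that needs clang's -fsanitize=memory or a run under valgrind.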
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by AndrewGrant »

Code: Select all

// Using Linked makefile
BENCH DEPTH |  COMP  | NPS
     20     | GCC4.8 | 3592466
     20     | GCC5.4 | 3367175
     20     | GCC6.4 | 3366372
     20     | GCC7.2 | 3246964
     
// Using Linked makefile plus -flto
BENCH DEPTH |  COMP  | NPS
     20     | GCC4.8 | 3611056
     20     | GCC5.4 | 3498027
     20     | GCC6.4 | 3550826
     20     | GCC7.2 | 3323299
     
GCC4.8 with    -flto BENCH 22 NPS=3553729

GCC4.8 without -flto BENCH 22 NPS=3587155
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
Ras
Posts: 2487
Joined: Tue Aug 30, 2016 8:19 pm
Full name: Rasmus Althoff

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by Ras »

syzygy wrote:He posted a link to the Makefile (he compiles with -O3).
It is well known that -O3 can produce slower binaries than -O2. The results should also be compared with an -O2 build.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by Dann Corbit »

I definitely get best performance with gcc 7.2. It beats all my other compilers handily.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by AndrewGrant »

What OS?
What CPU?
What Flags?
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by Dann Corbit »

AndrewGrant wrote:What OS?
Windows 10, Windows 2012 Server

What CPU?
Intel and AMD

What Flags?
-O3, PGO and the typical flag set. Something I always add that I rarely see others use is:

Code: Select all

	ifeq ($(comp),mingw)
		CXXFLAGS += -mtune=native
	endif
and I always link statically in case someone wants a copy of the binary.
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by AndrewGrant »

Final question. GCC or G++?

I tried doing a PGO build, and actually got a different bench... which really confuses me.
Dann Corbit
Posts: 12538
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by Dann Corbit »

AndrewGrant wrote:Final question. GCC or G++?
Both.

I tried doing a PGO build, and actually got a different bench... which really confuses me.
I guess that you have undefined behavior in your code.
My recommendation is to use both GCC and Clang with warnings turned up to the maximum and examine each and every one. (Expect thousands.)

Now, I assume by bench you mean something that should be reproducible, like perft or perhaps a single-threaded search. I would not expect a multi-threaded search to give the same result when repeated, even on the same machine and binary. It is possible for a somewhat different single-threaded search bench to be correct. What I mean is that the generated code is slightly different, with things like inlining instead of function calls. Most of the time these changes make no difference. If you have floating point anywhere in your program, that can do all kinds of wacky things. For instance, the total of a long column of floating-point numbers which vary greatly in size will be different if you sum them forwards or backwards, and different again if you sort them first. Even Kahan summation won't completely fix that sort of thing. It only reduces the effect.
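The summation-order effect is easy to reproduce. Below is a minimal sketch in C; the three function names are made up for illustration. It assumes IEEE-754 single precision evaluated without excess precision (the usual case on x86-64 and ARM64) and a build without -ffast-math, since that flag lets the compiler optimise the Kahan correction away.

```c
#include <assert.h>
#include <stddef.h>

/* Naive sum, first element to last. */
float sum_forward(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum;
}

/* Naive sum, last element to first. */
float sum_backward(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    float sum = 0.0f;
    for (size_t i = n; i > 0; i--)
        sum += v[i - 1];
    return sum;
}

/* Kahan (compensated) summation: c carries the low-order bits that
   were lost when the previous term was added to sum. */
float sum_kahan(const float *v, size_t n)
{
    assert(v != NULL || n == 0);
    float sum = 0.0f;
    float c = 0.0f;
    for (size_t i = 0; i < n; i++) {
        float y = v[i] - c;   /* apply the pending correction    */
        float t = sum + y;    /* low bits of y may be lost here  */
        c = (t - sum) - y;    /* recover what was lost           */
        sum = t;
    }
    return sum;
}
```

With an array holding 1.0f followed by 100000 copies of 1e-8f, the forward sum stays at 1.0f because each small term is below half an ulp of 1.0f and is rounded away. The backward sum and the Kahan sum both come out near 1.001. Same data and same arithmetic, but a different order gives a different total, which is all it takes for an evaluation or time-management calculation to diverge between builds.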


What exactly does your bench do?
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: gcc4.8 outperforming gcc5, gcc6, gcc7

Post by AndrewGrant »

I already do gcc -Wall -Wextra -Wshadow, but I'll look for more flags. I currently get no warnings when I compile.

I have a bench that works the same way Stockfish's does: a single-threaded depth 13 search on a set of positions. I have no problem reproducing the bench on any non-PGO compile, across the 7+ machines I've run it on.

I'll see what I can do tonight, when I get back to my computer with clang.