GCC 8.1 vs GCC 10.1

Re: GCC 8.1 vs GCC 10.1

Post by lucasart » Sat Jul 04, 2020 10:07 am

Ras wrote:
Sat Jul 04, 2020 7:04 am
lucasart wrote:
Sat Jul 04, 2020 12:38 am
Compilation (same for both GCC and Clang):
-O3 isn't always the fastest. What if you try -O2 instead?
I've tried. -O3 is faster, but not by much. Perhaps I should spill the guts of the set "-O3 minus -O2" and try each optimisation in that set one by one, to see which ones are useful. It could be a mixed bag of good and bad ones.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
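
The "-O3 minus -O2" set mentioned above can be listed mechanically, since GCC prints its per-level optimizer state with `-Q --help=optimizers`. What follows is a minimal sketch, not anyone's actual workflow from this thread: the function names are made up for illustration, and it only catches on/off flags, so differences in valued options or `--param` settings between the two levels are not captured.

```python
import subprocess


def parse_enabled(help_text):
    """Return the set of -f flags marked [enabled] in `gcc -Q --help=optimizers` output."""
    enabled = set()
    for line in help_text.splitlines():
        parts = line.split()
        # Lines of interest look like: "  -fpeel-loops   [enabled]"
        if len(parts) >= 2 and parts[0].startswith("-f") and parts[-1] == "[enabled]":
            enabled.add(parts[0])
    return enabled


def query_gcc(level, gcc="gcc"):
    """Ask the local compiler which optimizers a level enables (needs gcc on PATH)."""
    result = subprocess.run(
        [gcc, level, "-Q", "--help=optimizers"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def only_in_o3(o2_text, o3_text):
    """Flags enabled at -O3 but not at -O2, sorted for stable output."""
    return sorted(parse_enabled(o3_text) - parse_enabled(o2_text))
```

On a machine with GCC installed, `only_in_o3(query_gcc("-O2"), query_gcc("-O3"))` should give the candidate list; each entry can then be tested as `-O2 -f<name>` or as `-O3 -fno-<name>`. The list depends on the compiler version, so it is worth regenerating for each GCC release being compared.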

Re: GCC 8.1 vs GCC 10.1

Post by syzygy » Sat Jul 04, 2020 11:37 am

lucasart wrote:
Sat Jul 04, 2020 10:07 am
Ras wrote:
Sat Jul 04, 2020 7:04 am
lucasart wrote:
Sat Jul 04, 2020 12:38 am
Compilation (same for both GCC and Clang):
-O3 isn't always the fastest. What if you try -O2 instead?
I've tried. -O3 is faster, but not by much. Perhaps I should spill the guts of the set "-O3 minus -O2" and try each optimisation in that set one by one, to see which ones are useful. It could be a mixed bag of good and bad ones.
For Cfish on my old i7-3930K (Sandy Bridge) and using gcc-7.x, I arrived at:

-O3 -fira-loop-pressure -fconserve-stack -fmodulo-sched -fmodulo-sched-allow-regmoves -fsched-pressure -flimit-function-alignment -fno-tree-pre
This still does considerably better than -O3 on my Kaby Lake laptop with gcc-10.x. But I should probably redo the exercise some time.
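
One way to redo the exercise on a newer compiler is a leave-one-out pass over the flag set above: build once with everything, then once per flag with that flag dropped, and benchmark each binary. The sketch below only generates the flag combinations. The helper names are hypothetical, and how the flags reach the build (for example a `CFLAGS` variable) and how speed is measured are left as assumptions.

```python
BASE = ["-O3"]
EXTRAS = [
    "-fira-loop-pressure",
    "-fconserve-stack",
    "-fmodulo-sched",
    "-fmodulo-sched-allow-regmoves",
    "-fsched-pressure",
    "-flimit-function-alignment",
    "-fno-tree-pre",
]


def leave_one_out(base, extras):
    """Yield (dropped_flag, flags) pairs; dropped_flag is None for the full set."""
    yield None, base + extras
    for dropped in extras:
        yield dropped, base + [flag for flag in extras if flag != dropped]


def cflags_string(flags):
    """Join a flag list into a single string suitable for a build variable."""
    return " ".join(flags)
```

Iterating over `leave_one_out(BASE, EXTRAS)` gives eight builds. A flag whose removal makes the binary faster is a candidate to drop. Some flags interact: the GCC manual describes -fmodulo-sched-allow-regmoves as effective only together with -fmodulo-sched, so the variant that drops -fmodulo-sched alone effectively removes both.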

For some reason LTO does not help for Cfish at all, at least not on Intel. I suspect mainly because there is not too much left to optimise between compilation units. The LTO build is about 10k smaller and I would like to know why (because that should help explain why the LTO build is slower), but there doesn't seem to be a way to get the assembly output for the LTO binary. (I could use a disassembler but I won't.)
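
Short of reading assembly, a rough way to see where the 10k went is to compare per-symbol sizes of the two binaries as reported by `nm -S --size-sort` from binutils. This is only a sketch under assumptions: both binaries must be unstripped, the function names are invented for illustration, and LTO builds may rename local symbols (suffixes added by the compiler), which would show up here as one symbol disappearing and another appearing.

```python
def parse_nm_sizes(nm_text):
    """Map symbol name -> size in bytes from `nm -S --size-sort` output."""
    sizes = {}
    for line in nm_text.splitlines():
        parts = line.split()
        # Expected columns: address, size (hex), symbol type, name
        if len(parts) != 4:
            continue
        _addr, size_hex, _kind, name = parts
        sizes[name] = int(size_hex, 16)
    return sizes


def size_deltas(plain_text, lto_text):
    """Return (name, plain_size, lto_size) for symbols whose size differs,
    largest absolute change first. A size of 0 means the symbol is absent."""
    plain = parse_nm_sizes(plain_text)
    lto = parse_nm_sizes(lto_text)
    rows = []
    for name in sorted(set(plain) | set(lto)):
        before, after = plain.get(name, 0), lto.get(name, 0)
        if before != after:
            rows.append((name, before, after))
    rows.sort(key=lambda row: abs(row[1] - row[2]), reverse=True)
    return rows
```

Feeding it the captured output of `nm -S --size-sort` for the non-LTO and LTO binaries lists the functions that grew, shrank, or vanished. A function that vanishes while its callers grow was most likely inlined across compilation units, which is one plausible source of both the size change and a change in speed.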
