For the Intel compiler experts

Discussion of chess software programming and technical issues.

Moderator: Ras

Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: For the Intel compiler experts

Post by Don »

rbarreira wrote:Using the Intel C Compiler is fine if your executables are for personal use or if you only intend it to run on Intel CPUs.

If you intend for the code to run on AMD CPUs, it will pretty much run like crap because the CPU is detected not to be Intel and is given a crappy codepath (or doesn't start up at all if one of the -x options is used).
When is the AMD CPU detected? Is it when the code is generated, or is it at runtime? If it's at runtime, are you saying that the compiler generates 2 sets of code for 1 program? One good program for Intel and one crappy one for AMD?
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: For the Intel compiler experts

Post by wgarvin »

Don wrote:
rbarreira wrote:Using the Intel C Compiler is fine if your executables are for personal use or if you only intend it to run on Intel CPUs.

If you intend for the code to run on AMD CPUs, it will pretty much run like crap because the CPU is detected not to be Intel and is given a crappy codepath (or doesn't start up at all if one of the -x options is used).
When is the AMD CPU detected? Is it when the code is generated, or is it at runtime? If it's at runtime, are you saying that the compiler generates 2 sets of code for 1 program? One good program for Intel and one crappy one for AMD?
Pretty much, yeah :)

I'm sure we've discussed this before, but I couldn't find the thread. The Intel compiler can generate a cpuid check for e.g. SSE3+ code: at runtime it tests whether your chip's vendor ID string is 'GenuineIntel', and if not, it uses a slower code path. These checks are also compiled into their standard libraries and various math libraries. From there, they made their way into benchmarking programs, many commercial software programs that use either Intel's compiler or Intel's libraries, etc. The whole idea is to make code run slower on competitors' CPUs.
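To make the vendor check concrete, here is a minimal sketch in C of dispatching on the CPUID vendor string. It is not Intel's actual dispatcher code. The names `cpu_vendor`, `pick_by_vendor`, `sum_baseline` and `sum_tuned` are hypothetical, and the sketch assumes GCC or Clang on x86, where `<cpuid.h>` provides `__get_cpuid`.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: GCC/Clang on x86, where <cpuid.h> provides __get_cpuid. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#define HAVE_CPUID 1
#else
#define HAVE_CPUID 0
#endif

/* Fill out[13] with the 12-character vendor string from CPUID leaf 0,
   or leave it empty if CPUID is not available on this build target. */
void cpu_vendor(char out[13]) {
    memset(out, 0, 13);
#if HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid(0, &eax, &ebx, &ecx, &edx)) {
        /* The vendor string is stored in EBX, EDX, ECX, in that order. */
        memcpy(out + 0, &ebx, 4);
        memcpy(out + 4, &edx, 4);
        memcpy(out + 8, &ecx, 4);
    }
#endif
}

typedef long (*sum_fn)(const int *, int);

/* Hypothetical "slow path": a plain loop. */
long sum_baseline(const int *v, int n) {
    long total = 0;
    for (int i = 0; i < n; i++)
        total += v[i];
    return total;
}

/* Hypothetical "fast path": two accumulators stand in for tuned code. */
long sum_tuned(const int *v, int n) {
    long a = 0, b = 0;
    int i = 0;
    for (; i + 1 < n; i += 2) {
        a += v[i];
        b += v[i + 1];
    }
    if (i < n)
        a += v[i];
    return a + b;
}

/* Vendor-based dispatch: the choice depends on who made the chip,
   not on which instruction sets the chip actually supports. */
sum_fn pick_by_vendor(const char *vendor) {
    assert(vendor != NULL);
    return strcmp(vendor, "GenuineIntel") == 0 ? sum_tuned : sum_baseline;
}
```

A caller would fill a 13-byte buffer with `cpu_vendor(buf)` and then call `pick_by_vendor(buf)(data, n)`. A dispatcher keyed on the CPUID feature bits instead of the vendor string would pick the fast routine on any chip that supports the required instructions.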

More info can be found at Agner Fog's site. [Edit: this page of his blog has a lot of wild stuff in it too.]

[Edit: the CPU dispatching is mostly an issue for SSE, SSE2, etc. If you don't use that stuff in your engine you might not notice any difference.]

[Edit: The CPU dispatching topic came up in this thread last October.]
Last edited by wgarvin on Sat Mar 12, 2011 12:49 am, edited 1 time in total.
jshriver
Posts: 1358
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: For the Intel compiler experts

Post by jshriver »

Mincho Georgiev wrote: Start with something simple for pgo, like:
Excuse my ignorance, but what is pgo? It was mentioned elsewhere as well.
-Josh

Edit: nvm, found it. Sounds neat.
http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: For the Intel compiler experts

Post by bob »

CThinker wrote:
bob wrote:
Dann Corbit wrote:I am going to buy the latest Intel C++ compiler, but I would like to know what tool set I need to purchase.

They have "Parallel Studio" and "C++ composer" and all sorts of different collections of things.

I already have the high-end MS compiler and so I have a nice IDE and debugger already. But I would like to be able to produce the fastest possible binaries and so I need to know what is the minimum tool set I need to purchase to accomplish this goal.

Besides performance, I am also sick and tired of having to convert C99 programs to C89 in order to be able to compile them.
You are not going to get far with MS debugger and Intel compiler. You need a debugger that is compatible with the compiler's output. I simply use the intel C++ compiler + debugger package on all of my linux boxes. And you can get that for nothing. For windows I am not sure, but you do want both the compiler and debugger. Which might mean your IDE will present some issues.
Quote from:
http://software.intel.com/en-us/article ... -visual-c/

Debugging Capability and the Intel Parallel Debugger Extension

The Intel C++ Compiler is fully source- and binary- (native code only) compatible with Visual C++ 2005, Visual C++ 2008 and Visual C++ 2010 compiler when the option “/Qvc8”, “/Qvc9” or "/Qvc10" is specified. Binaries built with the Intel C++ Compiler can be debugged from within the Microsoft Visual Studio IDE.

It’s possible to build only several files or build several projects with the Intel C++ Compiler.

Intel Parallel Debugger Extension is a plug-in to Visual Studio Debugger. It provides great functionalities for debugging parallel programs. Please see the Intel(R) Parallel Debugger Extension article for detail information.
Remember, as I said, I was speaking from a unix standpoint. Don't run windows at all...
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: For the Intel compiler experts

Post by bob »

Don wrote: I am experimenting with the Intel C compiler right now and I'm wondering if there is anything I need to know to get more out of it.

It produced a binary with no changes to my source code, but I'm not familiar with optimizations that might apply to the Intel compiler. I'm using the same GCC options I was using before. I do get a 2 percent speedup without even experimenting with options.
Download recent crafty, and look at stock Makefile. The "profile" target shows the options to do PGO, which works really well on that compiler, where GCC generally croaks on a parallel program when you try to profile. You end up with corrupted profile data and the compiler barfs on the re-compile. Intel's works flawlessly for me....
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: For the Intel compiler experts

Post by bob »

jshriver wrote:
Mincho Georgiev wrote: Start with something simple for pgo, like:
Excuse my ignorance, but what is pgo? It was mentioned elsewhere as well.
-Josh

Edit: nvm, found it. Sounds neat.
http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler
Profile-guided optimization. You run the program with some internal branch tests inserted. This determines whether a particular branch is usually taken or if it falls through. The compiler can then reorganize the code so that the common path is "sequential", which is cache friendly. Etc...
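As a rough picture of the "branch tests inserted" step, here is a hand-rolled sketch in C. It is only an illustration: a real PGO build has the compiler generate the counters and write them to profile files, and none of this is the Intel or GCC implementation. The names `branch_profile`, `profile_branch`, `count_rare` and `usually_taken` are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Per-branch counters, similar in spirit to what an instrumented
   build records for each conditional branch. */
struct branch_profile {
    unsigned long taken;
    unsigned long not_taken;
};

/* Record the outcome of a condition and return it unchanged,
   so the call can wrap the condition of an existing `if`. */
int profile_branch(struct branch_profile *p, int cond) {
    assert(p != NULL);
    if (cond)
        p->taken++;
    else
        p->not_taken++;
    return cond;
}

/* Hypothetical workload: count the multiples of 16 in [0, n).
   The branch is taken only once every 16 iterations. */
unsigned long count_rare(struct branch_profile *p, unsigned long n) {
    unsigned long hits = 0;
    for (unsigned long i = 0; i < n; i++) {
        if (profile_branch(p, i % 16 == 0))
            hits++;
    }
    return hits;
}

/* The decision the second compile can make from the counters:
   nonzero if the branch was taken more often than not. */
int usually_taken(const struct branch_profile *p) {
    assert(p != NULL);
    return p->taken > p->not_taken;
}
```

After a training run, `usually_taken` reports that this branch mostly falls through, which is the kind of fact the recompile uses to lay out the fall-through path as the sequential one.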
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: For the Intel compiler experts

Post by Don »

bob wrote:
Don wrote: I am experimenting with the Intel C compiler right now and I'm wondering if there is anything I need to know to get more out of it.

It produced a binary with no changes to my source code, but I'm not familiar with optimizations that might apply to the Intel compiler. I'm using the same GCC options I was using before. I do get a 2 percent speedup without even experimenting with options.
Download recent crafty, and look at stock Makefile. The "profile" target shows the options to do PGO, which works really well on that compiler, where GCC generally croaks on a parallel program when you try to profile. You end up with corrupted profile data and the compiler barfs on the re-compile. Intel's works flawlessly for me....
I just discovered to my horror that when I make a windows 64 bit executable using the mingw64 cross compiler, it's almost 10% slower than the Linux version. I can test this using my laptop which is dual boot for those times I absolutely must have windows.

There are GCC environments for Windows, I wonder if they produce better binaries than the cross compiler?

Anyway, I will take a look at your Makefile, thanks for the tip.

Don
Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 6:44 pm
Location: Bulgaria

Re: For the Intel compiler experts

Post by Mincho Georgiev »

Don wrote:
bob wrote:
Don wrote: I am experimenting with the Intel C compiler right now and I'm wondering if there is anything I need to know to get more out of it.

It produced a binary with no changes to my source code, but I'm not familiar with optimizations that might apply to the Intel compiler. I'm using the same GCC options I was using before. I do get a 2 percent speedup without even experimenting with options.
Download recent crafty, and look at stock Makefile. The "profile" target shows the options to do PGO, which works really well on that compiler, where GCC generally croaks on a parallel program when you try to profile. You end up with corrupted profile data and the compiler barfs on the re-compile. Intel's works flawlessly for me....
I just discovered to my horror that when I make a windows 64 bit executable using the mingw64 cross compiler, it's almost 10% slower than the Linux version. I can test this using my laptop which is dual boot for those times I absolutely must have windows.

There are GCC environments for Windows, I wonder if they produce better binaries than the cross compiler?

Anyway, I will take a look at your Makefile, thanks for the tip.

Don
I've never had a case (which doesn't mean I'm 100% correct) where GCC produced faster code for Intel CPUs than the Intel compiler,
not for any of the sources I've tried. I did a lot of experimenting in the past with a lot of source types and with every optimization flag available.
My opinion is simply that icl (versions 9 through 11) produces around 10-15% faster code than gcc, and I'm not talking about chess engines only.
True, sometimes weird things go on: http://software.intel.com/en-us/forums/ ... hp?t=76060
and I had some cases where overly aggressive optimization broke the correctness of the results because of wrong register preservation, but those were rare.

To Dann: right now I'm evaluating the Intel Composer XE 2011 package. I will describe it only for Windows, since I haven't tried it on Linux yet.
The package comes with icl version 12, VTune, Intel Threading Building Blocks, Intel Math Kernel Library, Integrated Performance Primitives, documentation and samples. Here are the release notes:
http://www.mediafire.com/?h0o82v0jm2i8lqw
To my surprise, a couple of optimization flags are marked deprecated, like the global optimization flag /Og.
A quick test shows that with the same flags and code, icl 12 produces a 1-2% slower executable than version 11,
but other flags might fix that; I haven't read the documentation yet.
P.S. I forgot to mention that the Intel manuals are also included (the basic architecture one and the A-Z opcode reference, as well as the SSE4 reference and the full profiling doc set).
Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 6:44 pm
Location: Bulgaria

Re: For the Intel compiler experts

Post by Mincho Georgiev »

jshriver wrote:
Mincho Georgiev wrote: Start with something simple for pgo, like:
Excuse my ignorance, but what is pgo? It was mentioned elsewhere as well.
-Josh

Edit: nvm, found it. Sounds neat.
http://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler
Bob already pointed it out.
Some more information:

http://software.intel.com/sites/product ... go_ovw.htm

http://software.intel.com/sites/product ... o_bsic.htm
Mincho Georgiev
Posts: 454
Joined: Sat Apr 04, 2009 6:44 pm
Location: Bulgaria

Re: For the Intel compiler experts

Post by Mincho Georgiev »

The command line parameters:
http://www.mediafire.com/?rv4k4aax4xkt74t