bob wrote:
It has always worked for me. What you are "learning" is how often each branch is taken or not taken. Then you make the most common case the "fall-through" case, and the less common case turns into a jump to the proper code. This makes cache usage a bit better because you avoid fetching stuff you are going to skip over most of the time... I usually see a 10-15% improvement. If you see none, or things even get worse, that suggests your "training" is wrong. I use 25 or so positions, searched for 10+ seconds per position, and I carefully chose opening, middlegame and endgame positions, and even one heavy-hitter EGTB position to train it on EGTB execution...

ldesnogu wrote:
Branch prediction was exactly what made me wonder about the applicability of PGO: provide an input set where a given branch does not exhibit the properties of the real workload and you could end up with a program running slower.

Picking the test data is much more important than any options you fiddle with inside the PGO option set. If the test data isn't representative, the PGO results will be poor. And note that PGO optimizes across the whole set of positions, but some of the changes are likely orthogonal. For example, opening positions will jump one way on passed-pawn tests, while endgames will jump the opposite way. Which one is the right way? It has to be the most common case, and in my way of thinking opening/middlegame has to be most important, because otherwise you reach the endgame phase already lost...
Thanks for explaining how you do your training.
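A rough sketch of that kind of training run, with hypothetical file names and a stand-in shell function in place of the instrumented engine binary (a real run would invoke the engine built with profiling instrumentation, searching each position for 10+ seconds):

```shell
# Hypothetical training set: a couple of EPD positions spanning game phases.
cat > train.epd <<'EOF'
rnbqkb1r/pp2pppp/2p2n2/3p4/2PP4/2N5/PP2PPPP/R1BQKBNR w KQkq - id "opening";
8/5pk1/6p1/8/5P2/6PK/8/8 w - - id "endgame";
EOF

# Stand-in for the instrumented engine; replace with the real binary,
# e.g. something like: ./engine st 10 "$pos"
engine() { echo "searching 10s: $3"; }

while IFS= read -r pos; do
    engine st 10 "$pos"
done < train.epd
```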
Tricks for Compiling Sources?
Moderators: hgm, Rebel, chrisw
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: Tricks for Compiling Sources?
Just to quibble with your terminology: PGO really isn't about "branch prediction" so much as about creating a "straight-line" path through the code for the most common execution case. That fills the cache with the most useful data, without fetching stuff you are going to skip over, to take maximum advantage of the 64-byte block fetches.

Kempelen wrote:
Jim,
Have you tried MinGW under Windows? What is its performance?
Thanks.

Jim Ablett wrote:
Hi Fermin,
Not so good under Windows. Compiles are around 20% slower than comparable Intel/MSVC ones.
Jim.

Dann Corbit wrote:
With the very latest version, I have found excellent performance. With PGO, it even outperformed Intel with PGO.

Kempelen wrote:
Does anybody know how PGO exactly works? I am ignorant in this field.

Dann Corbit wrote:
The idea of PGO is very simple. If we run the program, we can see what code paths are exercised and make better branch predictions from the statistics.
I am sure that there are lots of variables that go into this equation, of course.
For example:
if (in_check()) generate_evade_capture_or_blocks();
else generate_normal_moves();
Checks are less common than normal positions, so we can predict that it is better to fall through to generate_normal_moves(). But the compiler has no way to know this, so it cannot tell which path to favor. The solution is to instrument the program and run it: as the program runs, it captures statistics about what the code is really doing, and the optimizer can then make better choices.
Typically PGO has three phases:
1. Compile with instrumentation
2. Run the program over typical usage patterns (e.g. run 10 games, run some EPD endgame problems, run some EPD normal problems, run some EPD quiet positions -- at least that is what I do)
3. Recompile with the option to use the recorded profile data.
All of the major compilers have PGO now. That includes Microsoft, GCC, and Intel, as well as others.
Branch prediction itself is an internal CPU issue that the compiler can't do much about, other than to change a je to a jne and rearrange the code, if that is more efficient.