Michel wrote: Hard to say. In my experience, so-called gcc bugs are usually the result of subtly broken code.
That is why I am not 100% relaxed.
Did you compile with -Wall? Did you try -std=c99? Did you try the Intel compiler?
Did you try to run your program under valgrind?
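For reference, the checks suggested above look roughly like this (a sketch only; "engine.c" stands in for the actual source files, and the icc flags assume the Linux Intel compiler):

```shell
# Compile with warnings enabled and strict C99 mode
gcc -Wall -Wextra -std=c99 -O2 engine.c -o engine

# Same build with the Intel compiler, to cross-check the suspected gcc bug
icc -Wall -std=c99 -O2 engine.c -o engine

# Run under valgrind's memcheck tool to catch invalid reads/writes
# and use of uninitialized memory
valgrind --tool=memcheck ./engine
```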
Regards,
Michel
I have -Wall and -std=c99 and it compiles cleanly. I have asserts all over the place and they are quiet with this.
I have not tried another compiler yet to check this problem. I do not have the Intel compiler installed, but I think this will give me an excuse to finally download it. I will try to compile it on Windows, because the code is portable. I will also try valgrind. I know I should try all this, but my message was somewhat cathartic.
Still, I cannot imagine a reason for this behavior. Why would introducing a printf() in a specific spot cause a dozen or so more nodes to be counted in a 10-ply search? I have some sort of obsessive-compulsive disorder when it comes to bugs, and I can't stand it.
Miguel
(1) do you compile as one giant source file so that the compiler can do the necessary dependency analysis to detect most uninitialized data issues?
No, but I will try. I think there is a GCC switch that does that. -fwhole-program with -combine does something like that, right?
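For GCC of that era (4.1.x), the whole-program build mentioned above would look something like this (file names invented for illustration; note that -combine only works for C, and was removed in later GCC releases):

```shell
# -combine compiles all listed C files as a single translation unit,
# and -fwhole-program lets the optimizer assume nothing else links in,
# which helps cross-file analysis (including uninitialized-data checks).
gcc -O2 -Wall -std=c99 -combine -fwhole-program search.c eval.c main.c -o engine
```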
Thanks,
Miguel
(2) this can also be a nasty pointer issue as well, commonly caused by a bad array subscript, which compilers generally have no clue how to detect.
I don't use gcc enough to remember. I've been using Intel's compiler for years, since it is free for Linux and produces faster executables as well. And Intel's profile-guided optimization has always worked for me, even when profiling with multiple threads, whereas gcc often crashed and burned.
glorfindel wrote: It is a relatively old version of gcc that you are using. GCC 4.1 was released in February 2006 and the last bugfix release (4.1.2) in February 2007. Are you sure you can't easily install and try a newer version? According to http://packages.ubuntu.com/intrepid/gcc, 4.3.1 is available.
Apart from that, I have read in my linux distribution's manual that
# -O3: This is the highest level of optimization possible, and also the riskiest. It will take a longer time to compile your code with this option, and in fact it should not be used system-wide with gcc 4.x. The behavior of gcc has changed significantly since version 3.x. In 3.x, -O3 has been shown to lead to marginally faster execution times over -O2, but this is no longer the case with gcc 4.x. Compiling all your packages with -O3 will result in larger binaries that require more memory, and will significantly increase the odds of compilation failure or unexpected program behavior (including errors). The downsides outweigh the benefits; remember the principle of diminishing returns. Using -O3 is not recommended for gcc 4.x.
Optimization -O3 is described as "the riskiest". I am not sure if this is supposed to be because of bugs in gcc, or because the programmer must be very careful to write perfect code.
I just tested compiling with GCC 4.3.1 and the discrepancy disappeared (inlining vs. not inlining now give the same node count).
I will later try the Microsoft compiler on Windows and the Intel compiler on Linux, but it is possible that it was a bug in GCC 4.1.2.
I am not entirely happy, because I have now found another discrepancy, between compiling on 64-bit Ubuntu vs. 32-bit Ubuntu. This is more understandable, because sizeof(long int) is 8 in the 64-bit compile rather than 4 in a 32-bit compile. That may be screwing something up somewhere. At least, this difference is independent of the optimization flags, which means I may be able to track it down in DEBUG mode, dumping trees and comparing.
The good news is that the 64-bit OS gives me a ~60-70% speedup. At least that makes this whole ordeal worthwhile.
Ok, I found the problem and fixed it. I have one table that hashes some 32-bit information. It was declared "long int". Compiling with the 64-bit GCC, long int becomes 64 bits. That creates a bigger entry size and, as a consequence, fewer entries for the same memory. Having fewer entries caused a miss at one point in the tree, resulting in a different node count. It was not actually broken code; it was just unnecessarily hungry for memory.
Changing this to uint32_t gives the same node count for both the 32- and 64-bit compiles.
I have one less stone in my shoe. Thanks.
Miguel