bob wrote:
hgm wrote:
I don't think that reasoning is sound. I can build an engine that plays better Chess than I do. Why shouldn't I be able to write a compiler that makes better assembly than I do? Even if I were the world's number one expert. We have engines that beat the World champ Chess...
Here's the biggest win you get, from someone that has been programming in ASM forever...
The compiler has to follow the semantics of the language. It doesn't know anything about the semantics of the program itself, other than what it sees. But that is not much.
For example, if you do a switch() it must, by standard, do what the specs say. If the switch variable value does not match any of the cases, the compiler is required to test for that condition to avoid errors. In my code, I might well KNOW that the "piece" value will only be one of 13 possibilities, and I can exclude that "if piece < -6 or piece > +6" sanity test. There are lots of such things that one can exploit, but it requires knowledge about the program that can't readily be discerned just looking at the source code for one procedure. Other things might include min and max range of a value where the compiler only knows "int" or "long" but you know it is 17 bits max. So there is room to out-perform the compiler, but nowhere near what it was 20-30 years ago when hand-coded asm was the way to go everywhere.
Compilers are good at doing the classic optimizations, like common-subexpression elimination, loop-invariant detection and such. So I doubt a 10:1 is possible, UNLESS one intentionally (or unknowingly) writes C code that ties the compiler's hands. Bad C code can still produce bad asm code. But with care, one can write C code that produces pretty well optimized asm code. Unless you step outside the semantics of the language somewhere. For example, there is no standard way to get to the popcnt() instruction. Compilers do include intrinsics for such, but it is not something that is portable.
I've spent a lot of time looking at compiler output and am generally impressed. They can be improved on, with effort, and perhaps with a lot of pragma usage to tell the compiler things it can't spot for itself.
I appreciate that your knowledge of assembly is far better than mine, and probably better than that of almost everyone here.
However, the arguments you are making have little to do with Assembly vs C. For example, the switch argument. Let's say that your piece values range from 0 to 11. How about writing this instead?
Code:
if (piece >= 12) goto skip;
switch (piece) {...};
skip:
The fact that 0 <= piece <= 11 can help to optimize the code is of course not known by a C compiler, nor is it known by an assembler...
In C++, with a typed enum, you can let the compiler know the range of possible values.
And then the switch knows that the set of possible values for a "Piece" is relatively compact, so the compiler may decide to use a jump table, to make branching an O(1) operation.
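To make the range argument concrete, here is a minimal sketch in plain C of telling the compiler that the default case cannot happen. It assumes GCC or Clang, whose `__builtin_unreachable()` is an extension and not standard C. The `piece_value` function, the 0 to 11 encoding and the values are made up for illustration.

```c
/* Assumption: GCC/Clang extension. On other compilers the macro does
   nothing and the default case simply returns 0. */
#if defined(__GNUC__) || defined(__clang__)
#define ASSUME_UNREACHABLE() __builtin_unreachable()
#else
#define ASSUME_UNREACHABLE() ((void)0)
#endif

/* Hypothetical encoding: 0..5 are white P,N,B,R,Q,K and 6..11 the
   same pieces for black. */
static int piece_value(unsigned piece)
{
    switch (piece) {
    case 0: case 6:  return 100;  /* pawn */
    case 1: case 7:  return 300;  /* knight */
    case 2: case 8:  return 300;  /* bishop */
    case 3: case 9:  return 500;  /* rook */
    case 4: case 10: return 900;  /* queen */
    case 5: case 11: return 0;    /* king */
    default:
        /* Promise to the compiler: piece is always in 0..11. */
        ASSUME_UNREACHABLE();
        return 0;
    }
}
```

With the hint the compiler is free to drop the bounds check in front of a jump table. Whether it actually does depends on the compiler and flags, so it is worth looking at the generated asm. Calling this with an out-of-range piece is then undefined behavior, which is exactly the kind of program knowledge the programmer has to vouch for.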
Regarding the int/long, there are many integer types in C. It's all about choosing the right one, and knowing that sign extension is not a zero cost operation. If you use unsigned judiciously, there is some (tiny) performance gain there. But in reality, writing compact and cache friendly code is far more important than these microscopic optimizations.
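A small sketch of the kind of (tiny) gain meant here. It is not the sign-extension case itself, but a related signed versus unsigned effect: C requires signed division to round toward zero, so dividing a signed value by 8 cannot be a bare shift, while the unsigned version can. The function names are hypothetical.

```c
#include <stdint.h>

/* Signed: -9 / 8 must give -1 (rounding toward zero), so the compiler
   has to add a correction for negative values around the shift. */
static int32_t bucket_signed(int32_t x)
{
    return x / 8;
}

/* Unsigned: the value can never be negative, so x / 8 can be
   compiled to a single right shift. */
static uint32_t bucket_unsigned(uint32_t x)
{
    return x / 8;
}
```

Both give the same result for non-negative inputs. The difference is a couple of instructions at most, which fits the point that this is microscopic next to cache behavior.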
What I am trying to say is that you will not make your code faster by using assembly, except to use hardware features otherwise unavailable in C (in the case of chess, I think there are only 4 functions that deserve assembly or intrinsics: popcnt(), lsb(), msb(), prefetch()).
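For reference, a minimal sketch of how those four can be wrapped without writing any assembly. It assumes GCC or Clang builtins (`__builtin_popcountll`, `__builtin_ctzll`, `__builtin_clzll`, `__builtin_prefetch`), with a slow portable fallback for other compilers. The wrapper names simply follow the ones used above, and `lsb()`/`msb()` must not be called with an empty bitboard.

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
/* Assumption: GCC/Clang builtins, which map to hardware instructions
   when the target supports them (e.g. -mpopcnt on x86). */
static inline int popcnt(uint64_t b) { return __builtin_popcountll(b); }
static inline int lsb(uint64_t b)    { return __builtin_ctzll(b); }      /* b != 0 */
static inline int msb(uint64_t b)    { return 63 - __builtin_clzll(b); } /* b != 0 */
static inline void prefetch(const void *p) { __builtin_prefetch(p); }
#else
/* Portable fallbacks: correct but much slower. */
static inline int popcnt(uint64_t b)
{
    int n = 0;
    while (b) { b &= b - 1; n++; }  /* clear the lowest set bit */
    return n;
}
static inline int lsb(uint64_t b)   /* b != 0 */
{
    int i = 0;
    while (!(b & 1)) { b >>= 1; i++; }
    return i;
}
static inline int msb(uint64_t b)   /* b != 0 */
{
    int i = 0;
    while (b >>= 1) i++;
    return i;
}
static inline void prefetch(const void *p) { (void)p; }  /* no-op */
#endif
```

The rest of the engine only sees `popcnt()`, `lsb()`, `msb()` and `prefetch()`, so the non-portable part stays confined to one header.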
But you will make your program faster by writing *better C* code. And in order to write efficient C code, some knowledge of assembly, and how a compiler/optimizer works certainly helps.
You can see that the Linux kernel has a very small percentage of assembly code. It only uses assembly where it must (to do things not possible in C, or to access hardware features like task switching, port I/O, etc.). Everything else is in C, even though Torvalds is a performance freak, and a hardcore assembly programmer too.
And even if there is some performance gain that assembler gives you over *perfectly written* C code (which I doubt), it is certainly not worth doing. Imagine if your entire search() function was in assembly. It would be pages and pages long, and completely illegible. And you'd have to write plenty of versions of it for all sorts of different platforms (and maintain them, at the risk of them drifting apart). A ridiculous waste of time, not to mention the fact that you'd have to constantly add new versions to support new CPU features etc.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.