DiscoCheck 4.0.0 (release candidate)

lucasart · Post by **lucasart** » Sat Feb 02, 2013 3:19 am

ZirconiumX wrote:
lucasart wrote:
ZirconiumX wrote: In all of my tests, Clang has performed FASTER than GCC.
Well, it looks like the new GCC 4.8 is a huge improvement then, as it beats the pants off Clang 3.2:
http://www.phoronix.com/scan.php?page=a ... svn1&num=1
They have also put a lot of effort into making GCC produce understandable error messages, which is especially important in C++:
http://gcc.gnu.org/wiki/ClangDiagnosticsComparison
So yes, Clang is moving fast. But GCC is not dead yet. It will remain the king of compilers for a while.
The link you quote includes a build of Crafty v23.4, where Clang takes the lead. GCC may be best for other things, though.

Matthew:out

One thing that I would like to see compilers improve is handling bitfields efficiently. I had to write this ugly code in order to manipulate a 16-bit move_t structure efficiently (the speed gain was +60% with GCC on a raw perft, just from replacing compiler-managed bitfields with manually hacked ones):

Code:

class move_t
{
        /* 16 bit field:
         * fsq = 0..5
         * tsq = 6..11
         * prom = 12,13 (0=Knight..3=Queen)
         * flag = 14,15 (0=NORMAL...3=CASTLING)
         * */
        uint16_t b;

public:
        move_t(): b(0) {}       // silence compiler warnings
        move_t(short _b): b(_b) {}
        operator bool() const { return b; }
        
        bool operator== (move_t m) const { return b == m.b; }
        bool operator!= (move_t m) const { return b != m.b; }

        // getters
        int fsq() const { return b & 0x3f; }
        int tsq() const { return (b >> 6) & 0x3f; }
        int flag() const { return (b >> 14) & 3; }
        int prom() const { assert(flag() == PROMOTION); return ((b >> 12) & 3) + KNIGHT; }

        // setters
        void fsq(int fsq) { assert(square_ok(fsq)); b &= 0xffc0; b ^= fsq; }
        void tsq(int tsq) { assert(square_ok(tsq)); b &= 0xf03f; b ^= (tsq << 6); }
        void flag(int flag) { assert(flag < 4); b &= 0x3fff; b ^= (flag << 14); }
        void prom(int piece) { assert(KNIGHT <= piece && piece <= QUEEN); b &= 0xcfff; b ^= (piece - KNIGHT) << 12; }
};

People shouldn't have to write code like this if the compiler did its job properly. Others have ranted about this GCC inefficiency, including Linus Torvalds himself. And it still has not been addressed in GCC 4.7 (I don't know about GCC 4.8).
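
For readers who want to see the two approaches side by side, here is a rough sketch. The `move_bits` struct is only a guess at what a compiler-managed version could look like (it is not taken from DiscoCheck), and the free functions `pack`, `get_*` and `set_tsq` are hypothetical helpers that use the same bit layout as move_t above. Unlike move_t::prom(), the sketch stores the raw 0..3 promotion index and does not add KNIGHT.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical compiler-managed layout (an assumption, not DiscoCheck code).
// The compiler decides how to mask and shift on every field access.
struct move_bits {
    uint16_t fsq  : 6;  // from square, 0..63
    uint16_t tsq  : 6;  // to square, 0..63
    uint16_t prom : 2;  // raw promotion index, 0..3
    uint16_t flag : 2;  // move flag, 0..3
};

// Manual packing with the same layout as move_t:
// fsq = bits 0..5, tsq = bits 6..11, prom = bits 12..13, flag = bits 14..15.
inline uint16_t pack(int fsq, int tsq, int prom, int flag) {
    assert(0 <= fsq && fsq < 64 && 0 <= tsq && tsq < 64);
    assert(0 <= prom && prom < 4 && 0 <= flag && flag < 4);
    return static_cast<uint16_t>(fsq | (tsq << 6) | (prom << 12) | (flag << 14));
}

inline int get_fsq(uint16_t b)  { return b & 0x3f; }
inline int get_tsq(uint16_t b)  { return (b >> 6) & 0x3f; }
inline int get_prom(uint16_t b) { return (b >> 12) & 3; }
inline int get_flag(uint16_t b) { return (b >> 14) & 3; }

// Replace only the tsq field: clear bits 6..11, then OR in the new value.
inline uint16_t set_tsq(uint16_t b, int tsq) {
    assert(0 <= tsq && tsq < 64);
    return static_cast<uint16_t>((b & 0xf03f) | (tsq << 6));
}
```

Both versions carry the same information in 16 bits. The difference is only in who writes the shifts and masks, so any speed gap comes down to the code the compiler generates for the bitfield accesses and should be measured on your own compiler.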

PS: Robert Hyatt can confirm, but I think Crafty handles bitfields manually, at least where it is performance-sensitive, for the same reason. So we wouldn't see this effect in the Phoronix test of Crafty.

lucasart · Post by **lucasart** » Sat Feb 02, 2013 3:29 am

Jim Ablett wrote:
Code:
int count_bit_max15(Bitboard b)
{
	return __builtin_popcountll(b);
}

I will commit this patch to my GitHub repo, so that people don't have to hack the code to enable popcount.

Now I understand: it's a one-code-fits-all situation, and just an extra GCC compilation flag enables hardware support for __builtin_popcountll. I was afraid that the code would only compile on machines that have the hardware support, and that I would have to use some ugly #ifdef and code two different versions.

lucasart · Post by **lucasart** » Sat Feb 02, 2013 3:36 am

lucasart wrote: Others have ranted about this GCC inefficiency, including Linus Torvalds himself.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48696
It's not clear how the discussion ended. They probably convinced Torvalds that "it's not a bug, it's a feature", and he got fed up with arguing.

Jim Ablett · Post by **Jim Ablett** » Sat Feb 02, 2013 11:43 am

lucasart wrote:
Jim Ablett wrote:
Code:
int count_bit_max15(Bitboard b)
{
	return __builtin_popcountll(b);
}
I will commit this patch to my GitHub repo, so that people don't have to hack the code to enable popcount.

Now I understand: it's a one-code-fits-all situation, and just an extra GCC compilation flag enables hardware support for __builtin_popcountll. I was afraid that the code would only compile on machines that have the hardware support, and that I would have to use some ugly #ifdef and code two different versions.

One thing though: after testing, I found that the software fall-back of __builtin_popcountll is a lot slower than the routine you were already using.

Jim.
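
One possible way to act on this, sketched under some assumptions: use the builtin only when the build actually has hardware popcount enabled, and fall back to a hand-written loop otherwise. GCC and Clang define the `__POPCNT__` macro when the popcnt instruction is enabled (for example with `-mpopcnt`, or `-march=native` on a CPU that supports it). The loop below is the standard clear-lowest-set-bit trick, which is cheap when few bits are set. It is not necessarily the routine DiscoCheck used before the patch, and `count_bit_loop` and `count_bit` are made-up names. `Bitboard` is assumed to be a 64-bit unsigned integer.

```cpp
#include <cassert>
#include <cstdint>

typedef uint64_t Bitboard;  // assumption: a 64-bit unsigned integer

// Hand-written fallback: each iteration clears the lowest set bit,
// so the loop runs once per set bit (fast for sparse bitboards).
inline int count_bit_loop(Bitboard b) {
    int count = 0;
    while (b) {
        b &= b - 1;
        ++count;
    }
    return count;
}

// Picks the implementation at compile time.
// __builtin_popcountll is a GCC/Clang builtin; __POPCNT__ is defined by
// those compilers only when the popcnt instruction is enabled.
inline int count_bit(Bitboard b) {
#if defined(__POPCNT__)
    const int n = __builtin_popcountll(b);
    assert(n == count_bit_loop(b));  // debug-only cross-check against the loop
    return n;
#else
    return count_bit_loop(b);
#endif
}
```

This does bring back one #if, but only in a single function, and the same source still compiles with or without the hardware flag.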

Evert · Post by **Evert** » Sat Feb 02, 2013 1:48 pm

lucasart wrote: One thing that I would like to see compilers improve is handling bitfields efficiently. I had to write this ugly code in order to manipulate a 16-bit move_t structure efficiently (the speed gain was +60% with GCC on a raw perft, just from replacing compiler-managed bitfields with manually hacked ones)

I had a similar experience with Jazz. I don't remember how much I gained by going from bitfields to manual shifts, I think it was less than 60%, but doing it by hand was clearly much better.
