PREFETCH vs POPCNT vs ...

Post by **bob** » Thu Jan 14, 2010 1:28 am

Milos wrote:
bob wrote: I am not sure how a compiler is going to use POPCNT, or what it is supposed to do, since there is no C semantic for doing that. Which means those of us doing a popcnt operation use some sort of table lookup idea, or else inline asm based on the A & (A-1) trick. You'd have to either use inline asm and use a popcnt instruction, or else use the MSVC popcnt intrinsic, which is not compatible with other compilers.
The story about popcnt is quite simple.
If your processor supports SSE4a, use the hardware one. If it doesn't, then unless your bitboards are really sparse (in which case you use assembly-optimized B&(B-1)), use the famous one from:
http://www.amd.com/us-en/assets/content ... /25112.PDF.
Everything else (especially compiler ones) is slower in practice for chess implementations...

That's not my point. I use popcnt() in Crafty. The question is, how does some sort of compiler flag cause a popcnt hardware instruction to show up, when there is no hint to the compiler inside the source program that the "loop" being used is doing a population count? If there were some sort of semantic construct that does popcnt in C, then the compiler could choose a table, a loop, or a popcnt instruction to implement the thing, but there isn't. So how is telling the compiler you have a hardware POPCNT (such as on i7) going to make its way into your executable???
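For reference, here is a minimal sketch in C of the three approaches mentioned in this exchange. The function names are made up for illustration, and the parallel version is the widely circulated SWAR bit-count, which I am assuming is the kind of routine the AMD guide linked above describes. The last function assumes GCC or Clang: `__builtin_popcountll` is a compiler-specific builtin, not standard C, and as far as I know it is only turned into a single POPCNT instruction when the target is told to allow it (for example with `-mpopcnt`); otherwise the compiler falls back to a software routine.

```c
#include <stdint.h>

/* Sparse-bitboard loop: each pass clears the lowest set bit (B & (B-1)),
   so the loop runs once per set bit. */
static int popcount_loop(uint64_t b) {
    int n = 0;
    while (b) {
        b &= b - 1;
        n++;
    }
    return n;
}

/* Branch-free parallel (SWAR) count: sums bits in 2-, 4-, then 8-bit
   fields, then adds the eight byte counts with one multiply. */
static int popcount_swar(uint64_t b) {
    b = b - ((b >> 1) & 0x5555555555555555ULL);
    b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);
    b = (b + (b >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return (int)((b * 0x0101010101010101ULL) >> 56);
}

/* Compiler builtin (GCC/Clang only, an assumption about the toolchain).
   This is the explicit "hint" the compiler needs; a plain loop gives it none. */
static int popcount_hw(uint64_t b) {
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcountll(b);
#else
    return popcount_swar(b);   /* portable fallback */
#endif
}
```

All three return the same count for any 64-bit input; the difference is only in how the work is expressed, which is exactly why a flag alone cannot turn the first two into a hardware instruction unless the compiler happens to recognize the pattern.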

Post by **kingliveson** » Thu Jan 14, 2010 1:51 am

bob wrote:
Look at the "interesting test results" thread I started. Things can change significantly even after thousands of games. The data in that post is pretty easy to follow.

The thread in question. No doubt, the more cycles, the greater the accuracy of such a test. Resources also factor into how one carries out an experiment. Unless we get a volunteer...
