| Author |
Message |
Martin Sedlak
Joined: 26 Nov 2010 Posts: 701
|
Post subject: Re: SSE4 instructions Posted: Thu May 24, 2012 12:46 pm |
|
|
| diep wrote: |
You also do the useless 'if( a )' clause in front of it?
Without it it's 2 cycles provided your results are in L1d, otherwise you have to get it out of L2 or L3 which is slow.
| Quote: |
Doing 8 lookups into a 256-byte table proved slower, and any other fancy formulas proved much slower (note I only tested in 32-bit mode).
|
It's 4 cycles, yet L1d has lots of problems keeping up with the result.
Also, I see the above code uses a '+' (PLUS). You can also use | (OR) of course, which is faster on paper, except for the L1d problems again.
|
Yes, I also have a zero-test before the lookups. I'm not a hardware expert and have never optimized code to be cache-friendly. I have always believed in high-level optimization, unless I was writing a software rasterizer or a sample mixer.
Perhaps this obsession comes from the old times, when we used to count instruction cycles because we had no cache at all.
I'm not claiming anything, only reporting my old results.
I hope I measured time to depth directly in a fixed position. If I used a huge loop feeding the same bitboard to the popcount routine, then that was certainly a lame test indeed.
PS. You can't use OR instead of addition. Take 0x10001: bytes 0 and 2 each contain one set bit, so both lookups return 1. Adding them gives the correct popcount of 2, but 1 | 1 is 1, so OR would return 1. |
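To make the PS concrete, here is a minimal sketch of a byte-table popcount, once with '+' and once with '|'. This is not the code from my engine; the names `bits_in_byte`, `popcount64_add`, `popcount64_or` and `check_popcount_example` are made up for illustration, and the table is generated with the usual macro trick rather than filled in at startup.

```c
#include <assert.h>
#include <stdint.h>

/* 256-entry table: bits_in_byte[i] = number of set bits in byte i. */
#define B2(n) n, n + 1, n + 1, n + 2
#define B4(n) B2(n), B2(n + 1), B2(n + 1), B2(n + 2)
#define B6(n) B4(n), B4(n + 1), B4(n + 1), B4(n + 2)
static const unsigned char bits_in_byte[256] = { B6(0), B6(1), B6(1), B6(2) };

/* Correct version: zero-test first, then 8 lookups combined with '+'. */
static unsigned popcount64_add(uint64_t b)
{
    unsigned n = 0;
    if (!b)
        return 0;               /* the zero-test discussed above */
    for (int i = 0; i < 8; i++) {
        n += bits_in_byte[b & 0xFF];
        b >>= 8;
    }
    return n;
}

/* Deliberately WRONG version: same lookups, but combined with '|'. */
static unsigned popcount64_or(uint64_t b)
{
    unsigned n = 0;
    for (int i = 0; i < 8; i++) {
        n |= bits_in_byte[b & 0xFF];
        b >>= 8;
    }
    return n;
}

/* 0x10001 has one bit in byte 0 and one bit in byte 2. */
static void check_popcount_example(void)
{
    assert(popcount64_add(0x10001ULL) == 2);  /* 1 + 0 + 1 */
    assert(popcount64_or(0x10001ULL) == 1);   /* 1 | 0 | 1 */
}
```

OR only gives the right answer when the per-byte counts happen to have no bits in common (for example 1 and 2), so it cannot replace the addition in general.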
|
| Subject |
Author |
Date/Time |
SSE4 instructions |
Maurizio Maglio |
Mon May 21, 2012 8:48 pm |
Re: SSE4 instructions |
Adam Hair |
Mon May 21, 2012 10:47 pm |
Re: SSE4 instructions |
Maurizio Maglio |
Tue May 22, 2012 7:47 am |
Re: SSE4 instructions |
Joona Kiiski |
Tue May 22, 2012 1:55 pm |
Re: SSE4 instructions |
Engin Üstün |
Tue May 22, 2012 8:11 pm |
Re: SSE4 instructions |
Maurizio Maglio |
Tue May 22, 2012 8:41 pm |
Re: SSE4 instructions |
Robert Hyatt |
Wed May 23, 2012 3:39 pm |
Re: SSE4 instructions |
Engin Üstün |
Tue May 22, 2012 7:31 pm |
Re: SSE4 instructions |
Engin Üstün |
Tue May 22, 2012 7:34 pm |
Re: SSE4 instructions |
Ricardo Barreira |
Wed May 23, 2012 7:40 am |
Re: SSE4 instructions |
Robert Hyatt |
Wed May 23, 2012 3:37 pm |
Re: SSE4 instructions |
Richard Vida |
Wed May 23, 2012 5:28 pm |
Re: SSE4 instructions |
Robert Hyatt |
Wed May 23, 2012 5:47 pm |
Re: SSE4 instructions |
Ricardo Barreira |
Sat May 26, 2012 3:14 pm |
Re: SSE4 instructions |
Engin Üstün |
Wed May 23, 2012 10:02 pm |
Re: SSE4 instructions |
Vincent Diepeveen |
Thu May 24, 2012 12:02 am |
Re: SSE4 instructions |
Engin Üstün |
Thu May 24, 2012 3:54 pm |
Re: SSE4 instructions |
Engin Üstün |
Thu May 24, 2012 4:07 pm |
Re: SSE4 instructions |
Lucas Braesch |
Thu May 24, 2012 11:03 am |
Re: SSE4 instructions |
Martin Sedlak |
Thu May 24, 2012 11:47 am |
Re: SSE4 instructions |
Vincent Diepeveen |
Thu May 24, 2012 12:35 pm |
Re: SSE4 instructions |
Martin Sedlak |
Thu May 24, 2012 12:46 pm |
Re: SSE4 instructions |
Engin Üstün |
Thu May 24, 2012 4:24 pm |
Re: SSE4 instructions |
Engin Üstün |
Thu May 24, 2012 4:41 pm |
Re: SSE4 instructions |
Engin Üstün |
Wed May 23, 2012 4:35 pm |
Re: SSE4 instructions |
Robert Hyatt |
Wed May 23, 2012 5:48 pm |
Re: SSE4 instructions |
Engin Üstün |
Wed May 23, 2012 10:28 pm |
Re: SSE4 instructions |
Robert Hyatt |
Thu May 24, 2012 3:41 pm |
Re: SSE4 instructions |
Engin Üstün |
Thu May 24, 2012 4:31 pm |
Re: SSE4 instructions |
Engin Üstün |
Thu May 24, 2012 4:35 pm |
Re: SSE4 instructions |
Don Dailey |
Thu May 24, 2012 9:09 pm |
Re: SSE4 instructions |
Richard Vida |
Thu May 24, 2012 9:21 pm |
Re: SSE4 instructions |
Vincent Diepeveen |
Fri May 25, 2012 12:19 am |
Re: SSE4 instructions |
Engin Üstün |
Fri May 25, 2012 10:44 pm |
|