Stockfish PGO and popcnt

Discussion of chess software programming and technical issues.

Moderator: Ras

syzygy
Posts: 5895
Joined: Tue Feb 28, 2012 11:56 pm

Re: Completely baffling

Post by syzygy »

bob wrote:
syzygy wrote:
zullil wrote: It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!
My only explanation is that otool is somehow not seeing the code sections that have the popcnt instructions. It may not be a very good explanation, but I don't see any other.

On my (linux) system the equivalent tool is objdump. Maybe you have that too and you could try objdump -d ./stockfish | grep popcnt?
Or you can type the gcc compile command manually (including prof_use) and add the -S and -c flags. Then you should get a formatted asm file where you can see whether any popcnt instructions are to be found...
But I doubt that the output of -S is equivalent to the binary compiled and linked with -flto.

However, Louis has already stated that disabling LTO does not change the "disappearance" of popcnt, so it is worth a try. For example on evaluate.cpp. Of course I don't see how popcnt could not be there...
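For anyone trying the assembly-listing route on a single file such as evaluate.cpp, a minimal sketch follows. Note that GCC's assembly-output flag is the capital -S. The compiler flags shown in the comment are assumptions, not the flags from the Stockfish Makefile, and count_popcnt is a hypothetical helper name.

```shell
# Step 1 (run by hand; the flags here are assumed, substitute the ones from your build):
#   g++ -O3 -mpopcnt -S evaluate.cpp -o evaluate.s
#
# Step 2: count the lines of the listing whose mnemonic is popcnt
# (popcnt, popcntl, popcntq or popcntw in AT&T syntax).
count_popcnt() {
    # grep -c prints the number of matching lines (0 if there are none)
    grep -cE '^[[:space:]]*popcnt[lqw]?[[:space:]]' "$1"
}
```

Calling `count_popcnt evaluate.s` prints the number of popcnt instructions in the listing. As noted above, this only shows what the compiler emits for one translation unit and need not match the final LTO-linked binary.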
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Completely baffling

Post by zullil »

syzygy wrote:
bob wrote:
syzygy wrote:
zullil wrote: It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!
My only explanation is that otool is somehow not seeing the code sections that have the popcnt instructions. It may not be a very good explanation, but I don't see any other.

On my (linux) system the equivalent tool is objdump. Maybe you have that too and you could try objdump -d ./stockfish | grep popcnt?
Or you can type the gcc compile command manually (including prof_use) and add the -S and -c flags. Then you should get a formatted asm file where you can see whether any popcnt instructions are to be found...
But I doubt that the output of -S is equivalent to the binary compiled and linked with -flto.

However, Louis has already stated that disabling LTO does not change the "disappearance" of popcnt, so it is worth a try. For example on evaluate.cpp. Of course I don't see how popcnt could not be there...
OK, I've found the missing popcnt instructions. I've learned that the executable (a Mach-O file) contains several "text" sections of assembled code. Apparently, otool -tVQ disassembles only the __TEXT,__text section. Other sections can be disassembled by naming them with the -s flag. In short, Mach-O files are more complicated than I realized. I'm not sure whether something has changed recently.
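To check every section rather than guessing names, something like the following sketch might help. It assumes that otool -l prints each section as a "sectname" line followed by a "segname" line, and that otool -VQ -s disassembles the named section when it holds code. The function names list_sections and scan_popcnt are made up for illustration.

```shell
# Read `otool -l` output on stdin and print one "segment section" pair per line.
list_sections() {
    awk '
        $1 == "sectname" { sect = $2; next }
        # A segname line directly under a sectname line belongs to that section;
        # segname lines inside segment load commands are skipped (sect is empty).
        $1 == "segname" && sect != "" { print $2, sect; sect = "" }
    '
}

# Count popcnt matches in every section of the __TEXT segment of a Mach-O binary.
scan_popcnt() {
    bin=$1
    otool -l "$bin" | list_sections | while read -r seg sect; do
        [ "$seg" = "__TEXT" ] || continue
        # grep -c prints 0 when a section contains no match
        n=$(otool -VQ -s "$seg" "$sect" "$bin" | grep -c popcnt)
        printf '%s,%s: %s\n' "$seg" "$sect" "$n"
    done
}
```

Running `scan_popcnt stockfish` prints one line per __TEXT section with its match count. Sections that hold data rather than code (string tables, for example) are dumped according to their type, so a zero there means nothing.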

In any case, here are some of the popcnt instructions that really are in the pgo'd binary:

Code:

LZsMacPro-OSX6: ~/Documents/Chess/Stockfish/src] otool -VQ -s __TEXT __text_cold stockfish | grep popcnt
0000000100011d31	popcnt	%r9,%r10
0000000100012961	popcnt	%rsi,%rax
00000001000129b3	popcnt	%rsi,%rax
0000000100012bc5	popcnt	%r9,%rax
0000000100012cf9	popcnt	%r15,%rdx
0000000100012d0d	popcnt	%r14,%r10
0000000100012f1f	popcnt	%rsi,%rcx
0000000100012f41	popcnt	%rax,%rax
0000000100013159	popcnt	%r9,%r10
0000000100013175	popcnt	%rax,%r11
00000001000131ea	popcnt	%r9,%rcx
00000001000132b8	popcnt	%r8,%rsi
00000001000132d3	popcnt	%rax,%rax
0000000100013350	popcnt	%rsi,%rdi
000000010001348e	popcnt	%r15,%r15
00000001000134a2	popcnt	%rax,%rax
000000010001369b	popcnt	%rdi,%rdx
00000001000136ba	popcnt	%rax,%rax
00000001000138ef	popcnt	%rdx,%rsi
0000000100013919	popcnt	%rax,%rax
0000000100013975	popcnt	%rcx,%r11
0000000100013a5a	popcnt	%rdi,%r11
0000000100013a85	popcnt	%rax,%rax
0000000100013ae5	popcnt	%rax,%r10
0000000100013c5a	popcnt	%r10,%r11
0000000100013cb4	popcnt	%r12,%r8
0000000100013d26	popcnt	%r11,%r10
0000000100013d9d	popcnt	%r15,%r15
0000000100013db5	popcnt	%rcx,%r11
0000000100013dd5	popcnt	%rax,%rdi
0000000100013dfa	popcnt	%r14,%r14
0000000100013f6f	popcnt	%rsi,%r9
0000000100013fd9	popcnt	%r12,%r9
0000000100014049	popcnt	%rax,%r9
00000001000140ca	popcnt	%r15,%rsi
00000001000140e3	popcnt	%rdi,%rcx
0000000100014101	popcnt	%rax,%rax
0000000100014126	popcnt	%r14,%r14
0000000100014bb9	popcnt	%r10,%rax
OK, now I'll be able to sleep. Thanks for all the help. Really thought I was losing my mind. Learned a lot in any case.
zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Completely baffling

Post by zullil »

syzygy wrote:
zullil wrote: It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!
My only explanation is that otool is somehow not seeing the code sections that have the popcnt instructions. It may not be a very good explanation, but I don't see any other.

Exactly right, as it turns out!