Stockfish PGO and popcnt

zullil
Posts: 6442
Joined: Tue Jan 09, 2007 12:31 am
Location: PA USA
Full name: Louis Zulli

Re: Completely baffling

Post by zullil »

syzygy wrote:
zullil wrote:No need for the snarky "thanks for wasting our time" either.
Well, you said you were "summarizing", not that you were trying another approach.

I still don't see how it can fail when using the __asm__ implementation of popcount.

What you could try is to include a line #warning USING POPCNT in bitcount.h, say between lines 99 and 100. If during compilation you don't see USING POPCNT warnings, then somehow the __asm__ is not being used.
OK, I was unclear.

Thanks for the suggestion. Will try it and see.
syzygy
Posts: 5554
Joined: Tue Feb 28, 2012 11:56 pm

Re: Completely baffling

Post by syzygy »

What you could also try as a last resort is removing all alternative implementations from bitcount.h:

Code:

/*
  Stockfish, a UCI chess playing engine derived from Glaurung 2.1
  Copyright (C) 2004-2008 Tord Romstad (Glaurung author)
  Copyright (C) 2008-2013 Marco Costalba, Joona Kiiski, Tord Romstad

  Stockfish is free software: you can redistribute it and/or modify
  it under the terms of the GNU General Public License as published by
  the Free Software Foundation, either version 3 of the License, or
  (at your option) any later version.

  Stockfish is distributed in the hope that it will be useful,
  but WITHOUT ANY WARRANTY; without even the implied warranty of
  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  GNU General Public License for more details.

  You should have received a copy of the GNU General Public License
  along with this program.  If not, see <http://www.gnu.org/licenses/>.
*/

#if !defined(BITCOUNT_H_INCLUDED)
#define BITCOUNT_H_INCLUDED

#include <cassert>
#include "types.h"

enum BitCountType {
  CNT_64,
  CNT_64_MAX15,
  CNT_32,
  CNT_32_MAX15,
  CNT_HW_POPCNT
};

/// Determine at compile time the best popcount<> specialization according if
/// platform is 32 or 64 bits, to the maximum number of nonzero bits to count
/// and if hardware popcnt instruction is available.
const BitCountType Full  = CNT_HW_POPCNT;
const BitCountType Max15 = CNT_HW_POPCNT;

/// popcount() counts the number of nonzero bits in a bitboard
template<BitCountType type>
inline int popcount(Bitboard b) {
  __asm__("popcnt %1, %0" : "=r" (b) : "r" (b));
  return b;
}

#endif // !defined(BITCOUNT_H_INCLUDED)
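For reference, the "alternative implementations" being stripped out are software bit counts. Below is a minimal sketch of one, not the actual Stockfish code: `popcount_swar`, `popcount_builtin` and `check_popcounts` are names made up for this example, and `__builtin_popcountll` assumes GCC or Clang. The two versions should agree on every input, which makes a handy cross-check when the hardware path is in doubt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: portable SWAR bit count; needs no popcnt instruction.
inline int popcount_swar(std::uint64_t b) {
  b -= (b >> 1) & 0x5555555555555555ULL;                                 // 2-bit sums
  b = (b & 0x3333333333333333ULL) + ((b >> 2) & 0x3333333333333333ULL);  // 4-bit sums
  b = (b + (b >> 4)) & 0x0F0F0F0F0F0F0F0FULL;                            // 8-bit sums
  return static_cast<int>((b * 0x0101010101010101ULL) >> 56);            // add up the bytes
}

// Hypothetical helper: GCC/Clang builtin. The compiler picks the instruction
// sequence, so this is not guaranteed to become a popcnt instruction.
inline int popcount_builtin(std::uint64_t b) {
  return __builtin_popcountll(b);
}

// Cross-check: both versions must agree on a few sample bitboards.
inline void check_popcounts() {
  const std::uint64_t samples[] = {0ULL, 1ULL, 123456ULL, ~0ULL};
  for (int i = 0; i < 4; ++i)
    assert(popcount_swar(samples[i]) == popcount_builtin(samples[i]));
}
```

Calling `check_popcounts()` once at startup would catch a popcount that silently returns wrong values.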
zullil

Re: Completely baffling

Post by zullil »

It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!

Code:

LZsMacPro-OSX6: ~/Documents/Chess/Stockfish/src] make profile-build ARCH=osx-x86-64
make ARCH=osx-x86-64 COMP=gcc config-sanity

Config:
debug: 'no'
optimize: 'yes'
arch: 'x86_64'
os: 'osx'
bits: '64'
prefetch: 'yes'
bsfq: 'yes'
popcnt: 'yes'
sse: 'yes'

Flags:
CXX: g++
CXXFLAGS: -Wall -Wcast-qual -fno-exceptions -fno-rtti  -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT
LDFLAGS:  -lpthread  -Wall -Wcast-qual -fno-exceptions -fno-rtti  -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT

Testing config sanity. If this fails, try 'make help' ...


Step 0/4. Preparing for profile build.
make ARCH=osx-x86-64 COMP=gcc gcc-profile-prepare
make ARCH=osx-x86-64 COMP=gcc gcc-profile-clean

Step 1/4. Building executable for benchmark ...
make ARCH=osx-x86-64 COMP=gcc gcc-profile-make
make ARCH=osx-x86-64 COMP=gcc \
	EXTRACXXFLAGS='-fprofile-generate' \
	EXTRALDFLAGS='-lgcov' \
	all
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o benchmark.o benchmark.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o bitbase.o bitbase.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o bitboard.o bitboard.cpp
In file included from bitboard.cpp:25:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o book.o book.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o endgame.o endgame.cpp
In file included from endgame.cpp:24:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o evaluate.o evaluate.cpp
In file included from evaluate.cpp:25:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o main.o main.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o material.o material.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o misc.o misc.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o movegen.o movegen.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o movepick.o movepick.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o notation.o notation.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o pawns.o pawns.cpp
In file included from pawns.cpp:24:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o position.o position.cpp
In file included from position.cpp:27:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o search.o search.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o thread.o thread.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o timeman.o timeman.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o tt.o tt.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o uci.o uci.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o ucioption.o ucioption.cpp
g++ -o stockfish benchmark.o bitbase.o bitboard.o book.o endgame.o evaluate.o main.o material.o misc.o movegen.o movepick.o notation.o pawns.o position.o search.o thread.o timeman.o tt.o uci.o ucioption.o -lgcov -lpthread  -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-generate -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT 

Step 2/4. Running benchmark for pgo-build ...

Position: 1/16

Position: 2/16

Position: 3/16

Position: 4/16

Position: 5/16

Position: 6/16

Position: 7/16

Position: 8/16

Position: 9/16

Position: 10/16

Position: 11/16

Position: 12/16

Position: 13/16

Position: 14/16

Position: 15/16

Position: 16/16

===========================
Total time (ms) : 27141
Nodes searched  : 20189706
Nodes/second    : 743882

Step 3/4. Building final executable ...
make ARCH=osx-x86-64 COMP=gcc gcc-profile-use
make ARCH=osx-x86-64 COMP=gcc \
	EXTRACXXFLAGS='-fprofile-use' \
	EXTRALDFLAGS='-lgcov' \
	all
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o benchmark.o benchmark.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o bitbase.o bitbase.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o bitboard.o bitboard.cpp
In file included from bitboard.cpp:25:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o book.o book.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o endgame.o endgame.cpp
In file included from endgame.cpp:24:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o evaluate.o evaluate.cpp
In file included from evaluate.cpp:25:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o main.o main.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o material.o material.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o misc.o misc.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o movegen.o movegen.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o movepick.o movepick.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o notation.o notation.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o pawns.o pawns.cpp
In file included from pawns.cpp:24:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o position.o position.cpp
In file included from position.cpp:27:0:
bitcount.h:100:2: warning: #warning is a GCC extension [enabled by default]
 #warning USING POPCNT
  ^
bitcount.h:100:2: warning: #warning USING POPCNT [-Wcpp]
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o search.o search.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o thread.o thread.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o timeman.o timeman.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o tt.o tt.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o uci.o uci.cpp
g++ -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT    -c -o ucioption.o ucioption.cpp
g++ -o stockfish benchmark.o bitbase.o bitboard.o book.o endgame.o evaluate.o main.o material.o misc.o movegen.o movepick.o notation.o pawns.o position.o search.o thread.o timeman.o tt.o uci.o ucioption.o -lgcov -lpthread  -Wall -Wcast-qual -fno-exceptions -fno-rtti -fprofile-use -ansi -pedantic -Wno-long-long -Wextra -Wshadow  -DNDEBUG -O3 -fno-tree-pre -DIS_64BIT -msse -DUSE_BSFQ -msse3 -DUSE_POPCNT 

Step 4/4. Deleting profile data ...
make ARCH=osx-x86-64 COMP=gcc gcc-profile-clean
LZsMacPro-OSX6: ~/Documents/Chess/Stockfish/src] otool -tvQ stockfish | grep popcnt
LZsMacPro-OSX6: ~/Documents/Chess/Stockfish/src]
zullil

Re: Completely baffling

Post by zullil »

syzygy wrote:What you could also try as a last resort is removing all alternative implementations from bitcount.h:

Yes, I tried that too. :D
syzygy

Re: Completely baffling

Post by syzygy »

zullil wrote:It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!
My only explanation is that otool is somehow not seeing the code sections that have the popcnt instructions. It may not be a very good explanation, but I don't see any other.

On my (linux) system the equivalent tool is objdump. Maybe you have that too and you could try objdump -d ./stockfish | grep popcnt?
zullil

Re: Completely baffling

Post by zullil »

syzygy wrote:
zullil wrote:It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!
My only explanation is that otool is somehow not seeing the code sections that have the popcnt instructions. It may not be a very good explanation, but I don't see any other.

On my (linux) system the equivalent tool is objdump. Maybe you have that too and you could try objdump -d ./stockfish | grep popcnt?
Yes, I considered doing that. But I did search the binary with a hex editor and couldn't locate the popcnt opcodes. (I had no problem locating them in the non-pgo'd binary.) Still mysterious.

Thanks for the help.
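The hex-editor search can also be scripted. A rough sketch, assuming the encoding seen in objdump output for this instruction (F3, an optional REX prefix byte in the range 0x40-0x4F, then 0F B8); `count_popcnt_patterns`, `read_file` and `demo_scan` are hypothetical names, and a hit inside data bytes would be a false positive, so the count is only a hint.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <iterator>
#include <vector>

// Hypothetical helper: count byte sequences that look like an x86-64 popcnt,
// i.e. F3 [optional REX prefix 0x40-0x4F] 0F B8.
inline std::size_t count_popcnt_patterns(const std::vector<std::uint8_t>& bytes) {
  std::size_t hits = 0;
  for (std::size_t i = 0; i + 2 < bytes.size(); ++i) {
    if (bytes[i] != 0xF3) continue;
    std::size_t j = i + 1;
    if ((bytes[j] & 0xF0) == 0x40) ++j;  // skip a REX prefix if present
    if (j + 1 < bytes.size() && bytes[j] == 0x0F && bytes[j + 1] == 0xB8) ++hits;
  }
  return hits;
}

// Hypothetical helper: load a whole file (e.g. the stockfish binary) into memory.
inline std::vector<std::uint8_t> read_file(const char* path) {
  std::ifstream in(path, std::ios::binary);
  return std::vector<std::uint8_t>(std::istreambuf_iterator<char>(in),
                                   std::istreambuf_iterator<char>());
}

// Sanity check on the bytes of "popcnt %rbx,%rbx" (f3 48 0f b8 db).
inline void demo_scan() {
  const std::uint8_t raw[] = {0xF3, 0x48, 0x0F, 0xB8, 0xDB};
  const std::vector<std::uint8_t> bytes(raw, raw + 5);
  assert(count_popcnt_patterns(bytes) == 1);
}
```

Running `count_popcnt_patterns(read_file("stockfish"))` on both the PGO and non-PGO binaries would show whether the pattern really is missing from one of them.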
syzygy

Re: Completely baffling

Post by syzygy »

zullil wrote:
syzygy wrote:What you could also try as a last resort is removing all alternative implementations from bitcount.h:

Yes, I tried that too. :D
If you've tried this and the executable worked correctly, then it must have the popcnt instruction.

Just for fun you could comment out the __asm__ line and confirm that stockfish crashes.
zullil

Re: Completely baffling

Post by zullil »

syzygy wrote:
zullil wrote:
syzygy wrote:What you could also try as a last resort is removing all alternative implementations from bitcount.h:

Yes, I tried that too. :D
If you've tried this and the executable worked correctly, then it must have the popcnt instruction.
Yes, I knew it had to be there. But I couldn't/can't find it! I even ran the binary on my Core 2 Duo MacBook, which complained about an "illegal instruction", since that CPU doesn't have popcnt.
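The "illegal instruction" crash can be turned into a readable message by asking the CPU first. A sketch only, assuming GCC 4.8+ or a recent Clang on x86, where the `__builtin_cpu_supports` builtin is available; `cpu_has_popcnt` and `check_cpu_or_warn` are made-up names, and on other architectures this version simply reports false.

```cpp
#include <iostream>

// Hypothetical helper: true if the CPU we are running on reports popcnt.
inline bool cpu_has_popcnt() {
#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
  return __builtin_cpu_supports("popcnt") != 0;
#else
  return false;  // not an x86 target: no popcnt instruction to look for
#endif
}

// Hypothetical startup guard: print a warning instead of dying later with SIGILL.
inline bool check_cpu_or_warn() {
  if (cpu_has_popcnt()) return true;
  std::cerr << "This build needs popcnt, which this CPU does not report.\n";
  return false;
}
```

On a Core 2 Duo this should return false, which matches the crash described above.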
syzygy

Re: Completely baffling

Post by syzygy »

To see where the popcnt instructions go, let's insert one in a place where we can easily find it again.

Add a few lines to main.cpp:

Code:

(...)
#include "bitcount.h"

int main(int argc, char* argv[]) {

  std::cout << "popcount(123456) = " << popcount<Full>(123456) << std::endl;

  std::cout << engine_info() << std::endl;

(...)
Now compile and dump the disassembly from otool to a text file (otool -tv stockfish >assembly.txt).
Load the text file into an editor and look either for main or for $0x1e240 (which is 123456):

Code:

000000000040b820 <main>:
  40b820:	55                   	push   %rbp
  40b821:	48 89 e5             	mov    %rsp,%rbp
  40b824:	41 57                	push   %r15
  40b826:	41 56                	push   %r14
  40b828:	41 55                	push   %r13
  40b82a:	41 54                	push   %r12
  40b82c:	53                   	push   %rbx
  40b82d:	bb 40 e2 01 00       	mov    $0x1e240,%ebx
  40b832:	f3 48 0f b8 db       	popcnt %rbx,%rbx
This is from objdump, so with otool it will look a bit different, but it shouldn't be hard to locate this part.

With make profile-build I get (gcc 4.8.1, on Linux):

Code:

000000000040b1a0 <main>:
  40b1a0:	55                   	push   %rbp
  40b1a1:	48 89 e5             	mov    %rsp,%rbp
  40b1a4:	41 57                	push   %r15
  40b1a6:	41 56                	push   %r14
  40b1a8:	41 55                	push   %r13
  40b1aa:	41 54                	push   %r12
  40b1ac:	53                   	push   %rbx
  40b1ad:	bb 40 e2 01 00       	mov    $0x1e240,%ebx
  40b1b2:	f3 4c 0f b8 e3       	popcnt %rbx,%r12
So practically the same.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Completely baffling

Post by bob »

syzygy wrote:
zullil wrote:It appears that popcnt is being used. The question now becomes why can't I find a popcnt instruction or opcode in the binary!
My only explanation is that otool is somehow not seeing the code sections that have the popcnt instructions. It may not be a very good explanation, but I don't see any other.

On my (linux) system the equivalent tool is objdump. Maybe you have that too and you could try objdump -d ./stockfish | grep popcnt?
Or you can manually type the gcc compile command (including -fprofile-use), replacing -c with -S (capital S; lowercase -s only strips symbols) and pointing -o at a .s file. Then you should get an assembly listing where you can see whether there are any popcnts to be found...
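A small stand-alone file makes that kind of inspection easier than wading through a whole engine. A sketch under assumptions: `probe_popcount` and `probe_marker` are invented names, and it uses the GCC/Clang builtin rather than Stockfish's inline asm. Compiled with something like g++ -O3 -mpopcnt -S probe.cpp -o probe.s, the listing should contain a popcnt in `probe_popcount`; without -mpopcnt, GCC typically emits a call to a library routine (`__popcountdi2`) instead.

```cpp
#include <cstdint>

// Hypothetical probe: noinline keeps this a separate, named function, so it is
// easy to locate in the assembly listing.
__attribute__((noinline)) int probe_popcount(std::uint64_t b) {
  return __builtin_popcountll(b);
}

// 123456 (0x1e240) is the same easy-to-spot constant used earlier in the thread.
int probe_marker() {
  return probe_popcount(123456ULL);
}
```

Searching probe.s for popcnt then answers the question for this one function, independent of the rest of the build.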