Did any of you developers use a static code analysis tool to take a look at your code and see if there were any areas where you could optimize it? I am not super happy with my move generator times compared to some of the more stronger engines out there and am trying to determine the best method of analyzing where there might be a bottleneck, etc.
Any thoughts or consideration in this matter is greatly appreciated.
Thanks
Jeff
Static Code Analysis
Moderators: hgm, Rebel, chrisw
-
- Posts: 83
- Joined: Mon Mar 24, 2014 12:26 am
- Location: Glen Carbon, IL USA
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Static Code Analysis
Static analysis is used to detect bugs, if you want to improve performance you want profiling tools instead.
As for move generator, I wouldn't worry much about it (unless it's a factor of 100
Of course being faster here or there is always good as it accumulates in the end.
Mine is 2+ times slower than top engines for example. You will only spend a fraction of total time generating moves.
Real work has to be done elsewhere to make an engine strong.
As for move generator, I wouldn't worry much about it (unless it's a factor of 100
Of course being faster here or there is always good as it accumulates in the end.
Mine is 2+ times slower than top engines for example. You will only spend a fraction of total time generating moves.
Real work has to be done elsewhere to make an engine strong.
-
- Posts: 83
- Joined: Mon Mar 24, 2014 12:26 am
- Location: Glen Carbon, IL USA
Re: Static Code Analysis
I stand corrected, profiling tools are what I am asking about, thank you Martin.mar wrote:Static analysis is used to detect bugs, if you want to improve performance you want profiling tools instead.
As for move generator, I wouldn't worry much about it (unless it's a factor of 100
Of course being faster here or there is always good as it accumulates in the end.
Mine is 2+ times slower than top engines for example. You will only spend a fraction of total time generating moves.
Real work has to be done elsewhere to make an engine strong.
Basically wanting to know if I am having bottlenecks and slowdowns in loops, etc...
Right now my engine is taking about 10 seconds to get to perft 6 on FEN entries that Stockfish and other engines are hitting in about 2-3 seconds.
-
- Posts: 2559
- Joined: Fri Nov 26, 2010 2:00 pm
- Location: Czech Republic
- Full name: Martin Sedlak
Re: Static Code Analysis
It's not that bad. Do you do bulk counting of leaf nodes? This can give a nice speed up.jsgroby wrote:Right now my engine is taking about 10 seconds to get to perft 6 on FEN entries that Stockfish and other engines are hitting in about 2-3 seconds.
Also if you use staged move generator having a full legal one for perft gives a small speed up as well.
EDIT: move generator is only part of the story, it's also important how fast you can make/unmake moves
For example I do incremental update of psqs and material key which can be omitted in perft. Also (unless you hash perft) you don't have to update hash key(s).
Those can give a small speed up too.
-
- Posts: 83
- Joined: Mon Mar 24, 2014 12:26 am
- Location: Glen Carbon, IL USA
Re: Static Code Analysis
Thanks Martin. You have given me some ideas where to start poking around. I am examining my make move and take move functions and doing some overall testing with my perft function to see what impact it has.mar wrote:It's not that bad. Do you do bulk counting of leaf nodes? This can give a nice speed up.jsgroby wrote:Right now my engine is taking about 10 seconds to get to perft 6 on FEN entries that Stockfish and other engines are hitting in about 2-3 seconds.
Also if you use staged move generator having a full legal one for perft gives a small speed up as well.
EDIT: move generator is only part of the story, it's also important how fast you can make/unmake moves
For example I do incremental update of psqs and material key which can be omitted in perft. Also (unless you hash perft) you don't have to update hash key(s).
Those can give a small speed up too.
I did try some compiler optimizations and got the following results:
No compiler optimization resulted in 25059ms for a perft 6
-O2 compiler optimization resulted in 10421ms for a perft 6
-O3 compiler optimization resulted in 10111ms for a perft 6
-O3 -ffast-math optimization resulted in 10149ms for a perft 6
Position used was starting FEN,
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
-
- Posts: 6442
- Joined: Tue Jan 09, 2007 12:31 am
- Location: PA USA
- Full name: Louis Zulli
Re: Static Code Analysis
My guess is -ffast-math might be helpful for code with many floating point ops, which you probably don't have.jsgroby wrote:
-O3 compiler optimization resulted in 10111ms for a perft 6
-O3 -ffast-math optimization resulted in 10149ms for a perft 6
Position used was starting FEN,
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
As Martin said, this is not really a key issue in the big picture. But you might try letting the compiler do some profile-guided optimization for you. Assuming you are using gcc, you can follow the steps in my model below:
Code: Select all
ProcyonLeo: ~/Documents/Chess/Kirby] gcc -Wall -O3 -fprofile-generate -o perft Perft.c
ProcyonLeo: ~/Documents/Chess/Kirby] ./perft
FEN string = rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
Depth = 6
Leaf nodes = 119060324
Time taken = 8069 ms
ProcyonLeo: ~/Documents/Chess/Kirby] gcc -Wall -O3 -fprofile-use -o perft Perft.c
ProcyonLeo: ~/Documents/Chess/Kirby] ./perft
FEN string = rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
Depth = 6
Leaf nodes = 119060324
Time taken = 5442 ms
Code: Select all
ProcyonLeo: ~/Documents/Chess/Kirby] gcc -Wall -O3 -o perft Perft.c
ProcyonLeo: ~/Documents/Chess/Kirby] ./perft
FEN string = rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
Depth = 6
Leaf nodes = 119060324
Time taken = 6394 ms
-
- Posts: 83
- Joined: Mon Mar 24, 2014 12:26 am
- Location: Glen Carbon, IL USA
Re: Static Code Analysis
Thanks Louis. I will indeed try that.zullil wrote:My guess is -ffast-math might be helpful for code with many floating point ops, which you probably don't have.jsgroby wrote:
-O3 compiler optimization resulted in 10111ms for a perft 6
-O3 -ffast-math optimization resulted in 10149ms for a perft 6
Position used was starting FEN,
rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
As Martin said, this is not really a key issue in the big picture. But you might try letting the compiler do some profile-guided optimization for you. Assuming you are using gcc, you can follow the steps in my model below:For comparison, here's without the PGO:Code: Select all
ProcyonLeo: ~/Documents/Chess/Kirby] gcc -Wall -O3 -fprofile-generate -o perft Perft.c ProcyonLeo: ~/Documents/Chess/Kirby] ./perft FEN string = rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - Depth = 6 Leaf nodes = 119060324 Time taken = 8069 ms ProcyonLeo: ~/Documents/Chess/Kirby] gcc -Wall -O3 -fprofile-use -o perft Perft.c ProcyonLeo: ~/Documents/Chess/Kirby] ./perft FEN string = rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - Depth = 6 Leaf nodes = 119060324 Time taken = 5442 ms
Code: Select all
ProcyonLeo: ~/Documents/Chess/Kirby] gcc -Wall -O3 -o perft Perft.c ProcyonLeo: ~/Documents/Chess/Kirby] ./perft FEN string = rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - Depth = 6 Leaf nodes = 119060324 Time taken = 6394 ms
Jeff
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Static Code Analysis
I have used oprofile (http://oprofile.sourceforge.net/about/), which works on both AMD & Intel systems, and VTune (https://software.intel.com/en-us/intel- ... plifier-xe), which is Intel only (there is a free version for non-commercial use on Linux: it is costly for Windows, although there is a free trial). These are very helpful although they are complex tools. You can use them for just the basics (finding code execution hotspots) but anything beyond that will require a good knowledge of modern CPU and PC system architecture.
--Jon
--Jon