Help Needed Interpreting Profiler Output!

Discussion of chess software programming and technical issues.

Moderators: hgm, Dann Corbit, Harvey Williamson

User avatar
Steve Maughan
Posts: 1218
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Help Needed Interpreting Profiler Output!

Post by Steve Maughan »

I decided to explore Maverick's code through the lens of a profiler. My native tongue (Delphi) isn't strong in this area, so this is new ground for me.

The output seemed a little odd and I wonder if a compiler / profiling Ninja may be able to shed some light.

Here's the output:

Image

Most of the procedures look familiar. But there are two which, as far as I know, are not in my code - these are "_mcount_private" and "__fentry__". And they are sucking up a lot of CPU juice.

Does anyone know what they are? And the obvious follow-on question, is there any way to reduce their load?

Many thanks,

Steve
User avatar
hgm
Posts: 27702
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help Needed Interpreting Profiler Output!

Post by hgm »

This will be very compiler dependent. _fentry_ sounds like a routine that would be called to allocate a new stack frame, every time you call a function / routine. It typically saves the old frame pointer (pointing at the block of local variables of the caller) on the system stack, sets the frame pointer pointing at a memory area for the local variables of the called routine on top of that stack, and adjusts the stack pointer to point after it. In addition it could save a number of CPU registers that the compiler assumes can be freely used for temporaries.

To reduce the time it consumes, you should do fewer function calls.

(Btw, your profiler output is not visible.)
User avatar
Steve Maughan
Posts: 1218
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Re: Help Needed Interpreting Profiler Output!

Post by Steve Maughan »

Thanks H.G,!

So I guess inlining would help.

Here's another go at showing the image:
Image
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Help Needed Interpreting Profiler Output!

Post by Sven »

Steve Maughan wrote:I decided to explore Maverick's code through the lens of a profiler. My native tongue (Delphi) isn't strong in this area, so this is new ground for me.

The output seemed a little odd and I wonder if a compiler / profiling Ninja may be able to shed some light.

Here's the output:

[...]

Most of the procedures look familiar. But there are two which, as far as I know, are not in my code - these are "_mcount_private" and "__fentry__". And they are sucking up a lot of CPU juice.

Does anyone know what they are? And the obvious follow-on question, is there any way to reduce their load?

Many thanks,

Steve
These two functions are obviously internal functions of gprof. "_mcount_private" definitely is, and "__fentry__" looks like that as well. gprof uses instrumentation as well as sampling, where instrumentation means that some code is added to the program to achieve the profiling functionality. That new code is not for free, you will always see a (more or less significant) overhead when using an instrumenting profiler. An overhead of more than 30% is not unusual. To obtain your "real" numbers you will have to filter out those internal functions appropriately by subtracting their total time values from the overall sum and then recalculating all relevant percent values.

Regarding your profiling results, I think that you might want to have a look at your legality check implementation. Your numbers were from profiling "perft" so they are not fully representative for an actual search (which would be most relevant), but one can already see that your legality checking (presumably covered by a function named like "is_in_check_after_move()") consumes almost as much time as the whole move generation.

So my advice would be:
1. Do everything that needs to be done to get a stable engine first, of course.
2. Then do everything to get a strong engine ...
3. And then, when you actually want to start optimizing the performance of your engine, please consider to improve your legality checking as follows.

a) You can safely skip legality testing of pseudo-legal moves that belong to none of these categories:
- check evasions,
- king moves (including castles),
- en passant captures,
- moves of a pinned piece.

Only these can ever be illegal, and statistically they are quite rare in total, so you can actually save a lot of redundant legality testing.

b) For moves from the categories above, it is not always necessary to perform a "makeMove - legality test - unmakeMove" sequence, it can be possible to do a "static" legality test as well that saves the make/unmake overhead.

Please note, however, that any optimization on that level, especially in the area of basic operations like move generation/legality checking, tends to produce the following two typical outcomes:

1. introduction of new bugs,
2. a performance improvement of only very few percent for the real search.

One reason for the latter is that move generation itself usually consumes only a small fraction of the overall search time (10% or less might be a typical number) so any relative improvement of that component gets divided by a factor of 10 or more. Compare 10/100 to 8/98, that is what you can expect from a 20% speedup of the move generator. (All numbers are just examples.)

Sven
User avatar
hgm
Posts: 27702
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Help Needed Interpreting Profiler Output!

Post by hgm »

When I paste the image link directly in the browser, it says this:

The image “https://www.evernote.com/shard/s5/sh/bf ... skitch.png” cannot be displayed, because it contains errors.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Help Needed Interpreting Profiler Output!

Post by Sven »

hgm wrote:When I paste the image link directly in the browser, it says this:

The image “https://www.evernote.com/shard/s5/sh/bf ... skitch.png” cannot be displayed, because it contains errors.
With firefox I do not get such a message. But when I opened this thread for the first time I was unable to see the image as well (therefore I tried the same as you did). Now everything seems to be fine.

Sven
User avatar
Steve Maughan
Posts: 1218
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Re: Help Needed Interpreting Profiler Output!

Post by Steve Maughan »

Hmmmm, then it looks like Skitch may not be the best way to post images into TalkChess.

Thanks,

Steve
User avatar
Steve Maughan
Posts: 1218
Joined: Wed Mar 08, 2006 8:28 pm
Location: Florida, USA

Re: Help Needed Interpreting Profiler Output!

Post by Steve Maughan »

Hi Sven,
Sven Schüle wrote:These two functions are obviously internal functions of gprof. "_mcount_private" definitely is, and "__fentry__" looks like that as well. gprof uses instrumentation as well as sampling, where instrumentation means that some code is added to the program to achieve the profiling functionality. That new code is not for free, you will always see a (more or less significant) overhead when using an instrumenting profiler. An overhead of more than 30% is not unusual. To obtain your "real" numbers you will have to filter out those internal functions appropriately by subtracting their total time values from the overall sum and then recalculating all relevant percent values.
Oh! That makes sense!
Sven Schüle wrote:(...) I think that you might want to have a look at your legality check implementation.(...)
Yes - this was my conclusion. I thought my legality check was good - it's obviously terrible! The problem seem to be I'm checking to see if the piece is pinned from scratch after every move. Of course it will be much faster to generate the pins and store this as a bitboard. Then use this bitboard as the first port-of-call in the legality checker. This should speed things up.

Many thanks!!

Steve
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Help Needed Interpreting Profiler Output!

Post by Sven »

Steve Maughan wrote:The problem seem to be I'm checking to see if the piece is pinned from scratch after every move. Of course it will be much faster to generate the pins and store this as a bitboard. Then use this bitboard as the first port-of-call in the legality checker. This should speed things up.
Yes, you can do that and you should do that. And no, I don't think this will help significantly. According to your numbers checking for pins does not take much time. But look at the huge number of calls of is_in_check_after_move(), it is called 2594 million times while make_move() is called 82 million times, so roughly 32 calls per make_move(). That tells me that you test all pseudo-legal moves for legality by internally making them (not with make_move() but with something smaller, I suppose), then checking whether the king was left in check, then unmaking them again. And that is what I meant in my other post: you don't need to do that for most of the moves.

Example:

Code: Select all

bool is_legal_move(move) {
    if (is_check_evasion || is_king_move || is_ep_capture || is_pinned_piece_move) {
        make_internal(move);
        bool is_legal = !is_in_check_after_move();
        unmake_internal(move);
        return is_legal;
    } else {
        return true;
    }
}
Of course a real implementation is a bit more complex but not much.

Your version does something like this per node:
32x make_internal()
32x is_in_check_after_move()
32x unmake_internal()

The version above would to something like this per node, for some N that is much smaller than 32:
32x is_legal_move()
Nx make_internal()
Nx is_in_check_after_move()
Nx unmake_internal()

so it saves quite a bit.

Sven
spambanane
Posts: 22
Joined: Sun Jun 17, 2012 9:45 am

Re: Help Needed Interpreting Profiler Output!

Post by spambanane »

I can't see the linked image in my browser, so I had to use wget to download the image, which worked. Better use imageshack which is known to work. And even better is to post the table as plain text.

back to topic:
First let me say that I use a 15x12 board, so maybe some things I write are not true for bitboards.

As pinned pieces are a rarity I check for pins from scratch every time. I start with checking if king and pin candidate are on a ray. Most of the times this is not the case and I'm done.

For legality checking of moves I distinguish between positions in check and not in check. When in check I use a check evasion move generator which returns legal moves only. Most of the times the position is not in check, and I use a dedicated IsLegalMoveWhileNotInCheck() (yes, i know, this name is long and sucks) function to check if a move is legal. I do that *before* MakeMove(), even if the king moves. King moves are still special, i have to check if the target square is under attack, which is costly. Anything else - even e.p. - can be checked explicitly by examining rays from king to find attackers. You won't gain a lot with this approach because most of the times your move will not leave the king in check anyway.

And for using gprof (and the question if it really makes sense) found an interesting article here:

http://stackoverflow.com/questions/1777 ... 43#1779343