Thanks Sven for this detailed answer! I really appreciate it.
Since I'm new to chess programming, I'm apparently too obsessed with speed. Meanwhile, I guess I need to spend more time building the actual search functions than tweaking perft.
Perft search speed bottleneck
Moderators: hgm, Dann Corbit, Harvey Williamson
-
JarvisXerxes
- Posts: 20
- Joined: Thu May 02, 2013 5:52 pm
- Location: United States
Re: Perft search speed bottleneck
Life is a statistical distribution.
-
syzygy
- Posts: 5554
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Perft search speed bottleneck
Given that your approach to perft is so different than that of qperft, I don't think you should be bothered at all by the difference in speed.JarvisXerxes wrote:Thanks a lot to HGM for your detailed explanation. This clarifies a lot of mysteries. I'm glad to listen to advice directly from qperft's author.
I'd try out the proposed optimizations and restructure my movegen.
When you do an alpha/beta search, you will need to make at least all legal moves towards the leaf nodes. Checking for legality before making the move speeds up perft enormously, but will only marginally, if at all, pay off in a regular search (since you'll end up making most moves anyway). It might be worth it to give it a try at some point, but if you're just starting with your engine there are far easier ways to make progress.
-
hgm
- Posts: 27703
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Perft search speed bottleneck
It depends how you accomplish it. If you generate strictly legal moves by first generating all pseudo-legal moves, and then testing them for legality, it throws away the baby with the bath water, because you will be testing all sorts of moves for legality that you will never play because of beta cutoffs. OTOH, if you really only generate legal moves, it saves you time by not having to generate illegal ones (as with the pin-detection).JarvisXerxes wrote:So will the search speed drop if I generate strictly legal moves only? If not, I'd prefer to verify inside moveGen(). Then I don't need to test legality every time I make a move.
-
syzygy
- Posts: 5554
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Perft search speed bottleneck
Then you were not using a dedicated evasion generator when in check.Chan Rasjid wrote:Illegal moves that leave one's own king open to slider attack is not rare. I measured it. In late-middle or endgame, the percentage of illegal psuedo-legal moves may be 30/40%.
-
Chan Rasjid
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Perft search speed bottleneck
I think it is best to follow what the top experts(Ippolit, komodo) are doing in storing moves - done right at the very start and you never ever need to make changes to this part of your code:JarvisXerxes wrote: My move representation follows the common approach: a dozen bits for the from/to squares, a dozen bits for the captured/moving pieces and a few more for special flags (castling, EP capture).
When I pass a Move object (basically a 32-int) to make/unmake(), they simply call get() methods on the move object to obtain relevant info.
1) store move with an integer.
2) lower 12 bits for to/from square.
3) next 3 bits for promote type - if any; or it means not a promotion move. Komodo has minor other uses for these 3 bits; pawn push x2 ?
4) remaining 1 bit of lower 16 bits - unused; or for later use to set checking moves when info available.
5) upper 16 bits for other use - order scores for capture moves; or unused for quiet moves.
I you think a little, you would understand why most mature programs follow what's above and nothing more. There is no need to store castling/ep/captured piece, etc as those information are available when we come to making the moves; or we usually have to get those info before making the moves. Undoing a captured piece comes from a stack history where we store the captured piece type.
I can suggest an alternative to what Lucas is doing where his movegen() generates fully legal moves (may be some minor exceptions).
I seem to recall looking into the movegen() of Ippolit, a very top engine, and found it does only a plain generation of raw pseudo-legal moves - nothing else. So I believe that validation is done (before make) by calling an external routine is_move_legal(pseudo_move).
I can understand why it is so. We call movegen() probably millions of times per second. But very often, only the first few moves are tried and search has a cutoff; the rest of the moves are discarded. So we try not to do any pre-calculation on the move list.
Within the routine is_move_valid(), if it detects that the move is invalid (when our king is exposed to sliding attack), we can do the following. We now have the sliding bitboard attack map from our king that hits an enemy slider; save that attack map. Scan the the move list forward (you could pass a pointer to the move as an argument). Any moves with the same from-square where the to-square falls outside of the valid 'blocking' ray means that that move is invalid and we can remove it from the list (swap to tail end). This way, we don't need pre-calculation.
Best Regards,
Rasjid.
-
hgm
- Posts: 27703
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Perft search speed bottleneck
I always felt it wast just the other way around. When you have an engine, you generally won't make any moves of those you generate in the leaf nodes. (If you did, they would not be leaf nodes...)syzygy wrote:When you do an alpha/beta search, you will need to make at least all legal moves towards the leaf nodes. Checking for legality before making the move speeds up perft enormously, but will only marginally, if at all, pay off in a regular search (since you'll end up making most moves anyway). It might be worth it to give it a try at some point, but if you're just starting with your engine there are far easier ways to make progress.
So by making the final ply in a perft you destroy all relationship between the perft and the efficiency of the engine on which it was based. You would be optimizing MakeMove() rather than MoveGen(), while an engine spends much more time in MoveGen() than in MakeMove().
The mistake that most people make is to identify the perft count with the number of leaves that is searched. The leaves of a perft(N) are really the nodes at the N-1st ply.
-
Chan Rasjid
- Posts: 588
- Joined: Thu Mar 09, 2006 4:47 pm
- Location: Singapore
Re: Perft search speed bottleneck
I have a dedicated evasion generator. But I don't have a very accurate calculator for adding numbers.syzygy wrote:Then you were not using a dedicated evasion generator when in check.Chan Rasjid wrote:Illegal moves that leave one's own king open to slider attack is not rare. I measured it. In late-middle or endgame, the percentage of illegal psuedo-legal moves may be 30/40%.
I too was surprised with the 30/40 % invalid moves. Right now, I cannot do any confirmation as I am making changes to my program and it cannot run.
Rasjid.
-
hgm
- Posts: 27703
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Perft search speed bottleneck
Actually I always found it much more efficient to not use difficult-to-unpack layouts in the move format. I.e. uses the low-order 16 bits as 8 + 8 for from and to-square, so they can be accessed by simple byte accesses, rather than load + shift + mask. As 8 bits is usually more than you need to identify a square, you will have many off-board codes available to encode 'exceptions', which would trigger special code in MakeMove() for moves that would not follow the simple move+replace pattern, but have side effects (promotion, castling, e.p., double-push) that have to be handled. The latter group of moves is so rare that it hardly matters in what way you implement these side effects; the only thing that impacts efficiency is how fast you can determine that there are no such effects (i.e. the common case). And for almost any board format there would be extremely fast tests to see if the square is off-board (e.g. sq>63 for 8x8 or bitboard, sq&0x88 for 0x88, victim==GUARD for border-guard boards, etc.).Chan Rasjid wrote:I think it is best to follow what the top experts(Ippolit, komodo) are doing in storing moves - done right at the very start and you never ever need to make changes to this part of your code:
1) store move with an integer.
2) lower 12 bits for to/from square.
3) next 3 bits for promote type - if any; or it means not a promotion move. Komodo has minor other uses for these 3 bits; pawn push x2 ?
4) remaining 1 bit of lower 16 bits - unused; or for later use to set checking moves when info available.
5) upper 16 bits for other use - order scores for capture moves; or unused for quiet moves.
-
Steve Maughan
- Posts: 1218
- Joined: Wed Mar 08, 2006 8:28 pm
- Location: Florida, USA
Re: Perft search speed bottleneck
Hi Jim,
You might find this post interesting:
http://www.chessprogramming.net/compute ... rft-speed/
I had a significant breakthrough why I found the pinned pieces at the start of my GenMoves routine. I then checked for legality before I call MakeMove. As others have pointed out, you only need to do this check of the piece is pinned or the move is a king or en-passant move. This was a win for me.
Steve
You might find this post interesting:
http://www.chessprogramming.net/compute ... rft-speed/
I had a significant breakthrough why I found the pinned pieces at the start of my GenMoves routine. I then checked for legality before I call MakeMove. As others have pointed out, you only need to do this check of the piece is pinned or the move is a king or en-passant move. This was a win for me.
Steve
-
spambanane
- Posts: 22
- Joined: Sun Jun 17, 2012 9:45 am
Re: Perft search speed bottleneck
hi harm geert, i always wonder how you work efficiently with byte values. i tried to make my data structure more dense, but i always end up with slower code, due to the fact that loading byte values into an 32 bit register is slow (just a guess, you know better probably.). so i use int32 most of the times. how do you (and others) deal with that? any resources about this matter in the web you can recommend?hgm wrote:Actually I always found it much more efficient to not use difficult-to-unpack layouts in the move format. I.e. uses the low-order 16 bits as 8 + 8 for from and to-square, so they can be accessed by simple byte accesses, rather than load + shift + mask.
(i don't want to hijack this topic, in case i have more questions i will open a new topic.)