Just a little feedback about this experiment I did in Minic. Short answer : I didn't succeed.
Long answer, I try a staged move generator working like this :
* return TT move (without validating it, in Minic it cannot be wrong expect in case of a real hash collision)
* Gen Cap and sort them (mvv-lva + see)
* return Good Cap
* return killers (validating them as suggested here)
* Gen quiet and sort them (history + pst)
* return quiet
In fact it cost me too much lines and involve (in my opinion) to much management. Indeed, using the HQ bitboard generator, generating all moves or only captures does not impact much cpu consomption, I kept this possibility as a tool for qsearch anyway (to avoid an expensive if isCapture inside the moves loop in qsearch). The only things that can be gain are :
- the TT move leads to a cutoff before generating moves list : I kept that without using a staged move generator
- a good capture leads to a cutoff : but the time to sort them (including SEE) is by far the most expensive task in sorting move ...
- a killer leads to a cutoff : so that quiet moves are not generated and sorted, but killers has to be validated.
So at the end, I only kept :
- the possibility to generate only cap (used in qsearch)
- try the TT move first without generating anything
What I probably still miss is :
- generating checks and checks evasions only to be used in qsearch (does someone knows how it impact strength ?, and an efficient way of doing this without incremental attack update?)
- sort capture really using the SEE value, so that bad capture are well sorted, for now my SEE only return true/false meaning if it is a good or bad cap
Staged move generation and killers
Moderators: hgm, Rebel, chrisw
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Staged move generation and killers
When I use staged move generation the main advantage comes from also staging the generation of captures themselves: I generate them by value group of the victim (i.e. Queens, Rooks, minors, Pawns). So if the capture of a Queen provides a cutoff, I never get to generating and sorting captures on Rooks, minors and Pawns.
This also strongly reduces the effort taken by sorting: often value groups consist of just a single victim, with just one or perhaps two attackers, and the moves then only have to be sorted by attacker value. (And only if there is more than one.)
Validation of killers should not be that costly, in terms of CPU time. But it will require some dedicated code that serves no other purpose.
I very much doubt if sorting bad captures by SEE will buy you anything. For one, bad captures will almost always fail low, so you would have to search them all anyway. They only stand a chance of being cut moves when there are side effects that SEE cannot recognize, and this doesn't correlate that well with the actual SEE value. So in the rare case that any of the bad captures was good, searching them by SEE value will probably not help you getting the good one any quicker, on average.
This also strongly reduces the effort taken by sorting: often value groups consist of just a single victim, with just one or perhaps two attackers, and the moves then only have to be sorted by attacker value. (And only if there is more than one.)
Validation of killers should not be that costly, in terms of CPU time. But it will require some dedicated code that serves no other purpose.
I very much doubt if sorting bad captures by SEE will buy you anything. For one, bad captures will almost always fail low, so you would have to search them all anyway. They only stand a chance of being cut moves when there are side effects that SEE cannot recognize, and this doesn't correlate that well with the actual SEE value. So in the rare case that any of the bad captures was good, searching them by SEE value will probably not help you getting the good one any quicker, on average.
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
Re: Staged move generation and killers
I like that ! In Minic I use a pseudo "GeneratorPhase" enum this wayhgm wrote: ↑Mon Dec 31, 2018 11:02 am When I use staged move generation the main advantage comes from also staging the generation of captures themselves: I generate them by value group of the victim (i.e. Queens, Rooks, minors, Pawns). So if the capture of a Queen provides a cutoff, I never get to generating and sorting captures on Rooks, minors and Pawns.
Code: Select all
BitBoard bb = pf[ptype-1](from, p.occupancy, p.c) & ~myPieceBB;
if (phase == GP_cap) bb &= oppPieceBB; // only target opponent piece
else if (phase == GP_quiet) bb &= ~oppPieceBB; // do not target opponent piece
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Staged move generation and killers
I don't think there's much to win anyway. Just run a perft 6 on the initial position with make/unmake in the leaves. And that is already a cautious estimation because when you get cutoffs in real search, make/unmake will not be done for the remaining nodes so that the actual possible gain is even smaller. Then run a full search analysis.
You'll probably find that the full search node rate is around 15% of the perft rate. Even if you ignore the effort in make/unmake and assume that the move generator is the bottleneck in perft, making the move generator twice as efficient by not generating moves that won't be executed, you'd gain half of that 15% in speed, i.e. 7.5%. Assuming 60 Elo per doubling, we're talking about 6 Elo tops. When also considering make/unmake, maybe 3 Elo.
You'll probably find that the full search node rate is around 15% of the perft rate. Even if you ignore the effort in make/unmake and assume that the move generator is the bottleneck in perft, making the move generator twice as efficient by not generating moves that won't be executed, you'd gain half of that 15% in speed, i.e. 7.5%. Assuming 60 Elo per doubling, we're talking about 6 Elo tops. When also considering make/unmake, maybe 3 Elo.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Staged move generation and killers
Make/unmake will not be done for any leaf node in search, right? Otherwise they would not be leaf nodes, by definition. In most nodes you just generate moves (captures), to conclude that all are futile. So it seems wrong to compare with a perft that does make/unmake in the leaves; this is nothing like search.Ras wrote: ↑Mon Dec 31, 2018 1:10 pm I don't think there's much to win anyway. Just run a perft 6 on the initial position with make/unmake in the leaves. And that is already a cautious estimation because when you get cutoffs in real search, make/unmake will not be done for the remaining nodes so that the actual possible gain is even smaller.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Staged move generation and killers
Perft and regular search can't be compared for other reasons as well. So I would not try to analyze possible savings of staged move generation by comparing search to Perft. But the point of Rasmus is still valid: from a theoretic viewpoint there is not much room for a speedup by using staged move generation because move generation takes only a small portion of overall engine runtime. And yet there are engines using it successfully, so there must be more to it ...hgm wrote: ↑Tue Jan 01, 2019 9:46 amMake/unmake will not be done for any leaf node in search, right? Otherwise they would not be leaf nodes, by definition. In most nodes you just generate moves (captures), to conclude that all are futile. So it seems wrong to compare with a perft that does make/unmake in the leaves; this is nothing like search.Ras wrote: ↑Mon Dec 31, 2018 1:10 pm I don't think there's much to win anyway. Just run a perft 6 on the initial position with make/unmake in the leaves. And that is already a cautious estimation because when you get cutoffs in real search, make/unmake will not be done for the remaining nodes so that the actual possible gain is even smaller.
In my engine Jumbo I have almost completed the implementation of staged move generation recently. "Almost" because I am still struggling to manage the delayed processing of losing captures correctly without losing strength. The initial step of trying the TT move first without generating any moves was the most successful one, it gave quite some strength improvement, but all later additions like separating the generation of captures and quiet moves did not improve strength significantly (i.e., nothing outside the error bars of roughly +/- 7 Elo points in my setup).
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Staged move generation and killers
In my engine I can easily switch between staged and non staged move generation, the difference in speed or strength is negligible and disappears completely in the noise. So you can argue if it is really worth the added complexity, it is usually better to keep things as simple and stupid as possible.
Of course this has nothing to do with first trying the TT-move, which I do in both cases.
Of course this has nothing to do with first trying the TT-move, which I do in both cases.
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Staged move generation and killers
Right before that, it will. Then eval would follow. My point is that search is not like the bulk counting perft version because in search, you don't generate a move list just for counting the moves and then discard them, but you will make/unmake at least some of these moves. That's why the perft version with make/unmake is more representative. Also, I chose this optimistic assumption to figure out an upper bound of the gain.
However, if you instead argue that the optimised perft version with bulk counting is more like search, then my original argument just becomes stronger because then the amount of possible time reduction is even less. Then we aren't talking about some 10M per second perft, but rather a 100M per second, like in your optimised perft code. Then the numbers instead go like this:
Search node rate 2 Mnps, which is not 15% of the perft rate anymore, but only 2%. Now we make the move generation twice as efficient and save half of that 2%, so 1% speedup. That's even less than 1 Elo.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
-
- Posts: 27808
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Staged move generation and killers
Actually that is exactly what a search does for most of the time: generate moves just to conclude that none of those is worth searching. Leaves where no move generation is needed because stand-pat produced a cutoff are rare, because futility pruning eliminates most of those.
Your math doesn't add up. A bulk-counting perft only does ~1Mnps, not 100Mnps. When you don't make the moves you generate, move generation of course will take a larger fraction of the time, not a smaller one.
Your math doesn't add up. A bulk-counting perft only does ~1Mnps, not 100Mnps. When you don't make the moves you generate, move generation of course will take a larger fraction of the time, not a smaller one.
-
- Posts: 2488
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Staged move generation and killers
Most of the move generation is in quiescence, and making none of the moves there sounds strange. Or do you exclude high takes low captures e.g. via SEE? Of course, in QS, you only generate captures and not quiet moves (except maybe for check evasions).
My full search already does around 1.5 - 2 Mnps, and perft does around 11 already without bulk counting. But none of this changes the fact that move generation doesn't take considerable amount of time so that the potential for optimisation is not that big - with the obvious exception of only generating captures in QS. Besides, the number of nodes in QS even dominates regular search, which is another reason why staged move generation in the main search won't yield gains.A bulk-counting perft only does ~1Mnps, not 100Mnps.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net