Minic is twice as slow as others 2800-3000 elo engines.
Minic is copy-make (not make unmake) and use HQBB not magic.
Moreover, Minic is sorting all moves as soon as TT move is not a beta cut-off, and this is quite slow due to detection of bad capture (SEE).
When profiling it, eval is the first consuming spot, with 15%, then is piece movement generation (for move generation and attack) with 12% and then comes move sorting with 10%. Then are things like use of TT, pvs and qsearch, ...
I tried profiling with linux perf, gprof, with visual studio, but it feels like results are not totally "real" to me.
Being twice as fast at this level is around what ? 45elo ? (this is what I get with 2 threads).
Do you think I shall go for make-unmake and magic BB or try to improve move ordering and pruning with stuff like counter move history?
Minic raw speed
Moderators: hgm, Rebel, chrisw
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
-
- Posts: 2487
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Minic raw speed
You could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).
I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.
That could about halve the time spent in sorting.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
Re: Minic raw speed
ok but what is expensive is not the sort algorithm itself, it is more the computation of the move score. So if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)Ras wrote: ↑Sat Nov 09, 2019 12:38 pmYou could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).
I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.
That could about halve the time spent in sorting.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Minic raw speed
The only scoring effort that can have performance impact is most probably SEE. Do you apply SEE to all captures, or only to "losing" captures?xr_a_y wrote: ↑Sat Nov 09, 2019 4:21 pmok but what is expensive is not the sort algorithm itself, it is more the computation of the move score. So if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)Ras wrote: ↑Sat Nov 09, 2019 12:38 pmYou could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).
I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.
That could about halve the time spent in sorting.
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
Re: Minic raw speed
That sounds like a weird question to me
I define which capture is good or bad using SEE, how can I know a capture is good before SEE ?.
-
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Minic raw speed
If the moving piece is of equal or lower value compared to the victim (e.g. QxQ, BxR) then you can save SEE since the SEE score would almost never be <0. So "losing capture" in this case is a cheap pre-classification of captures, not a result returned from SEE.
Sven Schüle (engine author: Jumbo, KnockOut, Surprise)
-
- Posts: 1871
- Joined: Sat Nov 25, 2017 2:28 pm
- Location: France
Re: Minic raw speed
Let's try
-
- Posts: 2487
- Joined: Tue Aug 30, 2016 8:19 pm
- Full name: Rasmus Althoff
Re: Minic raw speed
You could use MVV/LVA at this stage because that's cheap, pick the highest one of these and see whether you get a cut-off. If not, you can score and sort the remaining ones via SEE.
Rasmus Althoff
https://www.ct800.net
https://www.ct800.net
-
- Posts: 3232
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Minic raw speed
How would know that a capture is "losing" without calculating its SEE ?Sven wrote: ↑Sat Nov 09, 2019 7:14 pmThe only scoring effort that can have performance impact is most probably SEE. Do you apply SEE to all captures, or only to "losing" captures?xr_a_y wrote: ↑Sat Nov 09, 2019 4:21 pmok but what is expensive is not the sort algorithm itself, it is more the computation of the move score. So if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)Ras wrote: ↑Sat Nov 09, 2019 12:38 pmYou could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).
I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.
That could about halve the time spent in sorting.
Edit: Just saw your answer on the other post. Yes, of course, I do that, it's a basic SEE optimisation.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
- Posts: 1563
- Joined: Thu Jul 16, 2009 10:47 am
- Location: Almere, The Netherlands
Re: Minic raw speed
Nightmare uses partial copy-make and make-unmake, for things that are small in size and time consuming to compute (like the hashes etc.) I use copy-make for other things (like updating the piece array) I use make-unmake.xr_a_y wrote: ↑Sat Nov 09, 2019 11:25 am Minic is twice as slow as others 2800-3000 elo engines.
Minic is copy-make (not make unmake) and use HQBB not magic.
Moreover, Minic is sorting all moves as soon as TT move is not a beta cut-off, and this is quite slow due to detection of bad capture (SEE).
When profiling it, eval is the first consuming spot, with 15%, then is piece movement generation (for move generation and attack) with 12% and then comes move sorting with 10%. Then are things like use of TT, pvs and qsearch, ...
I tried profiling with linux perf, gprof, with visual studio, but it feels like results are not totally "real" to me.
Being twice as fast at this level is around what ? 45elo ? (this is what I get with 2 threads).
Do you think I shall go for make-unmake and magic BB or try to improve move ordering and pruning with stuff like counter move history?
I sort captures in exactly in the same way as you, if the TT-move doesn't produce a beta cutoff I score all captures with SEE and sort them in 2 stages, first I sort all winning and equal captures, the losing captures are sorted at the end (if all the other moves didn't produce a beta cutoff). In quiescence I sort all captures at once (winning and losing).
Using MVV/LVA for the first capture after the TT-move helps a tiny bit, but this doesn't make a world of a difference, in fact I don't use it at all at the moment. Nightmare on a single core does between 2.5 and 4 mnps (depending upon the stage of the game), I don't know how fast others are but this doesn't seem slow to me.
I measured with __rdtsc() how many clock cycles SEE takes on average, in my case this is ~38 cycles (10 nsec on my computer), if I add the MVV/LVA trick this drops to an average of ~36 cycles, this is hardly noticeable in practice.
I don't know what HQBB is, I use pext() (or magics when I have to run on an AMD processor).
Usually I see a larger improvement with speed doubling, something like 60 Elo, it can depend upon the engine though, I have the impression that the more I prune the less influence speed has.
If your engine is twice as slow as others you should work on that, improving the branching factor will help of course, but if you can manage to double the speed it will give you another 60 Elo.