Minic raw speed

xr_a_y · Post by **xr_a_y** » Sat Nov 09, 2019 11:25 am

Minic is twice as slow as others 2800-3000 elo engines.
Minic is copy-make (not make unmake) and use HQBB not magic.
Moreover, Minic is sorting all moves as soon as TT move is not a beta cut-off, and this is quite slow due to detection of bad capture (SEE).

When profiling it, eval is the first consuming spot, with 15%, then is piece movement generation (for move generation and attack) with 12% and then comes move sorting with 10%. Then are things like use of TT, pvs and qsearch, ...

I tried profiling with linux perf, gprof, with visual studio, but it feels like results are not totally "real" to me.

Being twice as fast at this level is around what ? 45elo ? (this is what I get with 2 threads).
Do you think I shall go for make-unmake and magic BB or try to improve move ordering and pruning with stuff like counter move history?

Ras · Post by **Ras** » Sat Nov 09, 2019 12:38 pm

xr_a_y wrote: ↑Sat Nov 09, 2019 11:25 amMinic is sorting all moves as soon as TT move is not a beta cut-off

You could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).

I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.

That could about halve the time spent in sorting.

xr_a_y · Post by **xr_a_y** » Sat Nov 09, 2019 4:21 pm

Ras wrote: ↑Sat Nov 09, 2019 12:38 pm
xr_a_y wrote: ↑Sat Nov 09, 2019 11:25 amMinic is sorting all moves as soon as TT move is not a beta cut-off
You could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).

I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.

That could about halve the time spent in sorting.

ok but what is expensive is not the sort algorithm itself, it is more the computation of the move score. So if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)

Sven · Post by **Sven** » Sat Nov 09, 2019 7:14 pm

xr_a_y wrote: ↑Sat Nov 09, 2019 4:21 pm
Ras wrote: ↑Sat Nov 09, 2019 12:38 pm
xr_a_y wrote: ↑Sat Nov 09, 2019 11:25 amMinic is sorting all moves as soon as TT move is not a beta cut-off
You could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).

I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.

That could about halve the time spent in sorting.
ok but what is expensive is not the sort algorithm itself, it is more the computation of the move score. So if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)

The only scoring effort that can have performance impact is most probably SEE. Do you apply SEE to all captures, or only to "losing" captures?

xr_a_y · Post by **xr_a_y** » Sat Nov 09, 2019 7:16 pm

Sven wrote: ↑Sat Nov 09, 2019 7:14 pm The only scoring effort that can have performance impact is most probably SEE. Do you apply SEE to all captures, or only to "losing" captures?

That sounds like a weird question to me

I define which capture is good or bad using SEE, how can I know a capture is good before SEE ?.

Sven · Post by **Sven** » Sat Nov 09, 2019 7:21 pm

xr_a_y wrote: ↑Sat Nov 09, 2019 7:16 pm
Sven wrote: ↑Sat Nov 09, 2019 7:14 pm The only scoring effort that can have performance impact is most probably SEE. Do you apply SEE to all captures, or only to "losing" captures?
That sounds like a weird question to me
I define which capture is good or bad using SEE, how can I know a capture is good before SEE ?.

If the moving piece is of equal or lower value compared to the victim (e.g. QxQ, BxR) then you can save SEE since the SEE score would almost never be <0. So "losing capture" in this case is a cheap pre-classification of captures, not a result returned from SEE.

xr_a_y · Post by **xr_a_y** » Sat Nov 09, 2019 7:25 pm

Let's try

Ras · Post by **Ras** » Sat Nov 09, 2019 10:16 pm

xr_a_y wrote: ↑Sat Nov 09, 2019 4:21 pmSo if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)

You could use MVV/LVA at this stage because that's cheap, pick the highest one of these and see whether you get a cut-off. If not, you can score and sort the remaining ones via SEE.

lucasart · Post by **lucasart** » Sun Nov 10, 2019 1:37 am

Sven wrote: ↑Sat Nov 09, 2019 7:14 pm
xr_a_y wrote: ↑Sat Nov 09, 2019 4:21 pm
Ras wrote: ↑Sat Nov 09, 2019 12:38 pm
xr_a_y wrote: ↑Sat Nov 09, 2019 11:25 amMinic is sorting all moves as soon as TT move is not a beta cut-off
You could benchmark how often your first move cuts off if it isn't a TT hit, this will probably be something like 50-70%. Means, in all those cases, sorting the full move list is unnecessary because you're only going to use the topmost move as per your pre-sorting scheme (e.g. MVV/LVA, SEE or whatever).

I'm going through the move list, swapping the "best" move to the top, and try that. Only if that doesn't give a cutoff, I sort the rest of the list to avoid possible O(n^2) degradation. Note also that shellsort can be faster than quicksort for short lists like the move list of a chess engine.

That could about halve the time spent in sorting.
ok but what is expensive is not the sort algorithm itself, it is more the computation of the move score. So if I need to now which one is best (after the TT one), I need to score them all (at least captures ...)
The only scoring effort that can have performance impact is most probably SEE. Do you apply SEE to all captures, or only to "losing" captures?

How would know that a capture is "losing" without calculating its SEE ?

Edit: Just saw your answer on the other post. Yes, of course, I do that, it's a basic SEE optimisation.

Joost Buijs · Post by **Joost Buijs** » Sun Nov 10, 2019 7:54 am

xr_a_y wrote: ↑Sat Nov 09, 2019 11:25 am Minic is twice as slow as others 2800-3000 elo engines.
Minic is copy-make (not make unmake) and use HQBB not magic.
Moreover, Minic is sorting all moves as soon as TT move is not a beta cut-off, and this is quite slow due to detection of bad capture (SEE).

When profiling it, eval is the first consuming spot, with 15%, then is piece movement generation (for move generation and attack) with 12% and then comes move sorting with 10%. Then are things like use of TT, pvs and qsearch, ...

I tried profiling with linux perf, gprof, with visual studio, but it feels like results are not totally "real" to me.

Being twice as fast at this level is around what ? 45elo ? (this is what I get with 2 threads).
Do you think I shall go for make-unmake and magic BB or try to improve move ordering and pruning with stuff like counter move history?

Nightmare uses partial copy-make and make-unmake, for things that are small in size and time consuming to compute (like the hashes etc.) I use copy-make for other things (like updating the piece array) I use make-unmake.

I sort captures in exactly in the same way as you, if the TT-move doesn't produce a beta cutoff I score all captures with SEE and sort them in 2 stages, first I sort all winning and equal captures, the losing captures are sorted at the end (if all the other moves didn't produce a beta cutoff). In quiescence I sort all captures at once (winning and losing).

Using MVV/LVA for the first capture after the TT-move helps a tiny bit, but this doesn't make a world of a difference, in fact I don't use it at all at the moment. Nightmare on a single core does between 2.5 and 4 mnps (depending upon the stage of the game), I don't know how fast others are but this doesn't seem slow to me.

I measured with __rdtsc() how many clock cycles SEE takes on average, in my case this is ~38 cycles (10 nsec on my computer), if I add the MVV/LVA trick this drops to an average of ~36 cycles, this is hardly noticeable in practice.

I don't know what HQBB is, I use pext() (or magics when I have to run on an AMD processor).

Usually I see a larger improvement with speed doubling, something like 60 Elo, it can depend upon the engine though, I have the impression that the more I prune the less influence speed has.

If your engine is twice as slow as others you should work on that, improving the branching factor will help of course, but if you can manage to double the speed it will give you another 60 Elo.

Minic raw speed

Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed

Re: Minic raw speed