Compare your engine's performance: Corrections +++

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
hgm
Posts: 27859
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Compare your engine's performance: Corrections +++

Post by hgm »

lithander wrote: Thu Aug 26, 2021 5:28 pm
hgm wrote: Thu Aug 26, 2021 11:36 am My favorite methods for sorting are all O(n). This works by mapping all possible sort keys to bits in a word in the desired order, set the bits for the moves that are present, and then use bit extraction (BSF) to extract the moves.
I don't understand this. BSF gives you the index of the least significant bit? The words are probably 64bit integer? But how does that help with extracting the right move from a list of moves?
The idea is this: you create an auxiliary table to map the bit number to an index in the move list. E.g. for an MVV/LVA sort you could have a 2d array sortKey[attackerType][victimType], that contains a unique number for each combination (sortKey[PAWN][QUEEN] = 0, sortKey[MINOR][QUEEN]=1, ...).

Code: Select all

uint64_t moveSet = 0;
for(i=0; i<nrOfMoves; i++) {
  Move m = moveList[i];
  int k = sortKey[m.piece][m.victim];
  moveSet |= 1<<k;
  ptr[k] = i;
}
After that you do BSF, and if you extract bit number k, ptr[k] gives you the index of the move you must search next (i.e. move = moveList[ptr[BSF(moveSet)]]).

Real life is a bit more complex, because when you really do this by piece type you might have multiple moves with the same sortKey. And when you assign all pieces a unique number (like you would do when you use mailbox board + piece list) you would need 16x16 = 256 different sortKeys, so that moveSet no longer fits in a single word.

In the latter approach you could have an array uint64_t moveSet[4], and set the bit through moveSet[k>>6] |= 1ULL<<(k & 63). The bit extraction would then have to occur inside a loop over the four moveSet[] elements.

If there can be multiple moves with the same key (like when you would use this to sort non-captures by history, where you bin the history scores into narrow bins and use the bin number as sort key), you can put the moves into a linked list. I.e. put an extra field m.next in the move struct, which is initialized to an number that is not a valid index into the moveList (e.g. 255). You would then 'sort' the move by

Code: Select all

  moveList[i].next = ptr[k];
  ptr[k] = i;
Then the bit extraction would need an inner loop to step through the linked list of moves with equal sort key:

Code: Select all

n = ptr[k];
do {
  move = moveList[n];
  // search move
} while((n = move.next) != 255);
User avatar
lithander
Posts: 881
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Re: Compare your engine's performance: Corrections +++

Post by lithander »

That's clever!

And it's considered O(n) because BSF has constant time complexity? But only if you don't need more sortKeys than you have bits. Looping over multiple words or iterating over linked lists makes the algorithm scale a worse than linear.

But I can see how this can work well especially in the domain of chess programming. Very cool!
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
User avatar
hgm
Posts: 27859
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Compare your engine's performance: Corrections +++

Post by hgm »

lithander wrote: Fri Aug 27, 2021 3:06 pm And it's considered O(n) because BSF has constant time complexity? But only if you don't need more sortKeys than you have bits. Looping over multiple words or iterating over linked lists makes the algorithm scale a worse than linear.
No, it remains O(n). The number of words in the moveSet you need to loop over is proportional to n, if you keep the number of moves per word fixed.

Only when you work with the linked lists, and need to sort within the bin because multiple sort keys map into the same bin, this would drive up the order. But only if the average number of moves per bin (i.e. the length of the linked lists) would increase with n.

If you wanted to sort by history for a game with a thoudsand non-captures in a typical position, you could use 6400 bins, each 0.2% wide. (So that in total you span a history range of nearly 6 orders of magnitude.) Each word of moveSet would then on average contain 10 moves, so the overhead of stepping to the next word of the set is distributed over 10 moves. Most bins would contain a single move, 1 in 15 bins would contain a linked list of 2 moves. (With a potential difference in history score so small that you wouldn't bother to sort within the list.)
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Compare your engine's performance: Corrections +++

Post by klx »

Thanks for sharing, I'm wondering if a simple insertion/selection sort would not be faster when it comes to history score sorting, but would be interesting to try.
hgm wrote: Thu Aug 26, 2021 8:00 pm If there can be multiple moves with the same key (like when you would use this to sort non-captures by history, where you bin the history scores into narrow bins and use the bin number as sort key)
I'm not sure how well this binning would work in practice, I imagine the scores will not exactly be uniformly distributed, so many would end with the same key. Also would you have to resort to expensive division to ensure number of bins?
[Moderation warning] This signature violated the rule against commercial exhortations.
User avatar
hgm
Posts: 27859
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Compare your engine's performance: Corrections +++

Post by hgm »

My favorit scheme would keep track of the history score in floating-point format. While updating the history table it would also keep track of the maximum value in this table. When you would read the floating point numbers as integers, this is a reasonable approximation for the (base 2) logarithm. What I would do is subtract the history value (as int) from the maximum, and shift right to retain a number of the high bits. How many bits you preserve determines the width of the bins on the logarithmic scale. (Which you can then tune in steps of a factor 2, which should be satisfactory; if not you would need an extra multiplication before clipping off the low bits by the shift.)

If many moves end up with the same key, (or with nearlly the same key, so they end up in the same bin), this just means the linked lists for that key get quite long. But that doesn't require more time per move to iterate through them. It would only be a problem if the order of the moves within the list matters: then you would need additional sorting within the list, which will no doubt scale unvaforably with n). But if they truly have the same key, the order would not matter. And if the difference of history score is significant, it just means your bins were too wide. But I expect history scores to be a very 'noisy' measure for the probability a move would produce a cutoff.
R. Tomasi
Posts: 307
Joined: Wed Sep 01, 2021 4:08 pm
Location: Germany
Full name: Roland Tomasi

Re: Compare your engine's performance: Corrections +++

Post by R. Tomasi »

Code: Select all

ver

Pygmalion ver. 3.0
playing no game (frontend only)

set-fen 5k2/ppp2r1p/2p2ppP/8/2Q5/2P1bN2/PP4P1/1K1R4/ w - -

New position: 5k2/ppp2r1p/2p2ppP/8/2Q5/2P1bN2/PP4P1/1K1R4 w - -

debug-board

8|.....k..
7|ppp..r.p
6|..p..ppP
5|........
4|..Q.....
3|..P.bN..
2|PP....P.
1|.K.R....
-+--------
 |abcdefgh

____

Material: +7

Hash: 3db27abd86d99d79

Signature: 9|10||1|1|2|1|12|2

Player + is on the move.

debug-pvs 20

0:     +7.05469 - Qxf7
  5.00 N   in   48.0 mcs =>    104 kN/s

1:     +7.07812 - Rd8
   201 N   in    208 mcs =>    964 kN/s

2:     +8.04297 - Qb4 Re7 Qxb7
   379 N   in    302 mcs =>   1.26 MN/s

3:     +8.04297 - Qb4 Re7 Qxb7
  2.06 kN  in   1.21 ms  =>   1.70 MN/s

4:     +8.07031 - Qb4 c5 Qxb7 c4
  14.2 kN  in   8.66 ms  =>   1.64 MN/s

5:     +8.08594 - Qb4 c5 Qxb7 Bxh6 Qxa7
  57.9 kN  in   26.3 ms  =>   2.20 MN/s

6:     +10.1094 - Rd8 Ke7 Qd3 f5 Rd7 Kf6 Rxf7 Kxf7 Qxe3
   247 kN  in    117 ms  =>   2.11 MN/s

7:          +M5 - Rd8 Ke7 Qd3 b5 Qd7
  1.08 MN  in    467 ms  =>   2.32 MN/s

8:     +10.1152 - Rd8 Ke7 Qd3 f5 Ne5 Rf8 Rxf8 Kxf8 Qxe3
  5.74 MN  in   2.68 s   =>   2.14 MN/s

9:     +10.1152 - Rd8 Ke7 Qd3 f5 Ne5 Rf8 Rxf8 Kxf8 Qxe3
  27.8 MN  in   12.2 s   =>   2.27 MN/s

10:     +12.0977 - Rd8 Ke7 Rc8 Rf8 Qe4 Kf7 Rxc7 Kg8 Qxe3 Rf7 Rxf7 Kxf7 Qxa7
  81.7 MN  in   37.4 s   =>   2.19 MN/s

11:            0 - Rd8 Ke7 Rc8 Rf8 Qe4 Kf7 Rxc7 Kg8 Qxe3 Rf7 Rxf7 Kxf7 Qxa7
  47.1 MN  in   14.5 s   =>   3.24 MN/s

12:            0 - Rd8 Ke7 Rc8 Rf8 Qe4 Kf7 Rxc7 Kg8 Qxe3 Rf7 Rxf7 Kxf7 Qxa7
  60.2 MN  in   20.6 s   =>   2.93 MN/s

13:            0 - Rd8 Ke7 Rc8 Rf8 Qe4 Kf7 Rxc7 Kg8 Qxe3 Rf7 Rxf7 Kxf7 Qxa7
  74.5 MN  in   25.8 s   =>   2.88 MN/s

14:            0 - Rd8 Ke7 Rc8 Rf8 Qe4 Kf7 Rxc7 Kg8 Qxe3 Rf7 Rxf7 Kxf7 Qxa7
  91.1 MN  in   31.9 s   =>   2.86 MN/s

15:            0 - Rd8 Ke7 Rc8 Rf8 Qe4 Kf7 Rxc7 Kg8 Qxe3 Rf7 Rxf7 Kxf7 Qxa7
   112 MN  in   39.5 s   =>   2.85 MN/s

16:            0 - Rh1 Re7 Qe6 Bxh6 Qxf6 Rf7 Qh8 Ke7 Rxh6 Rf8 Rxh7
   210 MN  in   86.1 s   =>   2.44 MN/s

17:            0 - Rh1 Re7 Qe6 Bxh6 Qxf6 Rf7 Qh8 Ke7 Rxh6 Rf8 Rxh7
   328 MN  in    132 s   =>   2.48 MN/s

18:            0 - a4 Re7 Qe6 Rxe6 Rd8 Re8 Nd4 Ke7 Rxe8 Kxe8
   195 MN  in   71.1 s   =>   2.74 MN/s

19:            0 - a3 Re7 Qe6 Rxe6 Rd8 Re8 Nd4 Rxd8 Ne6 Ke7 Nxd8 Kxd8
   240 MN  in   87.1 s   =>   2.76 MN/s

20:            0 - Nh4 Re7 Qb3 Re8 Rd8 Rxd8 Qxb7 Ke8 Qxc7 Rd1
   453 MN  in    187 s   =>   2.42 MN/s

:cry:
Pygmalion in its current state does not find the mate at all with a search depth of 20. This is a very interesting position, though: obviously the output here hints at a bug in my search, judging from the +M5 score at depth 7...

Well, guess there's still some work to be done :oops: