After you generate each move, you assign a score to them as well. After that you simply pick the best move using std::max_element, swap the last move from the move list with this best move, and decrease the size of the move list by 1.
thanks, everything is more clear now.
also, after reading a bit of code from olithink https://github.com/olithink/OliThink/bl ... ink.c#L987, i understand why we generate the moves separately. if we get a beta cutoff when trying captures we completely avoid the generation of the other moves
tcusr wrote: ↑Wed Nov 17, 2021 8:29 pm
alright thanks.
but i still don't get why i should generate the quiets and captures separately, is it because it's more convenient to score them? https://github.com/maksimKorzh/chess_pr ... rch.c#L263 if use this method to sort the next move like you said it doesn't matter if the move is right after our current index or is the last one of the list, it still needs to go to the end of the list
Well, the captures usually need different treatment that the non-captures. E.g. their sort key would be calculated in a different way (e.g. MVV/LVA versus history). And if you prevent them from getting mixed in the piece list, you don't have to search for captures in the entire list. If you generate captures and non-captures in different loops, you can tailor those loops for what you want done with them. If they are treated in the same loop, you would have to figure out whether you are dealing with a capture or non-capture for each move individually.
tcusr wrote: ↑Wed Nov 17, 2021 8:29 pm
alright thanks.
but i still don't get why i should generate the quiets and captures separately, is it because it's more convenient to score them? https://github.com/maksimKorzh/chess_pr ... rch.c#L263 if use this method to sort the next move like you said it doesn't matter if the move is right after our current index or is the last one of the list, it still needs to go to the end of the list
Well, the captures usually need different treatment that the non-captures. E.g. their sort key would be calculated in a different way (e.g. MVV/LVA versus history). And if you prevent them from getting mixed in the piece list, you don't have to search for captures in the entire list. If you generate captures and non-captures in different loops, you can tailor those loops for what you want done with them. If they are treated in the same loop, you would have to figure out whether you are dealing with a capture or non-capture for each move individually.
thanks, you've cleared another doubt. one of the engines i use as a reference does exactly what you say (https://github.com/algerbrex/blunder/bl ... ch.go#L568) but i suppose this is a matter of preference to keep things simple.
Note that the method for sorting the captures that I used in the mailbox trials can also be used in less advanced designs, to save on sorting time. You just take two static tables attackerTable[256], victimTable[256], which you initialize with all possible (attacker, victim) combinations, in MVV/LVA order. Each capture can then be represented by the index in those tables. You then associate each index value with a bit in a 256-bit data set, which initially is cleared. For each capture you generate, you set the corresponding bit to 1, rather than writing it in some move-list array. You can then use a bit-extraction loop for retrieving the captures without further sorting.
tcusr wrote: ↑Wed Nov 17, 2021 8:57 pm
also, after reading a bit of code from olithink https://github.com/olithink/OliThink/bl ... ink.c#L987, i understand why we generate the moves separately. if we get a beta cutoff when trying captures we completely avoid the generation of the other moves
Very well said.
First (1) the move from Hash/TTTable will be used. If this gets a bet cutoff, there is no need to generate more moves.
Second (2) noisy moves (typically captures and promotions are being generated). They often get cutoffs, so the generation of
Third(3) quiet moves is not necessary.
tcusr wrote: ↑Wed Nov 17, 2021 8:57 pm
also, after reading a bit of code from olithink https://github.com/olithink/OliThink/bl ... ink.c#L987, i understand why we generate the moves separately. if we get a beta cutoff when trying captures we completely avoid the generation of the other moves
Very well said.
First (1) the move from Hash/TTTable will be used. If this gets a bet cutoff, there is no need to generate more moves.
Second (2) noisy moves (typically captures and promotions are being generated). They often get cutoffs, so the generation of
Third(3) quiet moves is not necessary.
my idea is only fast for the generation of quiet moves, i guess it's time abandon it now and go back to magic bitboards
Indeed, simple mailbox is very fast for generating non-captures, but when you want to generate only captures it is hardly faster than generating all moves. Sliders move in such a way that you basically get all non-captures as a free side effect when traveling towards the piece to be captured. The problem of course is that in an engine you are only interested in generating captures in about 85% of the nodes, because these are QS nodes.
This problem can of course be solved through the use of a neighbor table. Then generation of a slider capture doesn't need to examine any intermediate square along the ray, but directly goes to the target through
You would still have to test whether there is an enemy piece on the toSquare, though: it could also be a friendly piece, or off board. But at least for generating captures by a rook you would only have to examine 4 squares, no matter how far away those targets are. The price is of course that you have to update the neighbor table before you start to generate moves. But on a capture (~95% of all moves!) that is not very hard, as only a single square gets evacuated for those. Then you only have to make the neighbors of that evacuated square refer to each other.
hgm wrote: ↑Fri Nov 19, 2021 10:15 am
Indeed, simple mailbox is very fast for generating non-captures, but when you want to generate only captures it is hardly faster than generating all moves. Sliders move in such a way that you basically get all non-captures as a free side effect when traveling towards the piece to be captured. The problem of course is that in an engine you are only interested in generating captures in about 85% of the nodes, because these are QS nodes.
This problem can of course be solved through the use of a neighbor table. Then generation of a slider capture doesn't need to examine any intermediate square along the ray, but directly goes to the target through
You would still have to test whether there is an enemy piece on the toSquare, though: it could also be a friendly piece, or off board. But at least for generating captures by a rook you would only have to examine 4 squares, no matter how far away those targets are. The price is of course that you have to update the neighbor table before you start to generate moves. But on a capture (~95% of all moves!) that is not very hard, as only a single square gets evacuated for those. Then you only have to make the neighbors of that evacuated square refer to each other.
The incremental nature of chess makes sliding lookup a 100% nop. I have a version where the Seemap and Attackmap are different. On normal move only ever 2 squares change - and you only need to update the seemap+atkmap of the pieces that see those squares.
That is slower for a full movegen - but for 95% attacks it could be very much faster.
An interesting side effect is that pawns that are stuck dont even need a single instruction wasted on them to confirm they are still stuck.
The "sensitivity list" of a pawn and all other pieces never gets activated when the change is not visible to them.
hgm wrote: ↑Fri Nov 19, 2021 10:15 am
Indeed, simple mailbox is very fast for generating non-captures, but when you want to generate only captures it is hardly faster than generating all moves. Sliders move in such a way that you basically get all non-captures as a free side effect when traveling towards the piece to be captured. The problem of course is that in an engine you are only interested in generating captures in about 85% of the nodes, because these are QS nodes.
This problem can of course be solved through the use of a neighbor table. Then generation of a slider capture doesn't need to examine any intermediate square along the ray, but directly goes to the target through
You would still have to test whether there is an enemy piece on the toSquare, though: it could also be a friendly piece, or off board. But at least for generating captures by a rook you would only have to examine 4 squares, no matter how far away those targets are. The price is of course that you have to update the neighbor table before you start to generate moves. But on a capture (~95% of all moves!) that is not very hard, as only a single square gets evacuated for those. Then you only have to make the neighbors of that evacuated square refer to each other.
i really like this idea but i swear i can't wrap my head on how to make it work, especially with bitboards
btw i think the solution has been hiding under my nose all along. in the first post of this thread i linked a thread where a guy wrote a function to get sliding attacks on the fly that was as fast as kindergarten bitboards. olithink uses kindergarten bitboards and is rated 2900. this is proof that it's fast enough
i modified the code bit to use lookup tables instead of calculating the rank/file masks on the fly, they are small so it's not a problem
#include <cstdint>
// kindergarten bitboards from olithink
#define u64 uint64_t
#define RATT1(f) rays[(f << 6) | key000(BOARD, f)]
#define RATT2(f) rays[(f << 6) | key090(BOARD, f) | 0x1000]
inline int key000(u64 b, int f) {
return (int) ((b >> ((f & 56) + 1)) & 0x3F);
}
inline int key090(u64 b, int f) {
u64 _b = (b >> (f&7)) & 0x0101010101010101LL;
return (int)((_b * 0x0080402010080400LL) >> 58);
}
extern u64 rays[], BOARD;
u64 ratt(u64 b, int f) {
return RATT1(f) | RATT2(f);
}
// on the fly method
inline uint64_t slide_arithmetic(uint64_t piece, uint64_t line, uint64_t block) {
// mask blockers
block = block & ~piece & line;
// split the line into upper and lower rays
uint64_t bottom = piece-UINT64_C(1);
// for the upper part we can use the x^(x-1) trick to fill in from the bottom
uint64_t masked_up = block & ~bottom;
uint64_t blocked_up = (masked_up ^ (masked_up-UINT64_C(1)));
// for the bottom we use CLZ + a shift to fill in from the top
uint64_t masked_down = block & bottom;
uint64_t blocked_down = (UINT64_C(0x7FFFFFFFFFFFFFFF) >> __builtin_clzll(masked_down|UINT64_C(1)));
// the intersection of the two is the move set after masking with the line
return (blocked_up ^ blocked_down) & line;
}
extern u64 rookfile[], rookrank[];
uint64_t ratt2(int square, uint64_t blockers) {
return
slide_arithmetic(1ull<<square, rookfile[square], blockers) ^
slide_arithmetic(1ull<<square, rookrank[square], blockers);
}
Well, it is not meant to work with bitboards, because it is not needed for bitboards. There is no penalty there for generating both captures and non-captures when you need only the latter, as all moves are generated simultaneously. And by ANDing with the opponent bitboard you are left with the captures, and that takes only a single instruction. The neighbor table is just an aid to do efficient capture-only generation with a mailbox board representation.
There is hardly any relation between speed and strength, BTW. An engine can be 1000 times slower, and still 1000 Elo stronger.