What is the best known speed of move generation?

Discussion of chess software programming and technical issues.

Moderator: Ras

Aleks Peshkov
Posts: 951
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia
Full name: Aleks Peshkov

Re: What is the best known speed of move generation?

Post by Aleks Peshkov »

Gerd Isenberg wrote:The counter paradigma - if you can get something en-passant as a waste-product, "virtually" for free, take it - even if it is only used in let say 30% of the nodes. The idea is to re-use it for "better" move-ordering, more reliable evaluation, extension- and/or pruning decisions which are likely to reduce ebf.
Probably it is possible to keep pinning status cheap doing things incrementally, but implementations I know (Glaurung's bitboards and Muller's mailbox perft) do slow board scanning (similar to in-check function) at the beginning of each move-generation phase again and again.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What is the best known speed of move generation?

Post by hgm »

Note, however, that the board scanning is _only_ done after alignment is established between slider and King. Most of the time the effort stays limited to a scan through the slider list, (typically 5 pieces, and shrinking as the game progresses) testing for such alignment (and finding none).

I am not an expert on bitboard methods, but I imagine that the in-check test there would also involve some looping over piece types (orthogonal sliders, diagonal sliders, knights, pawns). Probably unrolled, but that doesn't matter too much.

The cost involved in looping through the slider list is so small, that I did not consider it worth trying to replace it by a differential scheme. It could be done of course, keeping a bitmap of pieces (one bit per piece) which are aligned with their opponent's King. Then you would only have to re-evaluate the status of the piece just moved, through the 0x88 test. You could extract the bits then by bitboard methods.

When there are bits extracted, the board scan would still be needed. In a bitboard engine you would probably replace that by generation of attacks sets centered on the King, based on its square mask and occupancy board, and intersect those with the corresponding slider set.

One could invert that, though: generate 'attacks sets' from the King-square and the relevant sliders, marking the rays connecting them. No alignments would give an all-zero bitboard, leading to an abort of the procedure. With alignment, intersecting the traced rays with the occupancy bitboard, and the proper magic, could then give you an index into a jump table that would discriminate the various events (in-check, blocked by a single piece, blocked by multiple pieces). As well as into tables of bitboards indicating the checker (or check ray), the pinned piece, etc.
Aleks Peshkov
Posts: 951
Joined: Sun Nov 19, 2006 9:16 pm
Location: Russia
Full name: Aleks Peshkov

Re: What is the best known speed of move generation?

Post by Aleks Peshkov »

hgm wrote:Note, however, that the board scanning is _only_ done after alignment is established between slider and King. Most of the time the effort stays limited to a scan through the slider list, (typically 5 pieces, and shrinking as the game progresses) testing for such alignment (and finding none).

The cost involved in looping through the slider list is so small, that I did not consider it worth trying to replace it by a differential scheme.
The cost is relatively small, but it is a same permanent small loop overhead with several table lookups again and again. It may (or may not dependent to compiler optimizations and other random factors) be slightly faster then pseudo-legal generation, but I cannot accept the given pin implementation as "tidy".
Gerd Isenberg
Posts: 2251
Joined: Wed Mar 08, 2006 8:47 pm
Location: Hattingen, Germany

Re: What is the best known speed of move generation?

Post by Gerd Isenberg »

hgm wrote:Note, however, that the board scanning is _only_ done after alignment is established between slider and King. Most of the time the effort stays limited to a scan through the slider list, (typically 5 pieces, and shrinking as the game progresses) testing for such alignment (and finding none).

I am not an expert on bitboard methods, but I imagine that the in-check test there would also involve some looping over piece types (orthogonal sliders, diagonal sliders, knights, pawns). Probably unrolled, but that doesn't matter too much.

The cost involved in looping through the slider list is so small, that I did not consider it worth trying to replace it by a differential scheme. It could be done of course, keeping a bitmap of pieces (one bit per piece) which are aligned with their opponent's King. Then you would only have to re-evaluate the status of the piece just moved, through the 0x88 test. You could extract the bits then by bitboard methods.

When there are bits extracted, the board scan would still be needed. In a bitboard engine you would probably replace that by generation of attacks sets centered on the King, based on its square mask and occupancy board, and intersect those with the corresponding slider set.

One could invert that, though: generate 'attacks sets' from the King-square and the relevant sliders, marking the rays connecting them. No alignments would give an all-zero bitboard, leading to an abort of the procedure. With alignment, intersecting the traced rays with the occupancy bitboard, and the proper magic, could then give you an index into a jump table that would discriminate the various events (in-check, blocked by a single piece, blocked by multiple pieces). As well as into tables of bitboards indicating the checker (or check ray), the pinned piece, etc.
With magic bitboards you get orthogonal checks by getting rook-attacks from king-square to intersect it with opposite rooks/queens. Dito with bishop-attacks and the diagonal sliders.

Code: Select all

r = rookAttacks(occupancy, kingSquare);
if ( r & oppositeRooksOrQueen )
   // in check
If you already have those rookwise attacks, one may reuse it, to get pinners. Remove own "attacked" potential pinned pieces from the occupancy and get the attack-set again:

Code: Select all

s = rookAttacks(occupancy ^ (r & ownPieces), kingSquare);
pinners = (s ^ r) & oppositeRooksOrQueen;
Thus for the pinners == 0 case - no loop at all. Otherwise you traverse the pinners to get the pinned piece(s) between king and pinner.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: What is the best known speed of move generation?

Post by bob »

hgm wrote:Note, however, that the board scanning is _only_ done after alignment is established between slider and King. Most of the time the effort stays limited to a scan through the slider list, (typically 5 pieces, and shrinking as the game progresses) testing for such alignment (and finding none).

I am not an expert on bitboard methods, but I imagine that the in-check test there would also involve some looping over piece types (orthogonal sliders, diagonal sliders, knights, pawns). Probably unrolled, but that doesn't matter too much.
Depends. chess 4.x and early versions of Crafty had two types of attack bitboards. One that gave attacks from any specific square for the piece on that square (incrementally updated as moves were made) and one that gave all squares attacking a specific square, also incrementally updated. This latter square was the key for fast detection of "in check"condition. If it is non-zero for the square your king stands on, when ANDed with all opponent pieces, your king is in check with a single AND operation...


The cost involved in looping through the slider list is so small, that I did not consider it worth trying to replace it by a differential scheme. It could be done of course, keeping a bitmap of pieces (one bit per piece) which are aligned with their opponent's King. Then you would only have to re-evaluate the status of the piece just moved, through the 0x88 test. You could extract the bits then by bitboard methods.

When there are bits extracted, the board scan would still be needed. In a bitboard engine you would probably replace that by generation of attacks sets centered on the King, based on its square mask and occupancy board, and intersect those with the corresponding slider set.

One could invert that, though: generate 'attacks sets' from the King-square and the relevant sliders, marking the rays connecting them. No alignments would give an all-zero bitboard, leading to an abort of the procedure. With alignment, intersecting the traced rays with the occupancy bitboard, and the proper magic, could then give you an index into a jump table that would discriminate the various events (in-check, blocked by a single piece, blocked by multiple pieces). As well as into tables of bitboards indicating the checker (or check ray), the pinned piece, etc.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: What is the best known speed of move generation?

Post by bob »

Gerd Isenberg wrote:
hgm wrote:Note, however, that the board scanning is _only_ done after alignment is established between slider and King. Most of the time the effort stays limited to a scan through the slider list, (typically 5 pieces, and shrinking as the game progresses) testing for such alignment (and finding none).

I am not an expert on bitboard methods, but I imagine that the in-check test there would also involve some looping over piece types (orthogonal sliders, diagonal sliders, knights, pawns). Probably unrolled, but that doesn't matter too much.

The cost involved in looping through the slider list is so small, that I did not consider it worth trying to replace it by a differential scheme. It could be done of course, keeping a bitmap of pieces (one bit per piece) which are aligned with their opponent's King. Then you would only have to re-evaluate the status of the piece just moved, through the 0x88 test. You could extract the bits then by bitboard methods.

When there are bits extracted, the board scan would still be needed. In a bitboard engine you would probably replace that by generation of attacks sets centered on the King, based on its square mask and occupancy board, and intersect those with the corresponding slider set.

One could invert that, though: generate 'attacks sets' from the King-square and the relevant sliders, marking the rays connecting them. No alignments would give an all-zero bitboard, leading to an abort of the procedure. With alignment, intersecting the traced rays with the occupancy bitboard, and the proper magic, could then give you an index into a jump table that would discriminate the various events (in-check, blocked by a single piece, blocked by multiple pieces). As well as into tables of bitboards indicating the checker (or check ray), the pinned piece, etc.
With magic bitboards you get orthogonal checks by getting rook-attacks from king-square to intersect it with opposite rooks/queens. Dito with bishop-attacks and the diagonal sliders.

Code: Select all

r = rookAttacks(occupancy, kingSquare);
if ( r & oppositeRooksOrQueen )
   // in check
If you already have those rookwise attacks, one may reuse it, to get pinners. Remove own "attacked" potential pinned pieces from the occupancy and get the attack-set again:

Code: Select all

s = rookAttacks(occupancy ^ (r & ownPieces), kingSquare);
pinners = (s ^ r) & oppositeRooksOrQueen;
Thus for the pinners == 0 case - no loop at all. Otherwise you traverse the pinners to get the pinned piece(s) between king and pinner.
One usual practice in bitboards it to care about the "set" but not about individual elements in the set... That is part of the mantra of "thinking bitboards". Swap() is really convenient (SEE algorithm) using bitboards for this very reason, as is the trivial ability to generate just captures without requiring loops to traverse all empty squares when doing this for sliders...
Jacob

Re: What is the best known speed of move generation?

Post by Jacob »

Just out of curiosity: can you run the same test for the following position?
rnb1kbnr/ppppqppp/8/8/8/8/PPPPQPPP/RNB1KBNR w KQkq -
This is from my pseudo-legal move generator, taking a count of all illegal moves that had to be unmade, except for when the king was in check to begin with (to more closely resemble Hyatt's move generator and his seperate GenerateEvasions() function)

Code: Select all

rnb1kbnr/ppppqppp/8/8/8/8/PPPPQPPP/RNB1KBNR w KQkq -
(Opening, both queens pinned)
 1.             24              8  33.33%
 2.            547            191  34.92%
 3.          13629           4416  32.40%
 4.         336781         104837  31.13%
 5.        9109498        2495648  27.40%
 6.      246238592       61779149  25.09%

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
(Start position)
 1.             20              0   0.00%
 2.            400              0   0.00%
 3.           8902              0   0.00%
 4.         197281            216   0.11%
 5.        4865609           4921   0.10%
 6.      119060324         310823   0.26%

r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -
(Opening, common perft position)
 1.             48              0   0.00%
 2.           2039              6   0.29%
 3.          97862            293   0.30%
 4.        4085603          24577   0.60%
 5.      193690690        1202281   0.62%

1r2k2r/8/8/8/8/8/8/R3K2R b KQk - 0 1
(Endgame, four rooks)
 1.             25              0   0.00%
 2.            567             14   2.47%
 3.          14095            436   3.09%
 4.         328965          13686   4.16%
 5.        8153719         335428   4.11%
 6.      195629489        9299379   4.75%

r1bq1rk1/pp2ppbp/2n3p1/2p5/2BPP3/2P1B3/P3NPPP/R2Q1RK1 b - -
(Late opening, Gruenfeld)
 1.             36              2   5.56%
 2.           1370              2   0.15%
 3.          49476           2052   4.15%
 4.        1910608           4596   0.24%
 5.       70699752        2287237   3.24%
Jacob

Re: What is the best known speed of move generation?

Post by Jacob »

Gerd Isenberg wrote: With magic bitboards you get orthogonal checks by getting rook-attacks from king-square to intersect it with opposite rooks/queens. Dito with bishop-attacks and the diagonal sliders.

Code: Select all

r = rookAttacks(occupancy, kingSquare);
if ( r & oppositeRooksOrQueen )
   // in check
If you already have those rookwise attacks, one may reuse it, to get pinners. Remove own "attacked" potential pinned pieces from the occupancy and get the attack-set again:

Code: Select all

s = rookAttacks(occupancy ^ (r & ownPieces), kingSquare);
pinners = (s ^ r) & oppositeRooksOrQueen;
Thus for the pinners == 0 case - no loop at all. Otherwise you traverse the pinners to get the pinned piece(s) between king and pinner.
I do something like this:

Code: Select all

sliders = (Rmagic(king[col], 0) & (pieces[ROOK|xcol] | pieces[QUEEN|xcol]))
		| (Bmagic(king[col], 0) & (pieces[BISHOP|xcol] | pieces[QUEEN|xcol]));
(The magic multiply is optimized away, though something like u64 bishops_rays[64] would be better).

Code: Select all

for (; sliders; ClearLSB(sliders)) {
		int from = LSB(sliders);

		pinned = squares_between[from][king[col]] & pieces[OCCU];

		// If it's neither a pin nor a check
		if (pinned & pieces[xcol] || pinned & (pinned - 1))
			continue;

		if (pinned) {
			pins[ply].pinned |= pinned;
			Set(from, pins[ply].pinner);
		}
		else status[ply].checkf += 2;
	}
The loop only executes if there's a possible pin or check, and there's usually very few pin/check candidates. I also get information like check type, and pinners (which is used in GenerateCaptures()/NonCaptures())
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: What is the best known speed of move generation?

Post by hgm »

Jacob wrote:

Code: Select all

rnb1kbnr/ppppqppp/8/8/8/8/PPPPQPPP/RNB1KBNR w KQkq -
(Opening, both queens pinned)
 1.             24              8  33.33%
 2.            547            191  34.92%
 3.          13629           4416  32.40%
 4.         336781         104837  31.13%
 5.        9109498        2495648  27.40%
 6.      246238592       61779149  25.09%
Wow! That is way more than I expected. I pinned both Queens so that I would not have to worry about the side with the pinned piece having the final ply, or not. But I guess that in this case they force each other to maintain the pin. Of course most moves would maintain the pin anyway, as the are with other pieces than the pinner.

Anyway, the example shows that the number of illegal moves need not always be small, and can be as high as 25%.

One important lesson is that pins behave different from checks. The latter must be resolved. But the very nature of a pin is that it is very difficult to resolve. Another lesson is that after 6 ply, the position in general still looks like the root, as each side could only move 3 of its 16 pieces. So if the root contains no pin, the leaves are unlikely to contain a pin. If the root does contain a pin, the pin will most likely still exist in the leaves.

I am a bit surprised about the end-game example. I cannot imagine many pins there. The illegal moves are probably mostly the King wandering into check. My perft would suffer from that too, btw, I have not devised a trick yet to suppress generation of illegal King moves.

It does
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: What is the best known speed of move generation?

Post by bob »

Jacob wrote:
Just out of curiosity: can you run the same test for the following position?
rnb1kbnr/ppppqppp/8/8/8/8/PPPPQPPP/RNB1KBNR w KQkq -
This is from my pseudo-legal move generator, taking a count of all illegal moves that had to be unmade, except for when the king was in check to begin with (to more closely resemble Hyatt's move generator and his seperate GenerateEvasions() function)

Code: Select all

rnb1kbnr/ppppqppp/8/8/8/8/PPPPQPPP/RNB1KBNR w KQkq -
(Opening, both queens pinned)
 1.             24              8  33.33%
 2.            547            191  34.92%
 3.          13629           4416  32.40%
 4.         336781         104837  31.13%
 5.        9109498        2495648  27.40%
 6.      246238592       61779149  25.09%

rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq -
(Start position)
 1.             20              0   0.00%
 2.            400              0   0.00%
 3.           8902              0   0.00%
 4.         197281            216   0.11%
 5.        4865609           4921   0.10%
 6.      119060324         310823   0.26%

r3k2r/p1ppqpb1/bn2pnp1/3PN3/1p2P3/2N2Q1p/PPPBBPPP/R3K2R w KQkq -
(Opening, common perft position)
 1.             48              0   0.00%
 2.           2039              6   0.29%
 3.          97862            293   0.30%
 4.        4085603          24577   0.60%
 5.      193690690        1202281   0.62%

1r2k2r/8/8/8/8/8/8/R3K2R b KQk - 0 1
(Endgame, four rooks)
 1.             25              0   0.00%
 2.            567             14   2.47%
 3.          14095            436   3.09%
 4.         328965          13686   4.16%
 5.        8153719         335428   4.11%
 6.      195629489        9299379   4.75%

r1bq1rk1/pp2ppbp/2n3p1/2p5/2BPP3/2P1B3/P3NPPP/R2Q1RK1 b - -
(Late opening, Gruenfeld)
 1.             36              2   5.56%
 2.           1370              2   0.15%
 3.          49476           2052   4.15%
 4.        1910608           4596   0.24%
 5.       70699752        2287237   3.24%
I think one important point is that in perft things are a bit different than in a real search. My evaluation includes mobility, and so hates absolute pins. Crafty tends to break them as quickly as possible which prevents so many from being actually generated and searched later. perft just stomps thru the depth-first tree looking at all possible positions, rather than the positions that actually get searched...