Branch Prediction Efficiency

Discussion of chess software programming and technical issues.

Moderator: Ras

Sesse
Posts: 300
Joined: Mon Apr 30, 2018 11:51 pm

Re: Branch Prediction Efficiency

Post by Sesse »

poplsb() is presumably just finding the lowest bit (__builtin_ctzll(b) or similar intrinsic) and then removing it (b &= b-1).
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: Branch Prediction Efficiency

Post by dangi12012 »

Yes for optimal perfect prediction (essentially eliminating all branches)
you can introduce an integer template called config into movegen which will range from 0 to 3^8+1.

The template config when 6561 is for cases when there are more than 2 of a piece on the board. (so a promotion to 3rd queen) in which case you need the while loops.
All others are just the Knights, Rooks, Bishops, Queens in base 3 to represent 0,1,2 pieces. This will cover close to 100% of all positions.

You get the number of pawns for free. With this knowledge at compiletime you get very much information of the current board for efficient code. The templates will call towards 0 with less and less pieces.

Fun fact: all possible pawn configurations fit into a single integer as well (but range is too big for templating but small enough for memory). With the number of pawns + a single integer you can just lookup the bb for enemy pawns making the board representation a tad smaller.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
syzygy
Posts: 5695
Joined: Tue Feb 28, 2012 11:56 pm

Re: Branch Prediction Efficiency

Post by syzygy »

Mike Sherwin wrote: Wed Dec 28, 2022 2:55 pmSo everything positive over the last decade or so that I've read about how fast jump tables are compared to conditional jumps is bogus? Absolutely amazing!
If you MUST execute a switch on, say, 64 possible values it will be faster to implement the switch with a direct jump than to do 6 conditional branches. If all 64 values are equally probable, the probabilty of predicting everything right is 1 in 64 in both cases, but the direct jump will mispredict only once the other 63 out of 64 times.

However, if you can avoid the switch, then you win.

Your first method requires a switch for each piece. (OK, just on 6 values, not 64.)
Your second method does not require any switch.

The second method may mispredict a few more times on exiting the loops, but this is still better than a switch for each piece.
The first method might be slightly faster if you know there are only very few pieces.