Branch Prediction Efficiency
Moderator: Ras
-
- Posts: 300
- Joined: Mon Apr 30, 2018 11:51 pm
Re: Branch Prediction Efficiency
poplsb() is presumably just finding the lowest bit (__builtin_ctzll(b) or similar intrinsic) and then removing it (b &= b-1).
-
- Posts: 1062
- Joined: Tue Apr 28, 2020 10:03 pm
- Full name: Daniel Infuehr
Re: Branch Prediction Efficiency
Yes for optimal perfect prediction (essentially eliminating all branches)
you can introduce an integer template called config into movegen which will range from 0 to 3^8+1.
The template config when 6561 is for cases when there are more than 2 of a piece on the board. (so a promotion to 3rd queen) in which case you need the while loops.
All others are just the Knights, Rooks, Bishops, Queens in base 3 to represent 0,1,2 pieces. This will cover close to 100% of all positions.
You get the number of pawns for free. With this knowledge at compiletime you get very much information of the current board for efficient code. The templates will call towards 0 with less and less pieces.
Fun fact: all possible pawn configurations fit into a single integer as well (but range is too big for templating but small enough for memory). With the number of pawns + a single integer you can just lookup the bb for enemy pawns making the board representation a tad smaller.
you can introduce an integer template called config into movegen which will range from 0 to 3^8+1.
The template config when 6561 is for cases when there are more than 2 of a piece on the board. (so a promotion to 3rd queen) in which case you need the while loops.
All others are just the Knights, Rooks, Bishops, Queens in base 3 to represent 0,1,2 pieces. This will cover close to 100% of all positions.
You get the number of pawns for free. With this knowledge at compiletime you get very much information of the current board for efficient code. The templates will call towards 0 with less and less pieces.
Fun fact: all possible pawn configurations fit into a single integer as well (but range is too big for templating but small enough for memory). With the number of pawns + a single integer you can just lookup the bb for enemy pawns making the board representation a tad smaller.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Daniel Inführ - Software Developer
-
- Posts: 5695
- Joined: Tue Feb 28, 2012 11:56 pm
Re: Branch Prediction Efficiency
If you MUST execute a switch on, say, 64 possible values it will be faster to implement the switch with a direct jump than to do 6 conditional branches. If all 64 values are equally probable, the probabilty of predicting everything right is 1 in 64 in both cases, but the direct jump will mispredict only once the other 63 out of 64 times.Mike Sherwin wrote: ↑Wed Dec 28, 2022 2:55 pmSo everything positive over the last decade or so that I've read about how fast jump tables are compared to conditional jumps is bogus? Absolutely amazing!
However, if you can avoid the switch, then you win.
Your first method requires a switch for each piece. (OK, just on 6 values, not 64.)
Your second method does not require any switch.
The second method may mispredict a few more times on exiting the loops, but this is still better than a switch for each piece.
The first method might be slightly faster if you know there are only very few pieces.