hgm wrote:It is not obvious to me that a jump table would be faster than a well-designed jump tree. It depends on the details of the misprediction logic, which heavily depends on CPU type. In general, the misprediction logic will be able to hold more information of a tree of branches than on a single indirect jump. This might help to get a lower misprediction rate on the tree.
Suppose the target is truly unpredictable. Then you have log2(n) branches that each have a 50% chance of being mispredicted, leading to on average log2(n)/2 mispredictions. Compare that with the indirect branch, which has 1 misprediction. All else being equal, the indirect branch will be faster for n > 4 or so. However, if the target is somewhat predictable, then the conditional branches will do better because they will sometimes predict correctly and the proper instructions will be fetched and speculatively executed. The indirect branch will still be mispredicted nearly all of the time though. If the branch is extremely predictable, both solutions should do well but again the indirect branch is likely to be better for n > 4.
bob wrote:Also a jump into a jump table has to be predicted even though it is unconditional, because the target address is unknown at fetch time, and that is going to add additional penalty when the unconditional jump is predicted to go to the wrong address...
So that's a good point which suggests that "all else is not equal" in my example above. Both potential targets of the conditional branch are known as it decodes the instruction, whereas with the indirect branch there may be some latency before the target is actually known (which could cause some delay in getting those instructions fetched). I'm guessing that for n >= 8 the indirect branch would still be better.
By the way, I recall in the days of the Pentium II/III that Intel advised one of my co-workers that if you load the target address into a register at least 40 cycles before you jump or call through the register, it will basically be free (because it will be predicted to go to that address). I've never noticed a compiler actually doing this, but I thought it was interesting.