Mpsey wrote: I use a standard bitboard representation for the time being and am getting a 24% increase in performance over my non-bitboard chess engine for generation of pseudo-legal moves on my x64 machine.
I am very unhappy with this result as I expected a much larger improvement in performance.
You're unhappy with a 24% speedup?!
Maybe it's just me, but I would normally expect much smaller improvements from anything I do...
I am asking if there are any benchmarks for typical performance gains with each board representation.
Performance gain over what? You can ask for raw perft results for different engines, which may tell you something - or not, because it depends a lot on implementation details and in the end doesn't say very much about how the thing will perform in real games.
What is a "standard bitboard representation", by the way?
I would like to mention that for generating raw move bitboards (move-target bitboards) my program is impressively fast (in my opinion), but there seems to be a bottleneck on move serialization.
Of course generating the move bitboards is fast; it's typically just one or two table lookups per piece, and depending on the implementation and the test setup those tables will probably already be in the cache.
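To illustrate what I mean by "a table lookup per piece", here is a minimal sketch for knights only. It assumes the common a1 = 0 ... h8 = 63 square mapping, and the names (`Bitboard`, `KNIGHT_ATTACKS`, `knight_targets`) are made up for the example, not taken from your engine. Sliding pieces need a second lookup keyed on occupancy, which is left out here.

```cpp
#include <array>
#include <cstdint>

using Bitboard = std::uint64_t;

// Assumed square mapping: a1 = 0, b1 = 1, ..., h1 = 7, a2 = 8, ..., h8 = 63.
const Bitboard FILE_A = 0x0101010101010101ULL;
const Bitboard FILE_B = FILE_A << 1;
const Bitboard FILE_G = FILE_A << 6;
const Bitboard FILE_H = FILE_A << 7;

// Compute the knight attack set for one square by shifting.
// The masks drop targets that wrapped around the board edge.
inline Bitboard knight_attacks_from(int sq) {
    const Bitboard b = 1ULL << sq;
    return ((b << 17) & ~FILE_A)             // up 2, right 1
         | ((b << 15) & ~FILE_H)             // up 2, left 1
         | ((b << 10) & ~(FILE_A | FILE_B))  // up 1, right 2
         | ((b <<  6) & ~(FILE_G | FILE_H))  // up 1, left 2
         | ((b >> 17) & ~FILE_H)             // down 2, left 1
         | ((b >> 15) & ~FILE_A)             // down 2, right 1
         | ((b >> 10) & ~(FILE_G | FILE_H))  // down 1, left 2
         | ((b >>  6) & ~(FILE_A | FILE_B)); // down 1, right 2
}

// Build the 64-entry table once at startup (512 bytes, so it fits in L1 easily).
inline std::array<Bitboard, 64> make_knight_table() {
    std::array<Bitboard, 64> table{};
    for (int sq = 0; sq < 64; ++sq) {
        table[sq] = knight_attacks_from(sq);
    }
    return table;
}

const std::array<Bitboard, 64> KNIGHT_ATTACKS = make_knight_table();

// Move-target bitboard: one lookup, then mask out squares held by own pieces.
inline Bitboard knight_targets(int sq, Bitboard own_pieces) {
    return KNIGHT_ATTACKS[sq] & ~own_pieces;
}
```

With this, generating the targets for a knight is a single indexed load and an AND, which is why this part of move generation rarely shows up in a profile.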
How slow serialisation is depends a lot on how and when you do it. Do you use a hardware bitscan? If not, what alternatives have you tried? Can you defer the serialisation (I think there are ways to do that, but the code may become too ugly to deal with and actually perform worse)? Do you only generate legal evasions when in check?
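For reference, this is roughly what I would expect a bitscan-based serialisation loop to look like. It is only a sketch: the `Move` struct and the function names are hypothetical, and `__builtin_ctzll` is a GCC/Clang builtin (I am assuming one of those compilers; the fallback branch is a slow portable loop, there only so the example builds elsewhere).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Bitboard = std::uint64_t;

// Hypothetical minimal move type, just enough for the example.
struct Move {
    std::uint8_t from;
    std::uint8_t to;
};

// Index of the least significant set bit. Undefined for an empty bitboard.
inline int lsb_index(Bitboard bb) {
    assert(bb != 0);
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_ctzll(bb);  // maps to a hardware bitscan on x64
#else
    int index = 0;               // portable but slow fallback
    while ((bb & 1ULL) == 0) {
        bb >>= 1;
        ++index;
    }
    return index;
#endif
}

// Turn a move-target bitboard into individual moves, lowest square first.
inline void serialize_moves(int from, Bitboard targets, std::vector<Move>& out) {
    while (targets != 0) {
        const int to = lsb_index(targets);
        out.push_back(Move{static_cast<std::uint8_t>(from),
                           static_cast<std::uint8_t>(to)});
        targets &= targets - 1;  // clear the bit we just extracted
    }
}
```

The loop runs once per set bit, so its cost scales with the number of moves rather than with the 64 squares. If your version is much slower than this, I would look at the bitscan itself and at how the move list is stored before anything else.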