Many thanks for the help!
The stm not flipped thing... I am not sure how that happened. It's not the only problem, though. Playing strength is still greatly reduced with that fixed.
I am aware that I don't need to do a full make/unmake, but I am trying to get it right first then optimize it.
3. There is no backup mechanism. While the cutoff conditions look correct, you need to take into account the fact that the side to move doesn't have to capture. You need a stack of values, indexed by moves_to_undo, and at the end only back up a value if it's greater than or equal to the current value. This is a bit more complicated because you don't use negamax.
Can you please elaborate on that? I thought the side to move would decline a capture if and only if the score is already in its favor?
You are not supposed to have that nps drop; your code is not fast enough or you are doing see() too often. Some engines use see() to score captures only when value(attacker) > value(victim), like Queen x Pawn.
I am already doing that, as per Dr. Hyatt's suggestion to someone else in another thread.
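To make that shortcut concrete, here is a minimal Python sketch of how I understand it. The names `capture_order_score` and `see` are hypothetical and not from my engine; `see` stands for any callable that runs the full exchange evaluation. The idea is that when the attacker is worth no more than the victim, the capture can never lose material (worst case I win the victim and lose the attacker), so the expensive call is skipped and a simple lower bound is used instead.

```python
def capture_order_score(attacker_value, victim_value, see):
    """Score a capture for move ordering, calling `see` only when needed.

    `see` is a zero-argument callable (hypothetical) that returns the full
    static exchange evaluation of this capture.
    """
    if attacker_value <= victim_value:
        # Worst case: we win the victim and lose the attacker.
        # That is a lower bound on the exchange and is never negative.
        return victim_value - attacker_value
    # Attacker is worth more than the victim (e.g. Queen x Pawn):
    # only here can the capture lose material, so pay for the full SEE.
    return see()


# Pawn takes queen: SEE is never called.
print(capture_order_score(100, 900, lambda: -800))
# Queen takes defended pawn: SEE decides.
print(capture_order_score(900, 100, lambda: -800))
```

Note that the value returned in the first branch is only a lower bound, not the exact exchange result, so it is suitable for ordering and pruning decisions but not as an exact material score.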
Your code (see or search) is buggy if you cannot prune negative see() captures in QSearch.
That I am aware of.
I don't know if your applyMove() and undoMove() are your standard move code or a special version for see(), and I don't know if your genSmallestCapture() generates legal moves or not; perhaps some of those functions cannot correctly handle illegal positions (king in check, etc.).
applyMove and undoMove are my standard move code. genSmallestCapture actually doesn't check for position legality, and I guess that could be the problem. I will look into that. 
which retrieves the next captured piece from your mailbox board, I guess that your applyMove() and undoMove() functions are really making/unmaking the move on the board, is that right? In that case I would think this may be way too much overhead for the small goal of getting the next victim (depends on the amount of other things you do in applyMove(), of course). In addition to that, you also generate the smallest possible capture to the dest square after each applyMove() which "sounds" quite expensive, too.
I agree it's unnecessarily expensive. I am just trying to get it to work correctly first before trying to optimize it.
to collect all attackers and defenders of the destination square, and put these into two separate lists. 
That I tried to do, but have trouble with discovered attackers.
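For the discovered attackers, one approach I have seen described is to look behind each capturing piece along the line it came from, right after it captures. Below is a rough Python sketch of that idea, not my engine code. The board representation (a dict from `(file, rank)` to `(color, piece)`) and the helper names `direction_from` and `slider_behind` are assumptions made only for illustration.

```python
def direction_from(target, frm):
    """Unit step pointing from `target` toward `frm`, or None if the two
    squares do not share a rank, file, or diagonal (e.g. a knight move)."""
    dx, dy = frm[0] - target[0], frm[1] - target[1]
    if (dx, dy) == (0, 0):
        return None
    if dx == 0 or dy == 0 or abs(dx) == abs(dy):
        sign = lambda v: (v > 0) - (v < 0)
        return (sign(dx), sign(dy))
    return None


def slider_behind(board, square, step):
    """Walk from `square` along `step` and return (square, color, piece)
    for the first piece found if it can slide along that line, else None."""
    if step is None:
        return None
    dx, dy = step
    x, y = square[0] + dx, square[1] + dy
    while 0 <= x < 8 and 0 <= y < 8:
        occupant = board.get((x, y))
        if occupant is not None:
            color, piece = occupant
            # Rooks and queens slide on ranks/files, bishops and queens on diagonals.
            movers = "RQ" if dx == 0 or dy == 0 else "BQ"
            return ((x, y), color, piece) if piece in movers else None
        x, y = x + dx, y + dy
    return None


board = {
    (0, 4): ("b", "P"),  # target pawn on a5
    (0, 2): ("w", "R"),  # front rook on a3
    (0, 0): ("w", "R"),  # back rook on a1, hidden behind the front rook
}
target, frm = (0, 4), (0, 2)
print(slider_behind(board, frm, direction_from(target, frm)))
```

The piece found this way can belong to either side, so it would be added to whichever list matches its color, keeping that list sorted by value. Knight captures never uncover anything, which is why `direction_from` returns None for them.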
- How do you handle a king capturing a defended piece? 
My genSmallestCapture() doesn't generate captures by the king. I thought it would be a small inaccuracy, since the king is not usually involved in capture sequences.
I will do the test positions analysis, but I think this is my biggest misunderstanding now -
There is no backup mechanism. While the cutoff conditions look correct, you need to take into account the fact that the side to move doesn't have to capture. You need a stack of values, indexed by moves_to_undo, and at the end only back up a value if it's greater than or equal to the current value. This is a bit more complicated because you don't use negamax.
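To check whether I am reading this correctly, here is how I would sketch the "stack of values plus backup" idea in Python. This is the common swap-list formulation as I understand it, written in negamax style for brevity, so it is not a drop-in fix for my non-negamax code. The function name `see_swap` and its inputs are assumptions: each side's attackers are given as plain lists of piece values, cheapest first, and discovered attackers and king captures are ignored.

```python
def see_swap(victim_value, attackers, defenders):
    """Static exchange value for the side making the first capture.

    attackers: values of the capturing side's pieces, cheapest first
               (attackers[0] makes the first capture).
    defenders: values of the other side's pieces, cheapest first.
    """
    sides = (attackers, defenders)
    used = [1, 0]              # attackers[0] is spent on the first capture
    gain = [victim_value]      # gain[d]: balance for the side that made
                               # capture d, if the exchange stopped there
    on_square = attackers[0]   # value of the piece now sitting on the square
    side = 1                   # the defender is next to capture

    # Forward pass: play out every capture, assuming nobody declines.
    while used[side] < len(sides[side]):
        gain.append(on_square - gain[-1])
        on_square = sides[side][used[side]]
        used[side] += 1
        side ^= 1

    # Backup pass: at each depth the side to move may decline to capture,
    # so a capture's value can never be worse for the opponent than
    # simply stopping the exchange.
    for d in range(len(gain) - 1, 0, -1):
        gain[d - 1] = min(gain[d - 1], -gain[d])
    return gain[0]


# Knight takes pawn; the pawn is defended by a rook, but a rook backs up the knight.
# Playing everything out gives +100 - 300 + 500 = 300, yet the defender
# does better by not recapturing, so the result is just the pawn.
print(see_swap(100, [300, 500], [500]))
```

If this is right, the answer to my own question is that declining has to be considered at every depth of the exchange, not only for the first capture, and the backup pass is what does that.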
Thanks!