bob wrote:First idea, why not get rid of the negative scores completely? Negative numbers embedded like that are going to be an issue. If you make everything positive, then you don't worry about 2's complement stuff letting the lower order half wreck the high-order half.
Whether this is a good idea at all is a good question. Extra adds are not bad. You rarely keep multiple pipelines busy anyway...
What do you mean? Like having two counters, one for each side, and each is only incremented? So if one side gets a penalty, it is just added to the other side? That could work, but seems a bit messy to me.
One other possibility: since the scores are going to be in a certain range, you could use maybe 12 bits instead of 16 to represent the score. Since adding negative numbers in two's complement results in overflows that we need to get rid of, the extra bits gained out of 32 would be in the middle, and could be used for overflow space for the lower score (and would be ignored). So one score is &0xFFF, the other is >>20. To add -1 to the lower score, you add 0xFFF. This generates a carry into 0x1000, but we can add up to 512 negative numbers without the carries reaching the upper score.
No idea, short in time. You may still fall back to the working version, may be better with the conditional ac++ to avoid the second mul - otherwise with all the other overhead one may better use two distinct multiplications ac and bd. This is the intended (untested) assembly with scaling down by 16.
bob wrote:First idea, why not get rid of the negative scores completely? Negative numbers embedded like that are going to be an issue. If you make everything positive, then you don't worry about 2's complement stuff letting the lower order half wreck the high-order half.
Whether this is a good idea at all is a good question. Extra adds are not bad. You rarely keep multiple pipelines busy anyway...
What do you mean? Like having two counters, one for each side, and each is only incremented? So if one side gets a penalty, it is just added to the other side? That could work, but seems a bit messy to me.
One other possibility: since the scores are going to be in a certain range, you could use maybe 12 bits instead of 16 to represent the score. Since adding negative numbers in two's complement results in overflows that we need to get rid of, the extra bits gained out of 32 would be in the middle, and could be used for overflow space for the lower score (and would be ignored). So one score is &0xFFF, the other is >>20. To add -1 to the lower score, you add 0xFFF. This generates a carry into 0x1000, but we can add up to 512 negative numbers without the carries reaching the upper score.
I was thinking of two different alternatives.
A penalty for one side is a bonus for the other, yes. But this still leads to negative numbers since the two scores being combined are EG and MG values.
1. I was thinking about the idea I use in my hash table scores. I simply add a large enough value to guarantee that the stored score is always positive (sort of like IEEE floating point does for the exponent). I subtract the bias before actually using one of the scores.
2. One could use two 31 bit values. This leaves the 32nd bit as unused. You can then construct constants that will effectively "subtract" and cause an overflow into that unused bit, which promptly gets ANDed back to zero so that it can't overflow into the upper half of the 64 bit value, which would screw up that value.
However, I don't think the idea is worth anything, since it is hard enough to keep the functional units inside the CPU busy. An extra add that is completely independent of the first add won't add a single clock cycle to the time required to complete the computation. It just gets slipped into another pipe and is done in parallel with the first add.