I started four engines so all memory was allocated, but left them idle. Then I threw the first one into analyze mode, waited 30 seconds, threw the second into analyze mode, waited 30 seconds, and so on.
The first three are essentially identical -- the second one actually ran marginally faster than the first. Memory alignment, OS overhead, I don't know. The fourth one is about 2-3% slower.
After the first four, I figured I should test a 5th just to verify that it would in fact suck to run 5 engines simultaneously on a 4-core box. It came out about 87% slower than the average of the first four.
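For anyone who wants to repeat the comparison, here is a minimal sketch of how a "percent slower than the average" figure could be computed. It assumes speed is measured as nodes per second, which the post does not actually state, and the function name `percent_slower` and the numbers are made up for illustration; they are not the measured results.

```python
def percent_slower(baseline_speeds, speed):
    """Return how much slower `speed` is than the mean of `baseline_speeds`, in percent."""
    baseline = sum(baseline_speeds) / len(baseline_speeds)
    return (baseline - speed) / baseline * 100.0


# Hypothetical nodes-per-second figures, NOT the numbers from the test above.
first_four = [1000.0, 1000.0, 1000.0, 1000.0]
fifth = 500.0

print(f"{percent_slower(first_four, fifth):.1f}% slower")  # prints "50.0% slower"
```

If the measurement were time-to-depth instead of speed, the comparison would flip to (time - baseline) / baseline, so it matters which metric the percentage is based on.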
And because my life isn't complete if I don't graph everything I can get my hands on:
[image: graph]