In my experience, extra registers are not very important: shuttling values between registers and the level-1-cached stack is very fast and happens in parallel with other execution. OTOH, overflowing the cache has a very severe impact on performance.
I noticed this when I looked at the gcc-compiled code of qperft, which seemed to make very inefficient use of the register set. By hand-optimizing the assembly code, I could eliminate 40% (!) of all instructions in the time-critical section of the code (all loads and stores, eliminated by cleverer register allocation, thus simulating a larger register set). The optimized code was exactly 0% faster than the original...
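To get a feel for the cache side of this claim, here is a minimal sketch in C (not taken from qperft). It does a fixed number of byte reads while varying only the size of the buffer being walked. The names `walk` and `time_walk`, the 64-byte stride (an assumed cache-line size), and the power-of-two buffer sizes are all assumptions made for illustration. Hardware prefetching may hide part of the effect with such a regular stride, so treat any numbers as indicative only.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Read `accesses` bytes from `buf`, stepping 64 bytes at a time and wrapping
   around at `size` (which must be a power of two). The returned sum keeps the
   compiler from optimizing the loop away. */
uint64_t walk(const unsigned char *buf, size_t size, size_t accesses)
{
    uint64_t sum = 0;
    size_t pos = 0;
    for (size_t i = 0; i < accesses; i++) {
        sum += buf[pos];
        pos = (pos + 64) & (size - 1);   /* assumed 64-byte cache line */
    }
    return sum;
}

/* Time one walk over a freshly allocated buffer of `size` bytes.
   Returns CPU seconds, or -1.0 if the allocation fails. */
double time_walk(size_t size, size_t accesses)
{
    unsigned char *buf = malloc(size);
    if (!buf)
        return -1.0;
    memset(buf, 1, size);                /* touch every page before timing */

    clock_t t0 = clock();
    volatile uint64_t sink = walk(buf, size, accesses);
    clock_t t1 = clock();
    (void)sink;

    free(buf);
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}
```

Calling `time_walk` from a `main` with the same `accesses` (say `1 << 26`) for sizes from 4 KB up to 64 MB keeps the instruction count identical across runs, so any difference in the timings comes from where the data lives in the memory hierarchy rather than from how many instructions execute.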
How much longer can 32 bit Chess Engines survive?
- Full name: H G Muller