syzygy wrote:
Perft speed is pretty much irrelevant. Komodo, at least several versions ago, was said to use copy/make. I don't know perft results for Komodo.
I would say it's relevant as a benchmark for move generation/apply/unapply, as long as no one is doing bulk counting, hashing, or other perft-specific optimizations.
Each internal node would then cost exactly one full move generation, one apply, and one unapply. Why is that not relevant?
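To show what I mean by perft without bulk counting, here is a minimal sketch. The ToyPosition type is a made-up stand-in (9 cells, a move fills one empty cell), not a chess position and not taken from any engine; only the shape of the perft loop matters.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical toy stand-in for a chess position: 9 cells, and a move
// fills one empty cell. It only exists to make the perft loop runnable.
struct ToyPosition {
    int cells[9] = {0};  // 0 = empty, 1 or 2 = side that filled the cell
    int side = 1;        // side to move

    // Full move generation: every empty cell is a move.
    std::vector<int> generate() const {
        std::vector<int> moves;
        for (int i = 0; i < 9; ++i)
            if (cells[i] == 0) moves.push_back(i);
        return moves;
    }

    void apply(int m)   { cells[m] = side; side = 3 - side; }
    void unapply(int m) { cells[m] = 0;    side = 3 - side; }
};

// Perft with no bulk counting and no hashing: leaves are reached through
// apply/unapply instead of returning moves.size() at depth 1.
std::uint64_t perft(ToyPosition& pos, int depth) {
    if (depth == 0) return 1;
    std::uint64_t nodes = 0;
    for (int m : pos.generate()) {   // one full generation per internal node
        pos.apply(m);                // one apply per child
        nodes += perft(pos, depth - 1);
        pos.unapply(m);              // one unapply per child
    }
    return nodes;
}
```

With bulk counting, the depth-1 call would return the move count directly and skip the apply/unapply pair for every leaf, which is why it distorts the benchmark.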
SF uses copy/make for part of the state.
I think there's some misunderstanding here.
My definition (and Bob's definition, too, judging by the thread you linked) of copy/make is copying the entire position state (~200 bytes in most engines) before making a move.
Copy/make for part of the state is not what I would call copy/make.
Partial copying is what I am doing in my engine, too. I only copy the changed fields, which is about 30 bytes on average.
With Crafty 24.0, on my machine, I get about 32 Mnps perft (31 ns per node), and 360 ns per node in "analyze" from the start position.
So move generation/apply/unapply accounts for less than 10% of each node's time (31/360 is about 8.6%). A 10% slowdown in total NPS (from 360 ns to 400 ns per node) would mean generation/apply/unapply is taking more than 2x as long (from 31 ns to about 71 ns).
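For contrast, a minimal sketch of what I mean by full copy/make. ToyState and its padding are hypothetical; the padding only brings the struct to the ~200 bytes mentioned above, and the "moves" are the same kind of toy move (fill an empty cell), not chess moves.

```cpp
#include <cstdint>

// Hypothetical toy state, padded to 200 bytes to mimic a typical engine.
struct ToyState {
    std::uint8_t cells[9] = {};      // 0 = empty, 1 or 2 = side that filled it
    std::uint8_t side = 1;           // side to move
    std::uint8_t padding[190] = {};  // stands in for the rest of a real state
};

// Make a move in place on a state the caller has already copied.
void make(ToyState& s, int cell) {
    s.cells[cell] = s.side;
    s.side = static_cast<std::uint8_t>(3 - s.side);
}

// Full copy/make: the entire state is copied before every move and the
// copy is simply dropped afterwards, so there is no unmake at all.
std::uint64_t perft_copy_make(const ToyState& s, int depth) {
    if (depth == 0) return 1;
    std::uint64_t nodes = 0;
    for (int cell = 0; cell < 9; ++cell) {
        if (s.cells[cell] != 0) continue;
        ToyState child = s;          // copies all 200 bytes
        make(child, cell);
        nodes += perft_copy_make(child, depth - 1);
    }
    return nodes;
}
```

A hybrid scheme would copy only a small part of this struct per move and undo the rest by hand.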
Another thread:
http://talkchess.com/forum/viewtopic.php?t=39938
A pure make/unmake might have the advantage that all state is in one struct pointed to by one register. With a hybrid approach as in my engine (but also in SF), the state is divided over two structs. This might speak for saving the value of the hash key during make and restoring it during unmake instead of creating the new value in the next element of a state array and using the value in that next element. But I doubt that it would be faster.
On x86-64 (unlike x86-32), there is a crapload of registers. I don't think register pressure would really be a problem on x86-64.
In my engine, on each apply, I record the list of changed fields and their old values. That minimizes copying at the expense of some extra logic, while not being nearly as complicated as make/unmake with absolutely minimal information. It seems to be reasonably fast as well (about 2/3 as fast as Crafty, without any serious optimization effort).
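A rough sketch of the idea, not my actual engine code. The State fields, the UndoLog class, and apply_toy_move are all hypothetical names, and every field is a 64-bit word here only to keep the log entries uniform.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, heavily reduced position state.
struct State {
    std::uint64_t hash_key = 0;
    std::uint64_t castling_rights = 0xF;
    std::uint64_t ep_square = 64;      // 64 = no en passant square
    std::uint64_t halfmove_clock = 0;
};

// Log of (field, old value) pairs written during apply.
class UndoLog {
    struct Entry { std::uint64_t* field; std::uint64_t old_value; };
    std::vector<Entry> entries_;

public:
    // Position of the log before a move starts.
    std::size_t mark() const { return entries_.size(); }

    // Record the old value of a field, then overwrite it.
    void set(std::uint64_t& field, std::uint64_t value) {
        entries_.push_back({&field, field});
        field = value;
    }

    // Unapply: restore fields in reverse order back to the mark.
    void rollback(std::size_t mark) {
        while (entries_.size() > mark) {
            *entries_.back().field = entries_.back().old_value;
            entries_.pop_back();
        }
    }
};

// Toy "apply" that touches only two fields; the others are never copied.
std::size_t apply_toy_move(State& s, UndoLog& log, std::uint64_t key_delta) {
    std::size_t m = log.mark();
    log.set(s.hash_key, s.hash_key ^ key_delta);
    log.set(s.halfmove_clock, s.halfmove_clock + 1);
    return m;
}
```

Unapply is then just log.rollback(m) with the mark returned by apply, and it needs no knowledge of what the move was.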
Disclosure: I work for DeepMind on the AlphaZero project, but everything I say here is personal opinion and does not reflect the views of DeepMind / Alphabet.