hgm wrote:mhull wrote:Suppose a software cracker were to disassemble a chess program only to discover no chess program at all? Would he be able to detect that he was using a virtual machine parsing opcodes from an alien (or even notional) machine architecture known only to its creator?
I don't think this alters anything. What he would discover when he starts disassembling the binary is that there is only a little code (the VM emulator) and lots of 'data' (the Chess program). By inspecting the code he would quickly learn the architecture of the VM, and then write a disassembler for the interpreted code.
In fact this is exactly what people would find if they disassembled the binary of my engine Usurpator II. They would find an emulator, and if they are smart, they would actually recognize it as a 6502 emulator.
Well, the more obscure the architecture, the better. I remember in college, we studied assembler techniques using a fictional computer called "comp 2". Though it would be quite a problem to cross-compile to a fictional architecture, emulators for less well-known architectures are around, and gcc can cross-compile to many of them. The more obscure the emulator, the less likely the program is to be cracked.
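To make the scenario concrete, here is a minimal sketch of what hgm describes: a small interpreter loop (the only real code) plus a blob of bytes (the actual program). The opcode set, encoding, and names (PUSH, ADD, MUL, HALT, run, disassemble, PROGRAM) are all invented for illustration and do not correspond to Usurpator II, the 6502, or any real architecture. It also shows why the hurdle is finite: once the interpreter loop is understood, a disassembler is only a few lines.

```python
# Hypothetical toy VM: the opcodes and their encoding are made up for this sketch.
PUSH, ADD, MUL, HALT = 0x01, 0x02, 0x03, 0xFF


def run(program: bytes) -> int:
    """Interpret the bytecode. This loop is the only 'real' code in the binary."""
    stack, pc = [], 0
    while True:
        op = program[pc]
        pc += 1
        if op == PUSH:
            stack.append(program[pc])  # one-byte immediate operand
            pc += 1
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {pc - 1}")


MNEMONICS = {PUSH: "PUSH", ADD: "ADD", MUL: "MUL", HALT: "HALT"}


def disassemble(program: bytes) -> list[str]:
    """What a cracker could write after reading run(): a trivial disassembler."""
    out, pc = [], 0
    while pc < len(program):
        op = program[pc]
        if op == PUSH:
            out.append(f"PUSH {program[pc + 1]}")
            pc += 2
        else:
            out.append(MNEMONICS.get(op, f"DB {op:#04x}"))
            pc += 1
    return out


# The "program" as it appears in the binary: opaque data until the VM is understood.
PROGRAM = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])

if __name__ == "__main__":
    print(run(PROGRAM))          # computes (2 + 3) * 4
    print(disassemble(PROGRAM))
```

An obscure or fictional instruction set only changes how long it takes to read the equivalent of run(); it does not remove the interpreter from the binary.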
hgm wrote:But it throws up an extra hurdle. Unfortunately you would also suffer a major speed hit (an order of magnitude or so), so for Chess engines it is not really an option. Encrypting the native machine language, and decrypting only selected parts of it when needed at run time, might be a better solution. Then they would not be able to just disassemble the static binary, but would be forced to run a debugger on it and catch it in action. But it can still be done (e.g. by causing a core dump when the program is in full action).
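The decrypt-on-demand idea can be sketched roughly as follows. This is only an analogy: it protects Python source text rather than native machine code, it uses a trivial repeating-key XOR instead of a real cipher, and every name in it (KEY, xor, ENCRYPTED, call_protected, evaluate) is hypothetical. The point is just the shape of the scheme: the stored image holds ciphertext, and plaintext exists only briefly at run time, which is exactly where a debugger or core dump would catch it.

```python
from itertools import cycle

KEY = b"k3y"  # hypothetical key; a real scheme would have to hide or derive it


def xor(data: bytes, key: bytes = KEY) -> bytes:
    """Repeating-key XOR; applying it twice restores the original bytes."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# "Build time": each protected routine is stored only in encrypted form.
ENCRYPTED = {
    "evaluate": xor(
        b"def evaluate(material, mobility):\n"
        b"    return 100 * material + 10 * mobility\n"
    ),
}


def call_protected(name, *args):
    """Decrypt one routine just before use, run it, then drop the plaintext."""
    namespace = {}
    exec(xor(ENCRYPTED[name]).decode(), namespace)  # plaintext exists only here
    return namespace[name](*args)


if __name__ == "__main__":
    print(call_protected("evaluate", 3, 5))
```

A static dump of ENCRYPTED shows nothing readable, but anyone who breaks inside call_protected sees the decrypted routine, which matches hgm's caveat.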
I'm still very much enamored of the idea of the VM performance hit, for reasons in addition to those already mentioned. It would have the benefit of further exposing the misleading nature of uniform-platform rating lists, since the author would be releasing only a slower version, while the fast native version is one the rating groups will never have access to. Think about it: of what use are fast versions when the testers are already hacking off parts of AI cognition (MHz, books, egtbs, pondering, learning, entire engines, half-words (32-bit versus 64-bit), etc.) and then publishing to the whole world that "engine xyz-chess sucks by this Elo amount"? If everyone knew that the "native" version would be much more powerful by some unknown factor, it would restore mystery and anticipation to formal author-entry competitions.