Fore One Smarter Than Me- Don't Get Trampled In A Stampede!

mhull · Post by **mhull** » Sat Sep 03, 2011 10:13 pm

hgm wrote:
mhull wrote:Suppose a software cracker were to disassemble a chess program only to discover no chess program at all? Would he be able to detect that he was using a virtual machine parsing opcodes from an alien (or even notional) machine architecture known only to its creator?
I don't think this alters anything. What he would discover when he starts disassembling the binary is that there is only a little code (the VM emulator) and lots of 'data' (the Chess program). By inspecting the code he would quickly learn the architecture of the VM, and then write a disassembler for the interpreted code.

In fact this is exactly what people would find if they disassembled the binary of my engine Usurpator II. They would find an emulator, and if they are smart, they would actually recognize it as a 6502 emulator.

Well, the more obscure the architecture, the better. I remember in college, we studied assembler techniques using a fictional computer called "comp 2". Though it would be quite a problem to cross-compile to a fictional architecture, emulators for less well-known architectures are around, and gcc can cross-compile to many of them. The more obscure the emulator, the less likely the program is to be cracked.
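To make the "little code, lots of data" point concrete, here is a minimal sketch of a bytecode interpreter next to the disassembler someone could write after reading it. The four-instruction stack machine, its opcode values, and the names `run` and `disassemble` are all invented for illustration; this is not how Usurpator II or any real emulator is laid out.

```python
# Hypothetical stack machine: these opcode values are invented for illustration.
PUSH, ADD, MUL, HALT = 0x10, 0x20, 0x21, 0xFF
MNEMONICS = {PUSH: "PUSH", ADD: "ADD", MUL: "MUL", HALT: "HALT"}

# To anyone disassembling the host binary, the "program" is just a data blob.
PROGRAM = bytes([PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT])


def run(program):
    """Interpret the bytecode and return the value on top of the stack."""
    stack, pc = [], 0
    while True:
        op = program[pc]
        if op == PUSH:
            stack.append(program[pc + 1])  # one-byte immediate operand
            pc += 2
        elif op == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
            pc += 1
        elif op == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
            pc += 1
        elif op == HALT:
            return stack[-1]
        else:
            raise ValueError(f"unknown opcode {op:#x} at {pc}")


def disassemble(program):
    """Listing tool that follows directly from the opcode table in run()."""
    lines, pc = [], 0
    while pc < len(program):
        op = program[pc]
        if op == PUSH:
            lines.append(f"{pc:04x} PUSH {program[pc + 1]}")
            pc += 2
        else:
            lines.append(f"{pc:04x} {MNEMONICS[op]}")
            pc += 1
    return lines
```

The interpreter loop is the only native code, and it documents the instruction encoding by itself: `disassemble` is little more than `run` with the side effects replaced by print statements. A real architecture has far more opcodes, but the amount of work grows with the size of the instruction set, not with the size of the protected program.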

hgm wrote:But it throws up an extra hurdle. Unfortunately you would also suffer a major speed hit (an order of magnitude or so), so for Chess engines it is not really an option. Encrypting the native machine language, and decrypting only selected parts of it when needed at run time, might be a better solution. Then they would not be able to just disassemble the static binary, but would be forced to run a debugger on it and catch it in action. But it can still be done (e.g. by causing a core dump when the program is in full action).
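The decrypt-on-demand idea can be sketched roughly as follows. Python source stands in for native machine code here, XOR stands in for a real cipher, and the key, the `evaluate` routine and its weights are all hypothetical; the only point is where the plaintext exists and where it does not.

```python
from itertools import cycle

KEY = b"hypothetical-key"  # assumption: placeholder key; XOR is not real protection


def xor_bytes(data, key):
    """XOR data with a repeating key; applying it twice restores the input."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Stand-in for a protected routine (invented weights, not a real engine's eval).
SOURCE = (
    b"def evaluate(material, mobility):\n"
    b"    return 100 * material + 4 * mobility\n"
)

# Only this blob would be present in the static file on disk.
ENCRYPTED = xor_bytes(SOURCE, KEY)

_cache = {}


def load_evaluate():
    """Decrypt and compile the routine the first time it is needed."""
    if "evaluate" not in _cache:
        namespace = {}
        exec(xor_bytes(ENCRYPTED, KEY).decode(), namespace)
        _cache["evaluate"] = namespace["evaluate"]
    return _cache["evaluate"]
```

A static disassembly sees only `ENCRYPTED` plus the small loader. Once `load_evaluate()` has run, however, the decrypted routine sits in memory (here in `_cache`), which is exactly what a debugger or core dump would capture.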

I'm still very much enamored of the idea of the VM performance hit, for reasons in addition to those already mentioned. It would have the benefit of further exposing the misleading nature of uniform-platform rating lists, since the only version rating groups will ever have access to is slower than the author's own. Think about it: of what use are fast versions when they are already hacking off parts of AI cognition, i.e. MHz, books, egtbs, pondering, learning, entire engines, half-words (32 bit versus 64), etc., and then publishing to the whole world that "engine xyz-chess sucks by this Elo amount"? If everyone knew that the "native" version would be much more powerful by some unknown factor, it would restore mystery and anticipation to formal author-entry competitions.

mhull · Post by **mhull** » Sat Sep 03, 2011 10:22 pm

bob wrote:Same problem for a binary. A good CS person can take a binary and convert it to a working C program. In fact, there is software to do this (after a fashion) already. Nothing can be done about it, because at some point the engine has to load into RAM in a normal format that will execute. And once that is done, a debugger can expose the entire thing. Granted, it is in assembly language that is very hard to read/understand without any symbols or procedure names, but that is a hindrance, not a barrier.

There might be a way to conceal the evaluation, at least for the purposes of a hostile person getting a leg up in development or competing under a false flag. Wherever a full eval would normally be done, substitute a trained neural network (NN). You might need to train x number of NNs for different numbers of extant piece combinations. But the advantage is you can't really reverse engineer an NN to find the real eval architecture.
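A rough sketch of what the distributed evaluation might look like under this scheme: one small network per piece-count bucket, shipped as nothing but numbers. Every weight, the two-feature input, the bucket names and the threshold of 10 pieces are made up for illustration; real weights would have to come from training against the private evaluation.

```python
import math

# Hypothetical weights: placeholders, not trained against any real engine.
NETWORKS = {
    "endgame": {
        "w1": [[0.5, -0.2], [0.1, 0.3]],  # hidden layer weights (2 units x 2 features)
        "b1": [0.0, 0.0],                 # hidden layer biases
        "w2": [1.0, -1.0],                # output weights
        "b2": 0.25,                       # output bias
    },
    "middlegame": {
        "w1": [[0.3, 0.7], [-0.4, 0.2]],
        "b1": [0.1, -0.1],
        "w2": [0.8, 0.6],
        "b2": 0.0,
    },
}


def select_network(piece_count):
    """Pick a network by piece count (the threshold is an arbitrary assumption)."""
    return "endgame" if piece_count <= 10 else "middlegame"


def nn_eval(features, piece_count):
    """Forward pass of a one-hidden-layer network with tanh activation."""
    net = NETWORKS[select_network(piece_count)]
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    return sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
```

Someone who extracts `NETWORKS` gets a working evaluation but no terms such as "open file bonus" to read off. They can still copy the weights wholesale, so this hides the design rather than preventing reuse.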

If this would represent a slowdown, you wouldn't necessarily care, since this would be a distribution strategy for private and commercial purposes. For competitions and chess servers, you would naturally reserve the fast version for yourself. In this way, end users can never effectively meet you in a competition with a copy of your own program.

bob · Post by **bob** » Sat Sep 03, 2011 10:35 pm

mhull wrote:
hgm wrote:Computer code cannot be protected technically from reverse engineering. Decompilation / disassembly can produce source code. That source code, however, is not the same as THE source code. When programmers write source code, they use names for the memory variables that are helpful and indicative of the function. Like CastlingRights, OpenFileBonus, PawnIsPasser[n], rather than ByteVar472, IntVar989 and Array12[n]. And they might annotate it with 'comments', i.e. helpful remarks that remind them what the function of a section of code is. That makes reading the original source like reading a novel, to an experienced programmer. The decompilation cannot recover that. It requires a human intelligence to painstakingly follow the instructions of the program, noting how it changes the value of variables in response to moves (after having figured out what variables constitute the board), like being set to zero when the King or Rook moves, or being added to the score only when a Rook is on an open file, etc.
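The loss of names can be imitated with a small sketch that rewrites every identifier to a generic one, roughly what a decompiler's output looks like. The sample function `rook_bonus` and the `VarN` naming scheme are invented for illustration, and `ast.unparse` assumes Python 3.9 or later.

```python
import ast

# Invented sample: a readable evaluation fragment with meaningful names.
SOURCE = """
def rook_bonus(rook_on_open_file, open_file_bonus):
    score = 0
    if rook_on_open_file:
        score = score + open_file_bonus
    return score
"""


class StripNames(ast.NodeTransformer):
    """Replace every function, argument and variable name with VarN."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        if name not in self.mapping:
            self.mapping[name] = f"Var{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)  # descend into arguments and body
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        node.id = self._alias(node.id)
        return node


def strip_names(source):
    """Return equivalent source code with all identifiers made meaningless."""
    tree = StripNames().visit(ast.parse(source))
    return ast.unparse(tree)
```

The stripped version computes exactly the same thing, but "add `Var2` to `Var3` if `Var1`" no longer says anything about rooks or open files. Recovering that meaning is the manual work described above.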

This is a lot of work, but it can always be done. It is just a question of how much effort you are prepared to put into it, and that depends on importance and alternatives. It could also be your hobby. Some people like to solve sudokus, others like to disassemble source code.

Hence the only way to protect computer code is to lock it into your vault, and be sure no one ever has access to it. As soon as they can get their hands on an executable, there is no protection, other than that it takes time and effort. The only protection is legal protection, in terms of copyrights and patents. It is like locking your car when you leave it parked alongside a stony road. You know that everyone can get in by simply picking up a stone and smashing the window.
Suppose a software cracker were to disassemble a chess program only to discover no chess program at all? Would he be able to detect that he was using a virtual machine parsing opcodes from an alien (or even notional) machine architecture known only to its creator?

In this way, a private (or commercial) chess program might be distributed with a greatly reduced risk of re-engineering, although it would mean a performance hit for end users. However, this would ensure that only the programmer himself could compile his code onto bare metal for the purpose of entering competitions and chess servers at full performance levels, guaranteeing a perpetual advantage over primitive cloners (and poseurs) at said competitions and chess servers. It would also preserve a proper "feudal spirit" between programmer and devotee.

You've got to distribute enough to make the thing run. Would one try to RE something like a Java Virtual Machine (if it were a secret) so that they could then RE something that runs inside that? If it were important enough, no doubt it would happen.

mar · Post by **mar** » Sat Sep 03, 2011 11:23 pm

hgm wrote:In fact this is exactly what people would find if they disassembled the binary of my engine Usurpator II. They would find an emulator, and if they are smart, they would actually recognize it as a 6502 emulator.

Hey! I once wrote a 6502 emulator in 386 assembler. In fact it was a never-released Atari800 emulator (I was able to run it under DosBox so the 6502 progs were emulated twice; very slow btw). I bet I would still remember most 6502 instructions. I always envied Z80 users. Is Usurpator free? I'd like to see it in action!

But back to topic. I agree with everything you wrote. There is NO way to prevent disassembly; one can only delay it. Plus, aggressive protection means annoyance to end users. I remember StarForce (used to protect video games, where piracy is a major issue) resetting the computer when it detected Nero accessing "protected" dirs or something like that. It ran in the background as a hidden driver, slowing everything down. So basically a virus (or at least very intrusive software), silently ignored by AV programs. Of course pirated/cracked versions had no such problems.

mhull · Post by **mhull** » Sat Sep 03, 2011 11:29 pm

bob wrote: You've got to distribute enough to make the thing run. Would one try to RE something like a Java Virtual Machine (if it were a secret) so that they could then RE something that runs inside that? If it were important enough, no doubt it would happen.

Supposing Crafty were a private engine: cross-compile it with gcc to s/390 (but tell no one). Then distribute the program with an s/390 emulator to process your s/390 executable. Somehow wrap this all up in a single executable. Again, reveal this to no one.

1) How quickly could someone figure out what was going on?
2) After that, how quickly could they figure out it was s/390?
3) After that, how quickly could someone get up to speed and acquire either free or non-free tools for s/390 disassembly (assuming no initial familiarity)?

Sounds vastly more difficult than just cracking a native executable.

bob · Post by **bob** » Sun Sep 04, 2011 6:22 am

mhull wrote:
bob wrote: You've got to distribute enough to make the thing run. Would one try to RE something like a Java Virtual Machine (if it were a secret) so that they could then RE something that runs inside that? If it were important enough, no doubt it would happen.
Supposing Crafty were a private engine: cross-compile it with gcc to s/390 (but tell no one). Then distribute the program with an s/390 emulator to process your s/390 executable. Somehow wrap this all up in a single executable. Again, reveal this to no one.

1) How quickly could someone figure out what was going on?
2) After that, how quickly could they figure out it was s/390?
3) After that, how quickly could someone get up to speed and acquire either free or non-free tools for s/390 disassembly (assuming no initial familiarity)?

Sounds vastly more difficult than just cracking a native executable.

Certainly harder. No argument. Big loss of performance, of course, as that emulator will probably be at LEAST a 10x performance hit, since the instruction set architectures (between S/390 and X86) are so different.

How much harder it would be is a tough one to call, but would it be worth that huge performance loss???

hgm · Post by **hgm** » Sun Sep 04, 2011 9:15 am

mar wrote:Is Usurpator free? I'd like to see it in action!

http://home.hccnet.nl/h.g.muller/usurpatorIIemu.exe

It runs as a WinBoard engine, but it ignores time control. To tune its average thinking time, it needs a numeric parameter on the command line (which is the full-width search depth in quarter ply). In the ChessWar promo division (40 moves/20 min) it has been running with this parameter set to 14. In the old days, on a real 6502, I used 12.

mar · Post by **mar** » Sun Sep 04, 2011 9:25 am

hgm wrote:
mar wrote:Is Usurpator free? I'd like to see it in action!
http://home.hccnet.nl/h.g.muller/usurpatorIIemu.exe

It runs as a WinBoard engine, but it ignores time control. To tune its average thinking time, it needs a numeric parameter on the command line (which is the full-width search depth in quarter ply). In the ChessWar promo division (40 moves/20 min) it has been running with this parameter set to 14. In the old days, on a real 6502, I used 12.

Thanks. Runs fine!

mhull · Post by **mhull** » Sun Sep 04, 2011 10:41 pm

mhull wrote:
bob wrote:Same problem for a binary. A good CS person can take a binary and convert it to a working C program. In fact, there is software to do this (after a fashion) already. Nothing can be done about it, because at some point the engine has to load into RAM in a normal format that will execute. And once that is done, a debugger can expose the entire thing. Granted, it is in assembly language that is very hard to read/understand without any symbols or procedure names, but that is a hindrance, not a barrier.
There might be a way to conceal the evaluation, at least for the purposes of a hostile person getting a leg up in development or competing under a false flag. Wherever a full eval would normally be done, substitute a trained neural network (NN). You might need to train x number of NNs for different numbers of extant piece combinations. But the advantage is you can't really reverse engineer an NN to find the real eval architecture.

If this would represent a slowdown, you wouldn't necessarily care, since this would be a distribution strategy for private and commercial purposes. For competitions and chess servers, you would naturally reserve the fast version for yourself. In this way, end users can never effectively meet you in a competition with a copy of your own program.

This would conceal the evaluation architecture, which might be where a program spends most of its time. If that's the case, the search portion might be emulated while the NN evaluation runs natively. This might not be as slow as running the entire program under hardware emulation.
