PV Fingerprinting

jwes · Post by **jwes** » Mon Jan 25, 2010 9:43 pm

Dann Corbit wrote:
benstoker wrote: Would the following PV fingerprinting method be a reliable way to detect clones, at least superficially?

Start with any random set of mid and end game FENs.

For each engine, snapshot the best-move PV (a) at a fixed depth and again (b) at a fixed, very short time control.

Then, note matches/nonmatches for every ply.

Then, do the statistics.

After several FENs, you'd have something like 99% for the first ply, 95% for the second, and so on down to 8, 9, or 10 ply.

Then, just average the results for each ply.

This could be generated in minutes. You would have averages based on, say, 100 different positions.

Then, the debate would be how much deviation there must be for an engine not to be considered a clone.

Or would the PV Fingerprinter be useless?
Take the WAC test suite, and unleash 10 top engines on it at 5 minutes per position.

You will get an exact pv match at least 90% of the time out to the analyzed depth or hash table cut-off (often, there are noise nodes pasted on the end from quiescence or some such, but these don't really count).

What did that tell you?

A pv is a plan. If the plan is obvious enough, all the engines will say the same thing, if they know what they are doing.

Try the same with the STS suite and you will get a very different result. I think this might work well for finding clones (exact functional copies of other programs) and may well give information about derived works.
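
One way to act on the WAC-versus-STS point is to drop positions where unrelated engines already agree, since a match there says little about cloning. This is a minimal sketch, assuming the best moves of a few known-independent control engines have already been collected. The function name, the 0.6 threshold, and the sample data are all made up for illustration.

```python
from collections import Counter


def discriminating_positions(control_moves, max_agreement=0.6):
    """Return the positions on which the control engines do not mostly agree.

    control_moves maps a position label (e.g. a FEN) to a list of best moves,
    one per control engine. max_agreement is an arbitrary illustrative cut-off.
    """
    keep = []
    for position, moves in control_moves.items():
        if not moves:
            continue  # no data for this position, so skip it
        # Share of control engines that picked the most popular move.
        top_count = Counter(moves).most_common(1)[0][1]
        agreement = top_count / len(moves)
        if agreement <= max_agreement:
            keep.append(position)
    return keep


# Hypothetical data: four control engines on two positions.
control_moves = {
    "tactical-position": ["Qxh7+", "Qxh7+", "Qxh7+", "Qxh7+"],
    "quiet-position": ["Nd2", "a4", "Re1", "Nd2"],
}
print(discriminating_positions(control_moves))
```

Only the positions that survive this filter would then be used to compare the suspect pair.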

benstoker · Post by **benstoker** » Mon Jan 25, 2010 9:55 pm

A report could look like this:

Positions:  250
Search Fixed Depth: 3
Time Cut-Off: 2 seconds
Engine 1: Rybka 11.2
Engine 2: RobboLito 99999999.01023.21324323.344438787384xvbr
===============================
Percentage Matched (Depth Limited)
PV ply 1:  98%  [= #Matches / Total positions]
PV ply 2: ...
PV ply 3: ...
PV ply 4: ...
PV ply 5: ...
PV ply 6: ...
PV ply 7: ...
PV ply 8: ...
--------------
Total Depth Limited Average Matches through x ply: ___

===============================
[Repeat for Time Limited]

===============================
[Comparisons to known control studies]
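
The per-ply percentages in such a report could be computed roughly as follows. This is only a sketch under stated assumptions: the PVs are taken as already collected (one list of moves per position, in the same position order for both engines), each ply is compared independently, and a PV that is too short to reach a ply counts as a non-match. The function names and sample PVs are hypothetical.

```python
def per_ply_match_rates(pvs_a, pvs_b, max_ply=8):
    """Percentage of positions on which both PVs show the same move at each ply."""
    if len(pvs_a) != len(pvs_b):
        raise ValueError("both engines must be run on the same positions")
    if max_ply < 1:
        raise ValueError("max_ply must be at least 1")
    total = len(pvs_a)
    rates = []
    for ply in range(max_ply):
        matches = sum(
            1
            for a, b in zip(pvs_a, pvs_b)
            if len(a) > ply and len(b) > ply and a[ply] == b[ply]
        )
        rates.append(100.0 * matches / total if total else 0.0)
    return rates


def format_report(name_a, name_b, rates):
    """Render the match rates in a layout similar to the report above."""
    lines = [f"Engine 1: {name_a}", f"Engine 2: {name_b}"]
    for ply, rate in enumerate(rates, start=1):
        lines.append(f"PV ply {ply}: {rate:.0f}%")
    average = sum(rates) / len(rates)
    lines.append(f"Average through {len(rates)} ply: {average:.0f}%")
    return "\n".join(lines)


# Hypothetical PVs for four positions (moves as plain strings).
pvs_a = [["e4", "e5", "Nf3"], ["d4", "d5", "c4"], ["Nf3", "d5", "g3"], ["c4", "e5", "Nc3"]]
pvs_b = [["e4", "e5", "Nc3"], ["d4", "Nf6", "c4"], ["Nf3", "d5", "g3"], ["e4", "e5"]]

rates = per_ply_match_rates(pvs_a, pvs_b, max_ply=3)
print(format_report("Engine A", "Engine B", rates))
```

Running the same function once on the depth-limited PVs and once on the time-limited PVs would give the two halves of the report.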

benstoker · Post by **benstoker** » Mon Jan 25, 2010 9:58 pm

Another way to look at it -- serial killers always seem to follow a pattern and that's how they get got.

rjgibert · Post by **rjgibert** » Mon Jan 25, 2010 10:07 pm

Dann Corbit wrote:A pv is a plan. If the plan is obvious enough, all the engines will say the same thing, if they know what they are doing.

Exactly. And if they don't agree, then the small changes made to a program will fool the "fingerprinter" into thinking the cloned program is not a clone. Either way, you get either false positives or false negatives.

IMO, it is easier for the programmer himself to defeat the cloner. In effect, he can incorporate a secret way of asking a program, "Are you crafty?" and have it reply, "Yes, Dr. Hyatt!" I already have an idea of how specifically this can be done, but it can be defeated if how it is done is advertised publicly. I don't think this "downside" is bad enough to make it impractical, however. Besides, I think some type of downside is unavoidable.

metax · Post by **metax** » Mon Jan 25, 2010 11:44 pm

rjgibert wrote:IMO, it is easier for the programmer himself to defeat the cloner. In effect, he can incorporate a secret way of asking a program, "Are you crafty?" and have it reply, "Yes, Dr. Hyatt!" I already have an idea of how specifically this can be done, but it can be defeated if how it is done is advertised publicly. I don't think this "downside" is bad enough to make it impractical, however. Besides, I think some type of downside is unavoidable.

But the cloner will probably notice this while reverse-engineering the code and remove it.
When a clone appears, you have to reveal the secret, and future cloners will remove that feature before releasing their clones.

rjgibert · Post by **rjgibert** » Tue Jan 26, 2010 5:48 am

metax wrote:
rjgibert wrote:IMO, it is easier for the programmer himself to defeat the cloner. In effect, he can incorporate a secret way of asking a program, "Are you crafty?" and have it reply, "Yes, Dr. Hyatt!" I already have an idea of how specifically this can be done, but it can be defeated if how it is done is advertised publicly. I don't think this "downside" is bad enough to make it impractical, however. Besides, I think some type of downside is unavoidable.
But the cloner will probably notice this while reverse-engineering the code and remove it.
When a clone appears, you have to reveal the secret, and future cloners will remove that feature before releasing their clones.

The cloner won't notice, because:

1. They don't know it is there. Why advertise that you've protected yourself against cloners?
2. Even if they think to look for it, they won't know where to look, because it does not require any special code i.e. it can be melded together with ordinary useful code, so it will not give itself away.

#2 above might sound paradoxical, but it is not. There are certain types of algorithms that are useful to a program, but that can have embedded in them information that is unique to a certain program without altering its behavior during normal operation. I won't go into any more detail, since I don't want to make it any easier to defeat the method I have in mind. It can be defeated, but it is a lot of trouble to do so.

Using the idea to nail one cloner does not require releasing the secret. And even if it did, you can set more than one trap. Each trap would be a different secret. After nailing 2 cloners with 2 different secrets, any other prospective cloner will steer clear, because they don't know if there is a 3rd trap, and finding it will be particularly difficult when there are no more traps and they don't know that.

Avoiding all those traps would require so much effort that it would be easier to simply write an independent or at least mostly independent program.
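
Purely as an illustration of the general idea, and not the undisclosed method referred to above: Zobrist hash keys are one kind of "ordinary useful" data that could carry a hidden mark, since any set of random-looking 64-bit values works equally well for hashing. The sketch below assumes the keys are derived from a private secret with HMAC-SHA256; the 781-key layout, the function names, and the secrets are all assumptions made for the example.

```python
import hashlib
import hmac

# A common layout (an assumption here): 12 piece types x 64 squares,
# plus side to move, 4 castling rights, and 8 en passant files.
NUM_KEYS = 12 * 64 + 1 + 4 + 8  # 781


def zobrist_keys(secret: bytes, count: int = NUM_KEYS):
    """Derive 64-bit Zobrist keys deterministically from a private secret."""
    keys = []
    for i in range(count):
        digest = hmac.new(secret, i.to_bytes(4, "big"), hashlib.sha256).digest()
        keys.append(int.from_bytes(digest[:8], "big"))
    return keys


def shared_key_fraction(original, suspect):
    """Fraction of the original keys that also appear in the suspect's table."""
    suspect_set = set(suspect)
    return sum(1 for key in original if key in suspect_set) / len(original)


author_keys = zobrist_keys(b"author-secret")      # hypothetical secret
copied_keys = list(author_keys)                   # a clone that kept the table
unrelated_keys = zobrist_keys(b"someone-else")    # an independent program

print(shared_key_fraction(author_keys, copied_keys))
print(shared_key_fraction(author_keys, unrelated_keys))
```

The keys behave like any other random table during normal play, but only the holder of the secret can regenerate them. As the replies in this thread point out, a cloner who suspects this can simply replace the table, so it is a trap rather than a protection.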

Michel · Post by **Michel** » Tue Jan 26, 2010 9:23 am

1. They don't know it is there. Why advertise that you've protected yourself against cloners?
2. Even if they think to look for it, they won't know where to look, because it does not require any special code i.e. it can be melded together with ordinary useful code, so it will not give itself away.

Google for security through obscurity!

rjgibert · Post by **rjgibert** » Tue Jan 26, 2010 10:20 am

Michel wrote:
1. They don't know it is there. Why advertise that you've protected yourself against cloners?
2. Even if they think to look for it, they won't know where to look, because it does not require any special code i.e. it can be melded together with ordinary useful code, so it will not give itself away.
Google for security through obscurity!

I understand what you mean, but I have always found the concept a bit specious. It seems to argue for making our secret passwords public and that programmers may as well publish the source code for the applications they write. Good luck with that. Maintaining security without keeping at least some secrets is problematic.

In the case of cloning, you can't prevent cloners from decompiling your program so that it can be examined. This makes it tough. If you have a solution that's bulletproof, I'd like to hear it.

Dann Corbit · Post by **Dann Corbit** » Tue Jan 26, 2010 8:56 pm

rjgibert wrote:
Michel wrote:
1. They don't know it is there. Why advertise that you've protected yourself against cloners?
2. Even if they think to look for it, they won't know where to look, because it does not require any special code i.e. it can be melded together with ordinary useful code, so it will not give itself away.
Google for security through obscurity!
I understand what you mean, but I have always found the concept a bit specious. It seems to argue for making our secret passwords public and that programmers may as well publish the source code for the applications they write. Good luck with that. Maintaining security without keeping at least some secrets is problematic.

In the case of cloning, you can't prevent cloners from decompiling your program so that it can be examined. This makes it tough. If you have a solution that's bulletproof, I'd like to hear it.

Open source, like Crafty, Fruit, and Glaurung.

OK, OK, a little tongue in cheek. But not entirely.

jwes · Post by **jwes** » Tue Jan 26, 2010 11:22 pm

rjgibert wrote:
Michel wrote:
1. They don't know it is there. Why advertise that you've protected yourself against cloners?
2. Even if they think to look for it, they won't know where to look, because it does not require any special code i.e. it can be melded together with ordinary useful code, so it will not give itself away.
Google for security through obscurity!
I understand what you mean, but I have always found the concept a bit specious. It seems to argue for making our secret passwords public and that programmers may as well publish the source code for the applications they write. Good luck with that. Maintaining security without keeping at least some secrets is problematic.

In the case of cloning, you can't prevent cloners from decompiling your program so that it can be examined. This makes it tough. If you have a solution that's bulletproof, I'd like to hear it.

The problem with "security through obscurity" is that it keeps people from taking more effective security measures. No one is saying to make your secrets public, just to assume they will become public and try to make the system secure anyway.
