Would the following PV fingerprinting method be a reliable way to detect clones, at least superficially?
Start with any random set of middlegame and endgame FENs.
For each engine, snapshot the bestmove PV (a) at fixed depth and again (b) at a fixed, very short time control.
Then, note matches/non-matches for every ply.
Then, do the statistics.
After several FENs, you'd have something like 99% agreement at the first ply, 95% at the second, and so on down to ply 8, 9, or 10.
Then, just average the results for each ply.
This could be generated in minutes. You would have averages based on, say, 100 different positions.
Then, the debate would be how much deviation an engine must show before it is no longer considered a clone.
Or would the PV Fingerprinter be useless?
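The tallying step above can be sketched in a few lines of Python. Everything here (the helper name, the toy PVs) is made up for illustration, and the PVs are assumed to have already been collected as lists of UCI move strings, one pair per test position:

```python
from collections import defaultdict

def per_ply_agreement(pv_pairs, max_ply=10):
    """Given (pv_a, pv_b) pairs -- one pair per test position, each PV a
    list of UCI move strings -- return, for each ply, the fraction of
    positions whose PVs still agree at that ply (prefix match)."""
    matches = defaultdict(int)
    for pv_a, pv_b in pv_pairs:
        for ply in range(max_ply):
            # Stop counting at the first disagreement or short PV.
            if ply >= len(pv_a) or ply >= len(pv_b) or pv_a[ply] != pv_b[ply]:
                break
            matches[ply] += 1
    n = len(pv_pairs)
    return [matches[ply] / n for ply in range(max_ply)]

# Hypothetical PVs from two engines on three positions:
pairs = [
    (["e2e4", "e7e5", "g1f3"], ["e2e4", "e7e5", "b1c3"]),
    (["d2d4", "g8f6", "c2c4"], ["d2d4", "g8f6", "c2c4"]),
    (["e2e4", "c7c5", "g1f3"], ["d2d4", "d7d5", "c2c4"]),
]
rates = per_ply_agreement(pairs, max_ply=3)
print([round(r, 2) for r in rates])  # [0.67, 0.67, 0.33]
```

Prefix matching (rather than counting each ply independently) reflects the idea that once two engines' plans diverge, later coincidental matches are not meaningful.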
PV Fingerprinting
Moderator: Ras
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: PV Fingerprinting
benstoker wrote: [the PV fingerprinting proposal quoted above]
The idea has possibilities. But it would take some testing and analysis to see how prone it would be to produce "false positives"...
-
- Posts: 344
- Joined: Wed Sep 23, 2009 5:56 pm
- Location: Germany
Re: PV Fingerprinting
I rather think of the false negatives... If you completely change the evaluation weights of a clone, or write a new eval, then the selected moves should differ significantly, but the strength will not necessarily drop; only the playing style will change. Nevertheless, it is still a clone.
-
- Posts: 342
- Joined: Tue Jan 19, 2010 2:05 am
Re: PV Fingerprinting
bob wrote: The idea has possibilities. But it would take some testing and analysis to see how prone it would be to produce "false positives"...
benstoker wrote: [the PV fingerprinting proposal quoted above]
Maybe it could have other uses as well. You have a known control engine, and you want to get a rough idea of how close another engine is to the control.
-
- Posts: 20943
- Joined: Mon Feb 27, 2006 7:30 pm
- Location: Birmingham, AL
Re: PV Fingerprinting
metax wrote: I rather think of the false negatives... [quoted above]
That's why you need to carefully choose the positions. For tactical ideas, a search will follow the same path regardless of the evaluation, and that can be recognized when you compare depths, PV moves and the material eval at the end.
It's not an easy task, but it is doable. Some also use bizarre positions that produce known problems in some programs as a way of recognizing them. But then some of us fix such problems from time to time and unintentionally render that detection mechanism invalid.
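Comparing depths, PV moves and scores in practice means pulling them out of each engine's UCI `info` output. Here is a rough sketch of such a parser, with a made-up sample line; since field order can vary between engines, the parser searches by keyword rather than by position:

```python
def parse_uci_info(line):
    """Extract depth, score (cp or mate), and PV from a UCI 'info' line.
    Returns None for lines that carry no PV (e.g. 'info string ...')."""
    tokens = line.split()
    if "pv" not in tokens or "depth" not in tokens:
        return None
    info = {"depth": int(tokens[tokens.index("depth") + 1])}
    if "score" in tokens:
        i = tokens.index("score")
        # e.g. ('cp', 35) for centipawns, or ('mate', 3)
        info["score"] = (tokens[i + 1], int(tokens[i + 2]))
    # In UCI output the PV conventionally runs to the end of the line.
    info["pv"] = tokens[tokens.index("pv") + 1:]
    return info

line = "info depth 12 seldepth 18 score cp 35 nodes 123456 nps 800000 pv e2e4 e7e5 g1f3"
print(parse_uci_info(line))
# {'depth': 12, 'score': ('cp', 35), 'pv': ['e2e4', 'e7e5', 'g1f3']}
```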
-
- Posts: 317
- Joined: Mon Jun 26, 2006 9:44 am
Re: PV Fingerprinting
benstoker wrote: [the PV fingerprinting proposal quoted above]
This idea is a non-starter. There are a variety of simple ways to fool your program. Small changes can produce significant changes in behavior.
For example, one way to defeat your "fingerprinter" is simply to change the way depth is counted by the program. Throw in a couple of other little changes, such as not reporting any information at shallow depths, and this will throw off your "fingerprinter" enough to make it useless.
-
- Posts: 342
- Joined: Tue Jan 19, 2010 2:05 am
Re: PV Fingerprinting
rjgibert wrote: This idea is a non-starter. There are a variety of simple ways to fool your program... [quoted above]
Does this mean that one can draw no conclusions from the fact that a program and its alleged clone show very similar PV lines over a large number of test positions?
-
- Posts: 12792
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: PV Fingerprinting
benstoker wrote: [the PV fingerprinting proposal quoted above]
Take the WAC test suite, and unleash 10 top engines on it at 5 minutes per position.
You will get an exact PV match at least 90% of the time out to the analyzed depth or the hash-table cutoff (often there are noise nodes pasted on the end from quiescence or some such, but these don't really count).
What did that tell you?
A PV is a plan. If the plan is obvious enough, all the engines will say the same thing, if they know what they are doing.
-
- Posts: 317
- Joined: Mon Jun 26, 2006 9:44 am
Re: PV Fingerprinting
benstoker wrote: Does this mean that one can draw no conclusions from the fact that a program and its alleged clone show very similar PV lines over a large number of test positions? [quoted above]
The idea is this. If a program actually searches to a depth of 13 when asked to search to a depth of 10, and the PV up to depth 10 remains unchanged, then that PV is unlikely to be any different across many different (strong) programs, because what the PV should be will be too stable. In effect, the extra plies searched simulate a program that searches to depth 10 with a super-accurate eval; in effect, a completely different eval.
You can try to get around this by carefully selecting the test positions, but then your test positions will become known to the cloner. You need to use a randomly generated set that can't be anticipated, but this has problems too.
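One cheap way to make the set hard to anticipate is to draw it freshly from a large pool at test time, with a seed chosen on the spot. A minimal sketch with a stand-in pool (real use would load thousands of FENs from a file; the names here are illustrative):

```python
import random

def sample_test_positions(pool, n, seed=None):
    """Draw n distinct positions from a large pool. Choosing the seed at
    test time means the cloner cannot tune against a fixed, published
    suite, while the same seed lets the test be reproduced later."""
    rng = random.Random(seed)
    return rng.sample(pool, n)

# Stand-in pool; in practice these would be FEN strings.
pool = [f"position-{i}" for i in range(1000)]
suite = sample_test_positions(pool, 100, seed=20100115)
print(len(suite), len(set(suite)))  # 100 100
```

Keeping the seed secret until after the test run gives unpredictability; publishing it afterwards gives reproducibility.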
-
- Posts: 342
- Joined: Tue Jan 19, 2010 2:05 am
Re: PV Fingerprinting
Dann Corbit wrote: Take the WAC test suite, and unleash 10 top engines on it at 5 minutes per position... [quoted above]
But to just get a so-called "fingerprint", why not limit it to 2 ply or 1 second? In fact, the shorter the better, because the further out you go, the more the good programs start to converge.
Anyway, why do people seem to get something out of looking at matching PVs to detect clones? You also seem to be saying that matching PVs says nothing about whether two programs are similar or are clones.
--Curious