Critter 1.2 SEEMS to be a member of the Ippo family

Don · Post by **Don** » Fri Aug 24, 2012 12:04 am

Laskos wrote:
Don wrote:
Laskos wrote:
noctiferus wrote: e-mail would be fine.
Thx again: very kind of you!
Ok, send me a PM with your e-mail address, I will send you the "similarity.data" file (~400kB), which contains all the ~8000 moves, in algebraic notation, chosen by each engine. You can perform bootstrapping, for example, or anything you want. This file can be opened in notepad to see what's there. I never used it for bootstrapping, but Miguel did, so you had better ask him how to deal with this file.

Kai
Here is my take on the sim tool.

It was designed to measure the similarity in move choice between 2 different programs. That's a very simple concept and all it does is simply counting. It gets a lot more complicated if you make assumptions about what it is supposed to measure.
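As a rough illustration of the "simply counting" idea, a minimal sketch might look like the following. The function name and the move lists are made up for illustration and are not taken from the actual tool; the only assumption is that each engine produces one move per test position.

```python
def similarity(moves_a, moves_b):
    """Percentage of positions in which both engines chose the same move."""
    if len(moves_a) != len(moves_b):
        raise ValueError("both engines must be run on the same positions")
    if not moves_a:
        raise ValueError("need at least one position")
    # Count the positions where the two move choices are identical.
    matches = sum(1 for a, b in zip(moves_a, moves_b) if a == b)
    return 100.0 * matches / len(moves_a)


# Hypothetical move lists, one move per test position.
engine_a = ["e2e4", "g1f3", "d2d4", "c2c4", "e1g1"]
engine_b = ["e2e4", "g1f3", "c2c4", "c2c4", "f1e1"]
print(similarity(engine_a, engine_b))  # 3 of 5 positions match -> 60.0
```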

The issue of piece square tables came up. The test was not designed to measure piece square tables, only move choice. If your program is heavily influenced by piece square tables then I would expect that to make a big difference.

If your program COPIES piece square tables from other programs, then you should expect to get more similarity because your program has plagiarized elements from that other chess program. That should not be a surprise to anyone. I would further expect that if you copy even more evaluation concepts and exact values for them from other programs you will likely get more move choices that are the same. I don't believe this is rocket science.
Now that you have debunked the PST tale of Richard Vida, do you agree that to make your program behave differently under Sim, you have to change much more than PST?

I would agree absolutely about this.

I do not believe the search has much impact on the results at all because you can run the same program at widely disparate depths and get extremely high similarity. So it's my hypothesis that search has a very small (but not zero) impact on the similarity.
Search does have some impact: for a factor of 100 in time the similarity may shift by 5-7%, which is not a very small variation when the whole range from unrelated to very related is some 20%. Therefore CSVN, when setting a 60% limit, must set the time control too, say 100ms on one modern core.

I agree, search does have some impact but it takes an order of magnitude more time to have much of an impact. I have seen people use the tool by carefully adjusting the scale factor to equalize the ELO ratings of the programs they are testing and I believe none of that is necessary. If you are trying to "catch" a suspected plagiarist, chances are "his" program is in the same general class as the program you are comparing to anyway.

If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.

I never claimed the tool proves that a program was cloned and in fact I have always made that disclaimer, so anyone critical of it on that basis is the one making claims and jumping to conclusions. The very first version of the tool was advertised as a clone detector in this forum for the impact and sensationalism to get everyone's attention but even in that first post I made the disclaimer. Several times since then I wrote that we need to gain more experience with it in order to understand how it works.

So is it a good clone detection utility? No. Naum got very high correlation with Rybka and it turns out the reason was that they used automated tuning methods to MAKE it play like Rybka - evidently they were successful. I don't think anyone believes the Naum programmer plagiarized Rybka.
Do you still believe that? Could you tune Komodo, with whatever methods, to play like Rybka or even Shredder? It must be a humongous task. I believe it's another myth; the similarity of Naum (and of Fritz, if I remember) to Rybka (Strelka) was unusually high.

I don't really know anything about Naum. But I have never heard anyone complain about it so I assume it is clean. There should be pretty convincing evidence before going after someone over something like this.

I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.

I suggested one possible use of it long ago - as a tool to clear programs. I am personally more comfortable using it to clear people than to convict them and put this in the same class as a polygraph test, that it should be used as an investigation tool only - not admissible as proof of clonesmanship.

It appears that it does what it does pretty effectively however. Richard Vida says he copied the piece square tables and the tool picked this up.

Don
I think that the tool as used by CSVN could be well applied to erase suspicions in tournaments. If some do not like it, it can even be portrayed as a tool to measure "diversity" of engines (nobody can deny that it does measure diversity), and that the tournaments need high "diversity" of engines, while we know what it means with the Sim tester, and what kind of suspicions there really are.

Kai

Laskos · Post by **Laskos** » Fri Aug 24, 2012 12:30 am

Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.

Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).

I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.

I think it's impossible. It's hard to tune even for strength. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai

Don · Post by **Don** » Fri Aug 24, 2012 12:37 am

Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai

You might be right. I can imagine that if your program had few evaluation features, it would be hard to tune it a certain way. But if you had a LOT of evaluation features it would be easier to mimic some other program.

Of course actually doing the tuning itself can be a very difficult problem, something that would have to be done with a technique that is capable of exploring a large search space such as simulated annealing or GA.
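To make the idea concrete, here is a toy sketch of simulated annealing that adjusts evaluation weights so that move choices match a target. Everything in it is an assumption for illustration: the "engine" is just a linear score over made-up feature vectors with no search at all, and the names `choose`, `matches` and `anneal` are hypothetical. It is not how Komodo or any real program is tuned.

```python
import math
import random


def choose(weights, position):
    """Index of the candidate move whose feature vector scores highest."""
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in position]
    return scores.index(max(scores))


def matches(weights, positions, target_choices):
    """Number of positions where these weights pick the target's move."""
    return sum(choose(weights, p) == t for p, t in zip(positions, target_choices))


def anneal(positions, target_choices, n_features, steps=2000, seed=0):
    rng = random.Random(seed)
    current = [0.0] * n_features
    current_score = matches(current, positions, target_choices)
    best, best_score = current[:], current_score
    for step in range(steps):
        temperature = 1.0 - step / steps  # linear cooling from 1 towards 0
        # Perturb one randomly chosen weight.
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] += rng.gauss(0.0, 0.5)
        cand_score = matches(candidate, positions, target_choices)
        delta = cand_score - current_score
        # Always accept improvements; accept worse candidates with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / max(temperature, 1e-9)):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = current[:], current_score
    return best, best_score


# Made-up data: 40 positions, 3 candidate moves each, 4 features per move.
data_rng = random.Random(1)
n_features, n_moves = 4, 3
positions = [
    [[data_rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(n_moves)]
    for _ in range(40)
]
target_weights = [1.0, -0.5, 2.0, 0.3]  # hypothetical "target program"
target_choices = [choose(target_weights, p) for p in positions]

start = matches([0.0] * n_features, positions, target_choices)
tuned, score = anneal(positions, target_choices, n_features)
print(start, score, len(positions))
```

With a real engine each evaluation of a candidate would mean re-running the whole position set through a search, which is one reason the tuning itself would be so expensive.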

Laskos · Post by **Laskos** » Fri Aug 24, 2012 12:51 am

Don wrote:
Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai
You might be right. I can imagine that if your program had few evaluation features, it would be hard to tune it a certain way. But if you had a LOT of evaluation features it would be easier to mimic some other program.

Of course actually doing the tuning itself can be a very difficult problem, something that would have to be done with a technique that is capable of exploring a large search space such as simulated annealing or GA.

Yes, that's why I am saying that tuning to a strong program in order to gain strength seems unrealistic, when the other program probably has a comparable (large) set of features acting in an unknown way. One maybe could tune Komodo to mMax, though.

Kai

Rebel · Post by **Rebel** » Fri Aug 24, 2012 5:35 pm

Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. It's hard to tune even for strength. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai

It's very hard to change the playing style of a program. As an experiment someone took Fruit 2.1 and modified the WHOLE evaluation to the EXACT Rybka 1.0 values. And similarity only reported a 4% increase. Imagine that.

bob · Post by **bob** » Fri Aug 24, 2012 5:54 pm

Rebel wrote:
Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. It's hard to tune even for strength. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai
It's very hard to change the playing style of a program. As an experiment someone took Fruit 2.1 and modified the WHOLE evaluation to the EXACT Rybka 1.0 values. And similarity only reported a 4% increase. Imagine that.

I think this is starting at the WRONG end.

If you copy A and create a program "B", and you modify the eval extensively, B won't look anything like A. But, if you keep the same eval, and modify the search extensively, it also won't look anything like A. Bottom line: if the two searches are significantly different, then using the same eval won't produce the same moves. Ditto for two different evals but the same search.

Both contribute.

Trying to just copy the eval to produce a rybka-like result isn't going to happen. Nor will just copying the search.

But one can change the playing style of a program with just a few eval or search changes, but it is not easy to make a program look like another doing that. But it will definitely make the program look different from its original version.

Rebel · Post by **Rebel** » Fri Aug 24, 2012 9:31 pm

bob wrote:
Rebel wrote:
Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. It's hard to tune even for strength. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai
It's very hard to change the playing style of a program. As an experiment someone took Fruit 2.1 and modified the WHOLE evaluation to the EXACT Rybka 1.0 values. And similarity only reported a 4% increase. Imagine that.
I think this is starting at the WRONG end.

If you copy A and create a program "B", and you modify the eval extensively, B won't look anything like A.

But, if you keep the same eval, and modify the search extensively, it also won't look anything like A.

We have been here before. Remove null-move and LMR and the similarity will remain. Why don't you do experiments yourself first? Some of us did.

Bottom line: if the two searches are significantly different, then using the same eval won't produce the same moves. Ditto for two different evals but the same search.

Both contribute.

Trying to just copy the eval to produce a rybka-like result isn't going to happen. Nor will just copying the search.

But one can change the playing style of a program with just a few eval or search changes, but it is not easy to make a program look like another doing that. But it will definitely make the program look different from its original version.

Don · Post by **Don** » Fri Aug 24, 2012 9:39 pm

Rebel wrote:
bob wrote:
Rebel wrote:
Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. It's hard to tune even for strength. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai
It's very hard to change the playing style of a program. As an experiment someone took Fruit 2.1 and modified the WHOLE evaluation to the EXACT Rybka 1.0 values. And similarity only reported a 4% increase. Imagine that.
I think this is starting at the WRONG end.

If you copy A and create a program "B", and you modify the eval extensively, B won't look anything like A.

But, if you keep the same eval, and modify the search extensively, it also won't look anything like A.
We have been here before. Remove null-move and LMR and the similarity will remain. Why don't you do experiments yourself first? Some of us did.

Bottom line: if the two searches are significantly different, then using the same eval won't produce the same moves. Ditto for two different evals but the same search.

Both contribute.

Trying to just copy the eval to produce a rybka-like result isn't going to happen. Nor will just copying the search.

But one can change the playing style of a program with just a few eval or search changes, but it is not easy to make a program look like another doing that. But it will definitely make the program look different from its original version.

What does Bob mean when he says they will "look different?"

The context of this discussion is how they would score on the similarity tool. I have to agree with Ed on this, even if I drastically changed Komodo's search it would still score very similar on the tester. I could modify it to the extent that it was 200 ELO weaker and it would STILL "look the same" from the standpoint of style.

michiguel · Post by **michiguel** » Fri Aug 24, 2012 9:57 pm

Don wrote:
Rebel wrote:
bob wrote:
Rebel wrote:
Laskos wrote:
Don wrote:
If you want to make that adjustment and you know the ratings a good rough rule of thumb is good enough.
Yes, up to a factor of 10 (~200 Elos in ratings) the time control is pretty irrelevant, and one could keep it constant (say 100ms).
I would not know how to tune Komodo to match the moves of some target program. I would never want to do this anyway but if I were determined to do so I could probably figure it out.
I think it's impossible. It's hard to tune even for strength. One can maybe tune a much stronger engine to a really dumb one, but it seems not the case here.

Kai
It's very hard to change the playing style of a program. As an experiment someone took Fruit 2.1 and modified the WHOLE evaluation to the EXACT Rybka 1.0 values. And similarity only reported a 4% increase. Imagine that.
I think this is starting at the WRONG end.

If you copy A and create a program "B", and you modify the eval extensively, B won't look anything like A.

But, if you keep the same eval, and modify the search extensively, it also won't look anything like A.
We have been here before. Remove null-move and LMR and the similarity will remain. Why don't you do experiments yourself first? Some of us did.

Bottom line: if the two searches are significantly different, then using the same eval won't produce the same moves. Ditto for two different evals but the same search.

Both contribute.

Trying to just copy the eval to produce a rybka-like result isn't going to happen. Nor will just copying the search.

But one can change the playing style of a program with just a few eval or search changes, but it is not easy to make a program look like another doing that. But it will definitely make the program look different from its original version.
What does Bob mean when he says they will "look different?"

The context of this discussion is how they would score on the similarity tool. I have to agree with Ed on this, even if I drastically changed Komodo's search it would still score very similar on the tester. I could modify it to the extent that it was 200 ELO weaker and it would STILL "look the same" from the standpoint of style.

There have been extensive experiments done on this by several people: Ed, Kai, Don, Michael (Hart), and particularly Adam, who ran a gazillion engines in many different conditions. Unless someone comes up with an experiment that disproves the notion that search is a factor but _not_ dominant, this is not really useful to debate.

In addition, this is position independent. The same results (with different noise levels, of course) are obtained with different sets of positions. In fact, the bootstrapping (jackknife, to be precise) experiments I ran show that the same results are generally obtained with subsets of the positions, for the most critical branches.
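For readers unfamiliar with the jackknife, here is a minimal sketch of the delete-one variant applied to a single pairwise similarity score. It is only an illustration of the resampling idea, not the procedure used in the experiments mentioned above (which looked at branches across many engines); the function name and the 0/1 match flags are made up.

```python
import math


def jackknife_similarity(match_flags):
    """match_flags: list of 0/1 values, 1 if both engines chose the same move.

    Returns (similarity, jackknife standard error), both as fractions.
    """
    n = len(match_flags)
    if n < 2:
        raise ValueError("need at least two positions")
    total = sum(match_flags)
    full = total / n
    # Recompute the similarity n times, each time leaving one position out.
    leave_one_out = [(total - m) / (n - 1) for m in match_flags]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((x - mean_loo) ** 2 for x in leave_one_out)
    return full, math.sqrt(variance)


# Hypothetical results for 8 positions.
flags = [1, 1, 0, 1, 0, 1, 1, 0]
sim, se = jackknife_similarity(flags)
print(sim, se)
```

If the estimate barely moves when positions are dropped, the result does not hinge on any particular position, which is the sense of "position independent" here.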

Miguel

Uri Blass · Post by **Uri Blass** » Fri Aug 24, 2012 10:48 pm

Rebel wrote: We have been here before. Remove null-move and LMR and the similarity will remain. Why don't you do experiments yourself first? Some of us did.

If my target is to change move choice by a search change, then my first thought is not to remove null move or LMR but to change the order of moves.

The point is that I believe there are many positions where 2 moves have the same score and the move choice depends on which move the program searches first.

Of course you need to search the hash move, good captures and killers first, but after that, or when you have no hash move, no killers and no good captures (that is usually the case at depth 1), you should use the opposite order of moves if you want to change the move choice.
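A small sketch of why ordering matters when scores tie: if a move only replaces the current best when its score is strictly higher, the first of two equal moves wins, so reversing the order flips the choice. The move names and scores below are invented, and `pick_move` is a hypothetical stand-in for a root move loop, not code from any engine.

```python
def pick_move(moves, score):
    """Return the best-scoring move; ties go to the move searched first."""
    best_move, best_score = None, float("-inf")
    for move in moves:
        s = score(move)
        # Strict comparison: a later move with an equal score does not
        # replace the current best.
        if s > best_score:
            best_move, best_score = move, s
    return best_move


# Hypothetical scores in centipawns; two moves are tied for best.
scores = {"a2a3": 10, "h2h3": 10, "b2b3": 5}
moves = ["a2a3", "h2h3", "b2b3"]

print(pick_move(moves, scores.get))                  # a2a3
print(pick_move(list(reversed(moves)), scores.get))  # h2h3
```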
