When will we see HOUDINI in official tournaments?

Rebel · Post by **Rebel** » Sat May 05, 2012 5:53 pm

BubbaTough wrote:
Rebel wrote:If an engine gives just one 60% hit with any other engine that engine is not allowed to play.
I would assume if two engines have a 60+% match, then the first one of them that was published would still be allowed to play. It makes no sense for a derivative to force the original to be banned. A few other amendments would make sense: a match with a program by the same author is permissible, and if the first published engine gives the other engine permission to play, all is good.

It does sound like, if the 60% rule is actually in place, Critter would have to get Houdini's permission to play, as Houdini1.5 was published before Critter1.4 I believe (unless as Graham said the version entered is not 1.4, and does fail the criteria). I haven't seen any indication that it is in place though. I kind of like the rule myself in that it is very concrete, authors can test it themselves, and it does not require sharing precious source code. Perhaps I wouldn't call it clone detection...some name like the "sufficient diversity" rule might carry less stigma. I would guess some of the Houdini disdain for the technique is from the close match of RobboLito 0.085d1 and Houdini 1.00, which carries massive negative implications in this community. If such matches did not imply lifelong disdain for the author and all his future work, and instead just implied the author had some work to do before he could enter tournaments, it might not receive such strong negative reactions from certain authors. After all, it's widely acknowledged that a strong match does not prove wrongdoing. But it also makes sense that tournaments would want entrants that do not always pick the same moves.

-Sam

+1

Although it's better for Richard to stay below the 60%. But in the end it's up to the CSVN.

Albert Silver · Post by **Albert Silver** » Sat May 05, 2012 6:02 pm

Rebel wrote:
BubbaTough wrote:
Rebel wrote:If an engine gives just one 60% hit with any other engine that engine is not allowed to play.
I would assume if two engines have a 60+% match, then the first one of them that was published would still be allowed to play. It makes no sense for a derivative to force the original to be banned. A few other amendments would make sense: a match with a program by the same author is permissible, and if the first published engine gives the other engine permission to play, all is good.

It does sound like, if the 60% rule is actually in place, Critter would have to get Houdini's permission to play, as Houdini1.5 was published before Critter1.4 I believe (unless as Graham said the version entered is not 1.4, and does fail the criteria). I haven't seen any indication that it is in place though. I kind of like the rule myself in that it is very concrete, authors can test it themselves, and it does not require sharing precious source code. Perhaps I wouldn't call it clone detection...some name like the "sufficient diversity" rule might carry less stigma. I would guess some of the Houdini disdain for the technique is from the close match of RobboLito 0.085d1 and Houdini 1.00, which carries massive negative implications in this community. If such matches did not imply lifelong disdain for the author and all his future work, and instead just implied the author had some work to do before he could enter tournaments, it might not receive such strong negative reactions from certain authors. After all, it's widely acknowledged that a strong match does not prove wrongdoing. But it also makes sense that tournaments would want entrants that do not always pick the same moves.

-Sam
+1

Although it's better for Richard to stay below the 60%. But in the end it's up to the CSVN.

Question: what about the derivative of a derivative?

Let's suppose Rybka is 60% Fruit, so it is banned. Ippo is 40% Fruit, but 60% Rybka. Ippo is therefore banned because it is a declared derivative of Rybka (which was also banned by this rule). Then Houdini is 60% Ippo and is banned, though 40% Rybka and 20% Fruit. Then along comes ScoobyDoo, which is 60% Houdini, though 20% Rybka and 1% Fruit. It is also banned.

Of course all this might be moot as they find ways to include code that render this percentage stuff completely useless (which I doubt is that hard frankly).
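
To make the chain in this thought experiment concrete, here is a minimal sketch in Python. It assumes the rule is "an engine is flagged if it matches any earlier-published engine at 60% or more", which is one reading of the proposal in this thread, not a confirmed CSVN rule. The percentages are the hypothetical ones from the post above, not measurements, and `flagged_engines` is a made-up helper name.

```python
# Hypothetical numbers from the thought experiment; not measured data.
THRESHOLD = 60.0  # assumed: "60%" means 60.0 or more

# Engines in assumed order of publication, oldest first.
order = ["Fruit", "Rybka", "Ippo", "Houdini", "ScoobyDoo"]

# Pairwise similarity, keyed as (older, newer).
similarity = {
    ("Fruit", "Rybka"): 60.0,
    ("Fruit", "Ippo"): 40.0,
    ("Rybka", "Ippo"): 60.0,
    ("Fruit", "Houdini"): 20.0,
    ("Rybka", "Houdini"): 40.0,
    ("Ippo", "Houdini"): 60.0,
    ("Fruit", "ScoobyDoo"): 1.0,
    ("Rybka", "ScoobyDoo"): 20.0,
    ("Houdini", "ScoobyDoo"): 60.0,
}


def flagged_engines(order, similarity, threshold=THRESHOLD):
    """Map each flagged engine to the earlier engines it matches too closely."""
    flagged = {}
    for i, newer in enumerate(order):
        for older in order[:i]:
            score = similarity.get((older, newer))
            if score is not None and score >= threshold:
                flagged.setdefault(newer, []).append(older)
    return flagged


result = flagged_engines(order, similarity)
for engine, matches in result.items():
    print(engine, "flagged because of", ", ".join(matches))
```

Under these assumptions every engine except Fruit is flagged, and each one only because of its immediate predecessor, even though ScoobyDoo is just 1% Fruit.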

Rebel · Post by **Rebel** » Sat May 05, 2012 6:08 pm

Uri Blass wrote: I can add that I wonder if it is possible to get a list of original and non-original programs based on the 60% rule.

Note that I can imagine the following case.

engine X.1 is original

Someone copies engine X.1 but is careful to make enough changes to get a similarity of 59.9%, and calls his engine Y.

Later the author of engine X makes a small change and has version X.2, which he wants to enter in a tournament.

After comparing X.2 with Y we find a similarity of 60.1%, so people tell the author of engine X.1 that he is not allowed to participate in a tournament with X.2.

I do not think that it is fair.

Sam already addressed this. X1 has the older origin.
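
A minimal Python sketch of how "older origin" could settle Uri's X.1 / Y / X.2 case. Everything here is an assumption for illustration: the `Engine` fields, the idea that a new version inherits the publication order of its family's first release, and the same-family exception (borrowed from Sam's "same author" amendment). None of it is a published CSVN procedure.

```python
from dataclasses import dataclass
from typing import Optional

THRESHOLD = 60.0  # assumed: a match of 60.0% or more counts as a hit


@dataclass(frozen=True)
class Engine:
    name: str
    family: str                  # lineage this version belongs to (hypothetical)
    family_first_published: int  # order in which the family first appeared


def must_yield(a: Engine, b: Engine, similarity: float,
               threshold: float = THRESHOLD) -> Optional[Engine]:
    """Return the engine that would be refused entry, or None if neither is."""
    if similarity < threshold:
        return None  # below the threshold: no conflict
    if a.family == b.family:
        return None  # versions of the same program may match each other
    if a.family_first_published == b.family_first_published:
        return None  # no older origin can be established
    # The family that appeared later is the one that has to yield.
    return a if a.family_first_published > b.family_first_published else b


x1 = Engine("X.1", family="X", family_first_published=1)
y = Engine("Y", family="Y", family_first_published=2)
x2 = Engine("X.2", family="X", family_first_published=1)

print(must_yield(x1, y, 59.9))  # None: 59.9% is under the threshold
print(must_yield(x2, y, 60.1))  # Y yields, because family X has the older origin
```

With the origin attached to the family rather than to the individual version, X.2 keeps its place and Y is the one that has work to do.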

Rebel · Post by **Rebel** » Sat May 05, 2012 6:20 pm

Harvey Williamson wrote: Only if all engines are supplied to the CSVN. How can they possibly test Rybka Cluster? Although the rules say the exe should be attached to the entry form?! This is a ridiculous request as many work on the engine right up to the start and during the tournament.

And why should that be a problem?

In theory, program X could secretly play a few rounds using program Y, which is 100-200 Elo stronger. How can you be sure this never happened? Some basic trust should remain and not be sacrificed. Or did you check every executable in Tilburg before each round?

IWB · Post by **IWB** » Sat May 05, 2012 6:55 pm

Hello Ed,

Rebel wrote: I have the feeling you haven't got the point of the system.
...
Robbolito is linked to Ippolit from an unknown author, end of story.

Rybka then.
  1) Fruit 2.1      (time: 100 ms  scale: 1.0)
  2) Rybka 1        (time: 100 ms  scale: 1.0)
  3) Rybka 2.3.2a   (time: 100 ms  scale: 1.0)
  4) Rybka 3        (time: 100 ms  scale: 1.0)
  5) Rybka 4        (time: 100 ms  scale: 1.0)

         1     2     3     4     5
  1.  ----- 54.43 52.63 47.83 47.58
  2.  54.43 ----- 61.92 52.86 52.68
  3.  52.63 61.92 ----- 57.71 56.26
  4.  47.83 52.86 57.71 ----- 59.31
  5.  47.58 52.68 56.26 59.31 -----
None of the Rybka versions show a 60% similarity with Fruit. Rybka 3 and 4 with 47% are above all suspicion.
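
As a quick cross-check of the table above, here is a small Python sketch that scans the matrix for pairs at or above 60%. The numbers are copied from the table; the function name `pairs_at_or_above` and the "60.0 or more" reading of the threshold are assumptions.

```python
names = ["Fruit 2.1", "Rybka 1", "Rybka 2.3.2a", "Rybka 3", "Rybka 4"]

# Similarity percentages copied from the table; None marks the diagonal.
matrix = [
    [None, 54.43, 52.63, 47.83, 47.58],
    [54.43, None, 61.92, 52.86, 52.68],
    [52.63, 61.92, None, 57.71, 56.26],
    [47.83, 52.86, 57.71, None, 59.31],
    [47.58, 52.68, 56.26, 59.31, None],
]


def pairs_at_or_above(names, matrix, threshold=60.0):
    """List each unordered pair of engines whose similarity reaches the threshold."""
    hits = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if matrix[i][j] >= threshold:
                hits.append((names[i], names[j], matrix[i][j]))
    return hits


hits = pairs_at_or_above(names, matrix)
fruit_max = max(matrix[0][1:])  # highest similarity of any Rybka version to Fruit

print(hits)       # [('Rybka 1', 'Rybka 2.3.2a', 61.92)]
print(fruit_max)  # 54.43
```

The only pair over the line is Rybka 1 with Rybka 2.3.2a, two versions by the same author, and the closest any Rybka version gets to Fruit 2.1 in this table is 54.43%.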

And I think you don't get my point.

1. You have a rule which you can't enforce if you allow remote engines (and a cluster doubles the problem).

2. I miss the comparison of R4/R4.1 to the Robbos (especially R0.85 to R0.9). If there is a similarity of 0.6, what are you doing? (And if it is lower than 0.6 there is still point 1. Can someone do this test please?)

3. I assume you will make the point that the actual playing version might be different (as you do with Critter) ... but that is TRUE for ALL participants, which you can't control (see point 1).

Again, as it stands it is a LEX-Rybka, as you do not compare Rybka to the Robbos AND you can't control Rybka at all. Therefore it is absolutely useless!

Bye
Ingo

michiguel · Post by **michiguel** » Sat May 05, 2012 7:06 pm

IWB wrote:Hello Ed,
Rebel wrote: I have the feeling you haven't got the point of the system.
...
Robbolito is linked to Ippolit from an unknown author, end of story.

Rybka then.
  1) Fruit 2.1      (time: 100 ms  scale: 1.0)
  2) Rybka 1        (time: 100 ms  scale: 1.0)
  3) Rybka 2.3.2a   (time: 100 ms  scale: 1.0)
  4) Rybka 3        (time: 100 ms  scale: 1.0)
  5) Rybka 4        (time: 100 ms  scale: 1.0)

         1     2     3     4     5
  1.  ----- 54.43 52.63 47.83 47.58
  2.  54.43 ----- 61.92 52.86 52.68
  3.  52.63 61.92 ----- 57.71 56.26
  4.  47.83 52.86 57.71 ----- 59.31
  5.  47.58 52.68 56.26 59.31 -----
None of the Rybka versions show a 60% similarity with Fruit. Rybka 3 and 4 with 47% are above all suspicion.
And I think you don't get my point.

1. You have a rule which you can't enforce if you allow remote engines (and a cluster doubles the problem).

2. I miss the comparison of R4/R4.1 to the Robbos (especially R0.85 to R0.9). If there is a similarity of 0.6, what are you doing? (And if it is lower than 0.6 there is still point 1. Can someone do this test please?)

3. I assume you will make the point that the actual playing version might be different (as you do with Critter) ... but that is TRUE for ALL participants, which you can't control (see point 1).

Again, as it stands it is a LEX-Rybka, as you do not compare Rybka to the Robbos AND you can't control Rybka at all. Therefore it is absolutely useless!

Bye
Ingo

In theory, a remote cluster could be checked better than before. If it behaves like a WB or UCI engine, you can connect it to a local computer. There, you can run the tests and later, that same computer could be used to play.

I am not advocating for or against this method, but I do not believe that the presence of clusters is a valid criticism in theory, particularly when it could hardly be worse than the current situation.

Miguel

Harvey Williamson · Post by **Harvey Williamson** » Sat May 05, 2012 7:10 pm

IWB wrote:Hello Ed,
Rebel wrote: I have the feeling you haven't got the point of the system.
...
Robbolito is linked to Ippolit from an unknown author, end of story.

Rybka then.
  1) Fruit 2.1      (time: 100 ms  scale: 1.0)
  2) Rybka 1        (time: 100 ms  scale: 1.0)
  3) Rybka 2.3.2a   (time: 100 ms  scale: 1.0)
  4) Rybka 3        (time: 100 ms  scale: 1.0)
  5) Rybka 4        (time: 100 ms  scale: 1.0)

         1     2     3     4     5
  1.  ----- 54.43 52.63 47.83 47.58
  2.  54.43 ----- 61.92 52.86 52.68
  3.  52.63 61.92 ----- 57.71 56.26
  4.  47.83 52.86 57.71 ----- 59.31
  5.  47.58 52.68 56.26 59.31 -----
None of the Rybka versions show a 60% similarity with Fruit. Rybka 3 and 4 with 47% are above all suspicion.
And I think you don't get my point.

1. You have a rule which you can't enforce if you allow remote engines (and a cluster doubles the problem).

2. I miss the comparison of R4/R4.1 to the Robbos (especially R0.85 to R0.9). If there is a similarity of 0.6, what are you doing? (And if it is lower than 0.6 there is still point 1. Can someone do this test please?)

3. I assume you will make the point that the actual playing version might be different (as you do with Critter) ... but that is TRUE for ALL participants, which you can't control (see point 1).

Again, as it stands it is a LEX-Rybka, as you do not compare Rybka to the Robbos AND you can't control Rybka at all. Therefore it is absolutely useless!

Bye
Ingo

Good post, Ingo. It raises some real practical difficulties with the new CSVN rule that says all exes should be attached to the entry. I bet nobody attached an exe to the entry, and for good reason: the exe that competes is not ready until a few hours before the tournament and can change during it. I also have no idea how they will run something like Rybka cluster through their new test.

IWB · Post by **IWB** » Sat May 05, 2012 7:14 pm

michiguel wrote:
In theory, a remote cluster could be checked better than before. If it behaves like a WB or UCI engine, you can connect it to a local computer. There, you can run the tests and later, that same computer could be used to play.

I am not advocating for or against this method, but I do not believe that the presence of clusters is a valid criticism in theory, particularly when it could hardly be worse than the current situation.

Miguel

You named it: in theory!

1. Do they get free access to all systems? Local and especially remote.
2. No one can control what exe is running on the remote engine or how it might be configured.
(3. Speculative: How does a cluster perform in such a (short time) test? No statistical data is available for comparison.)

I am VERY sceptical about all this.

Does anyone have the comparison of R4/4.1 to the Robbos?

Bye
Ingo

LudiBuda · Post by **LudiBuda** » Sat May 05, 2012 7:16 pm

You forgot to read the last rule:
- Rybka is above all rules.

The similarity test is laughable. It's actually worse (doing more damage to CC) than doing nothing.
The only true check is disassembling the engines. You don't need more than a day or two in most cases to see if there is something suspicious.

Uri Blass · Post by **Uri Blass** » Sat May 05, 2012 7:16 pm

IWB wrote:Hello Ed,
Rebel wrote: I have the feeling you haven't got the point of the system.
...
Robbolito is linked to Ippolit from an unknown author, end of story.

Rybka then.
  1) Fruit 2.1      (time: 100 ms  scale: 1.0)
  2) Rybka 1        (time: 100 ms  scale: 1.0)
  3) Rybka 2.3.2a   (time: 100 ms  scale: 1.0)
  4) Rybka 3        (time: 100 ms  scale: 1.0)
  5) Rybka 4        (time: 100 ms  scale: 1.0)

         1     2     3     4     5
  1.  ----- 54.43 52.63 47.83 47.58
  2.  54.43 ----- 61.92 52.86 52.68
  3.  52.63 61.92 ----- 57.71 56.26
  4.  47.83 52.86 57.71 ----- 59.31
  5.  47.58 52.68 56.26 59.31 -----
None of the Rybka versions show a 60% similarity with Fruit. Rybka 3 and 4 with 47% are above all suspicion.
And I think you don't get my point.

1. You have a rule which you can't enforce if you allow remote engines (and a cluster doubles the problem).

2. I miss the comparison of R4/R4.1 to the Robbos (especially R0.85 to R0.9). If there is a similarity of 0.6, what are you doing? (And if it is lower than 0.6 there is still point 1. Can someone do this test please?)

3. I assume you will make the point that the actual playing version might be different (as you do with Critter) ... but that is TRUE for ALL participants, which you can't control (see point 1).

Again, as it stands it is a LEX-Rybka, as you do not compare Rybka to the Robbos AND you can't control Rybka at all. Therefore it is absolutely useless!

Bye
Ingo

For point 2, I think that a similarity of at least 0.6 means the Robbos are not accepted, because Rybka3 came before the Robbos, and I assume that it is possible to show more than 0.6 similarity between Rybka3 and Rybka4.1.
