A Simple Experiment for Advancing the Discussion

chrisw · Post by **chrisw** » Sun Aug 31, 2008 9:40 pm

ROTFL!!!

This is NOT about who is the geekiest programmer, Bob!

You will of course write something as totally different to my load of old crappy cobblers because you have to, to prove your point. The reason you were being difficult was that you know perfectly well that Alex's experiment would leave you antis with a bad result.

Nevertheless, I think it is now demonstrated, no matter how crap my code was, how inefficient, how it loses the geek competition, having seen it, and not caring for efficiency, time, memory, Alex's brilliant expression

psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.

advances massively the case that the trivial UCI code comparison is meaningless as a proof of any form of cut 'n paste copying.

bob wrote:
chrisw wrote:Hi Alex,

Well, unlike Bob, who seems to me to be just trying to be difficult, and in the interests of progress, I wrote some crappy code in C, appended below.
How am I trying to be difficult? You are making your usual assumptions. Nowhere did he mention "C". I did not quite catch the "one" vs "1". But here is my code to do the entire process:

"evaluate":
#!/bin/csh
set noglob
set v1 = `echo $1 | awk -f swap`
set v3 = `echo $3 | awk -f swap`
echo `expr $v1 $2 $v3 | awk -f unswap`

swap:
/zero/{print "0"}
/one/{print "1"}
/two/{print "2"}
/three/{print "3"}
/four/{print "4"}
/five/{print "5"}
/six/{print "6"}
/seven/{print "7"}
/eight/{print "8"}
/nine/{print "9"}

unswap:
/0/{print "zero"}
/1/{print "one"}
/2/{print "two"}
/3/{print "three"}
/4/{print "four"}
/5/{print "five"}
/6/{print "six"}
/7/{print "seven"}
/8/{print "eight"}
/9/{print "nine"}

output:

scrappy% ./evaluate one + two
three
scrappy% ./evaluate nine - four
five
scrappy% ./evaluate two * four
eight
scrappy% ./evaluate nine / three
three
Seems to me that if people are asked blind to produce some code for this problem, they may well write different stuff.

But, if they studied, took a quick look, at similar code that already did the job, they might think, well in my case below ..... oh, ok split the input string up into three strings, compare each of those strings with ASCII text data to find a match, set some variables with the match results and perform the desired operation. Oh, and he used strcmp, that's easy, I don't need to go check in my C-guide now ....

And then they write their code. No copy. No cut 'n paste, just see the basic outline of the idea and hack it out.

Et voila, betcha the code produced by another program was then similar, even though entirely written by the second programmer.

Why? Because the second programmer looked at the first programmer's ideas, thought why bother reinventing the wheel for something so similar and something so trivial, worked out the ideas behind it and sat down and hacked out the code all by himself. Why even bother optimising? Don't need any speed here, who cares about memory usage. Wham bang done.
Code:
			{
				char		str[] = "four times nine";
				char*		strptr;
				int			i;
				char		substring[3][10];
				char*		substringptr;
				char		numtext[9][8] = {"one","two","three","four","five","six","seven","eight","nine"};
				char		optext[3][8] = {"plus","minus","times"};
				int			x=0,y=0,z=-1;	// defaults in case no word matches
				int			result=0;

				strptr = &str[0];
				i = 0;
				do
				{
					substringptr = &substring[i][0];
					while ((*strptr != ' ') && (*strptr != '\0'))
					{
						*substringptr = *strptr;
						strptr++;
						substringptr++;
					}
					*substringptr = '\0';
					strptr++;
					i++;
				} while (i<3);

				// get x=first num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[0][0]) == 0)
					{
						x=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get y=second num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[2][0]) == 0)
					{
						y=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get operation
				i = 0;
				do
				{
					if (strcmp(&optext[i][0],&substring[1][0]) == 0)
					{
						z=i;
						break;
					}
					i++;
				} while (i<3);	// optext has only 3 entries

				if (z==0) result = x+y;
				if (z==1) result = x-y;
				if (z==2) result = x*y;
				result = result;
			}
RegicideX wrote:It is clear by now that the only piece of evidence worth taking seriously in the Rybka discussion so far is the UCI code -- at least the only public evidence.

But "worth taking seriously" does not mean that it is anywhere close to providing proof or conclusive evidence. The problems are that

1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.

2) The procedures presented are relatively short.

3) The reconstructed code is by no means identical -- there are many dissimilarities.

When two procedures are both relatively short and have extremely similar purpose in most chess engines, the probability of observing similarities is relatively large.

Furthermore

4) The compiling process takes away a lot of the individuality of the program due to various optimization procedures.

5) The conjectural reconstruction of the code can have a bias in the direction of proving similarities. You need to look at all possible ways of reconstructing the code in order to show that the initial source is a clone.

In order to advance the discussion we can perform a simple experiment. A few programmers here can write a simple parser for making simple arithmetic operations.

That is, the user enters a string consisting of literal strings like "one plus two" and the parser should print out the result of the arithmetic operation, in this case 3. To keep matters simple, only one-digit numbers and only one operation should be entered. Thus all operations should be of the form "X operation Y", where X and Y are literal representations of the numbers from zero to nine and "operation" is one of "plus", "minus", or "times". An error message for invalid input could be present.

After writing the program, it should be compiled and then submitted for disassembling. Then we should compare the recreated codes among themselves and see how much similarity there is. If we observe a lot of similarity, comparable to the Rybka/Fruit UCI parser similarity, then the anti-Rybka case falls flat -- at least as far as the UCI code goes. If no two programs have significant similarity, then the UCI code evidence gains more weight.

Of course, this requires some work -- and while I'm willing to write the source code, I am conveniently lacking expertise in disassembling so I can not participate there (which is the hardest part of this exercise).

But if we do have takers this would be one way to move the debate into more objective directions.
Seems to me that anyone looking at that code that you wrote would immediately think "no way, that's horribly long" and would not even look at the code in detail.

You made one bad assumption. "C". Nothing in his original post suggests C. Which means I would choose the most efficient tool I had at hand to make this work. It took 5 minutes total to write and debug the above. The only bug is that one needs to use "set noglob" in the shell they are using or the "*" on the command line will get turned into a filename "glob".

So far we have two approaches. I wonder if anyone else will bite. If you want this written in C, which was not originally stated as a requirement, I can probably do that in about 10 minutes. Actually, for the sake of argumen... this took me exactly 5 minutes to write:
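Code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strcmp() */
int main(int argc, char *argv[]) {
  char *words[10] = {"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"};
  int i, answer, operand1 = -1, operand2 = -1;
  if (argc != 4) {
    printf("usage: %s <word> <operator> <word>\n", argv[0]);
    exit(1);
  }
  /* convert both operand words to numbers */
  for (i=0; i<10; i++) {
    if (!strcmp(argv[1], words[i]))  operand1 = i;
    if (!strcmp(argv[3], words[i]))  operand2 = i;
  }
  if (operand1 < 0 || operand2 < 0) {
    printf("invalid operand\n");
    exit(1);
  }
  /* perform the operation named by the first character of the operator */
  switch (argv[2][0]) {
    case '+':  answer = operand1 + operand2; break;
    case '-':  answer = operand1 - operand2; break;
    case '*':  answer = operand1 * operand2; break;
    case '/':
      if (operand2 == 0) { printf("division by zero\n"); exit(1); }
      answer = operand1 / operand2; break;
    default: printf("invalid operator\n"); exit(1);
  }
  /* convert the result back to a word (i ends at 10 if there is no match) */
  for (i=0; i<10; i++)
    if (i == answer) break;
  if (i < 10)
    printf("%s\n",words[i]);
  else
    printf("answer is > 9, no output produced\n");
  return 0;
}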
here is the output:

scrappy% ./xpr three + five
eight
scrappy% ./xpr seven - three
four
scrappy% ./xpr two * four
eight
scrappy% ./xpr nine / three
three
scrappy% ./xpr four * four
answer is > 9, no output produced
scrappy% ./xpr two % three
invalid operator

Feel free to "compare" since I wasn't about to take the time to look at that mess you wrote... This might be cleaned up a bit to become simpler, still. BTW another 2 minutes and I can make it output results up to ninetynine.

bob · Post by **bob** » Sun Aug 31, 2008 9:40 pm

chrisw wrote:
RegicideX wrote:

All that would change would be to add a small awk script to replace every occurrence of "one" by "1", etc... before doing the same expr call... If you'd like to see the whole thing just let me know...
We're definitely not writing operating systems here. I am also not doubting that one can be unfathomably clever in writing a program -- a look at some "Obfuscated C" competitions should prove that.

But "normal" code writing can produce similar code. Chris W. also makes a good point that there can be a psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.
I'll post something (either code or executable, I haven't decided yet) after the long weekend -- family comes before internet discussions.
Yes, described much better than my long winded picture.

psychological anchoring effect of seeing and studying a piece of code.

Well done. Perfect expression and description. Fits exactly for the trivial pieces of code of the UCI.

No No No. Do you even know what UCI is? Handling the input/output is nowhere near as trivial as reading in three values, two of them numbers. There are multiple keywords. Multiple orders, multiple (required) values per line that can be derived in different ways. Free-format output. Etc.

bob · Post by **bob** » Sun Aug 31, 2008 9:42 pm

Alexander Schmidt wrote:
RegicideX wrote:1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.
OK, show us some similar code in, let's say, Crafty and Fruit. Or Glaurung and Slowchess. Or TSCP and Pepito.

I wait, ty.

Nobody will bite. This is supposedly a frequent occurrence, but only in the two programs being discussed (Fruit and Rybka).

chrisw · Post by **chrisw** » Sun Aug 31, 2008 9:43 pm

bob wrote:
chrisw wrote:
RegicideX wrote:

All that would change would be to add a small awk script to replace every occurrence of "one" by "1", etc... before doing the same expr call... If you'd like to see the whole thing just let me know...
We're definitely not writing operating systems here. I am also not doubting that one can be unfathomably clever in writing a program -- a look at some "Obfuscated C" competitions should prove that.

But "normal" code writing can produce similar code. Chris W. also makes a good point that there can be a psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.
I'll post something (either code or executable, I haven't decided yet) after the long weekend -- family comes before internet discussions.
Yes, described much better than my long winded picture.

psychological anchoring effect of seeing and studying a piece of code.

Well done. Perfect expression and description. Fits exactly for the trivial pieces of code of the UCI.
No No No. Do you even know what UCI is? Handling the input/output is nowhere near as trivial as reading in three values, two of them numbers. There are multiple keywords. Multiple orders, multiple (required) values per line that can be derived in different ways. Free-format output. Etc.

Yes Yes Yes.

Enough for one day

No doubt I will be able to read 20 gazillion fascinating Hyatt posts in the morning .... what fun

Alexander Schmidt · Post by **Alexander Schmidt** » Sun Aug 31, 2008 9:46 pm

bob wrote:Nobody will bite.

Maybe if I ask it again and again?

chrisw · Post by **chrisw** » Sun Aug 31, 2008 9:47 pm

bob wrote:
RegicideX wrote:

All that would change would be to add a small awk script to replace every occurrence of "one" by "1", etc... before doing the same expr call... If you'd like to see the whole thing just let me know...
We're definitely not writing operating systems here. I am also not doubting that one can be unfathomably clever in writing a program -- a look at some "Obfuscated C" competitions should prove that.

But "normal" code writing can produce similar code. Chris W. also makes a good point that there can be a psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.

I'll post something (either code or executable, I haven't decided yet) after the long weekend -- family comes before internet discussions.
Sorry but that "structure" stuff is baloney. Way too many studies on what the human mind can "remember" without long-term memorization practices to force something into long-term memory. You might well remember overall structure "initialize stuff, input a move, do an iterated search that recursively calls an alpha/beta function, endpoints get the result of a static evaluation, etc..." but that will _never_ lead to duplicate code. Once I read all the stuff posted here, I will take a stab at comparing CW's code to mine, just to see how much similarity there is.. Ought to be interesting, even for an incredibly simple task such as the one you suggested...

Yours is deliberately designed to be as different to mine as you can get. Your participation as an experimenter is fatally flawed in this particular problem. N'est ce pas?

Can you spell bias?

bob · Post by **bob** » Sun Aug 31, 2008 9:50 pm

chrisw wrote:ROTFL!!!

This is NOT about who is the geekiest programmer, Bob!

You will of course write something as totally different to my load of old crappy cobblers because you have to, to prove your point. The reason you were being difficult was that you know perfectly well that Alex's experiment would leave you antis with a bad result.

SO, in other words, no matter what test someone devises, even one highly favorable to your perspective, the test can never be fair?

My code is not "geeky". It is about the most straight-forward, brain-dead approach anyone would use. Convert words to numbers. Perform the operation. Convert number back to a word. Don't like the switch, which I use in Crafty in many places? OK:

switch (argv[2][0]) {
case '+': answer = operand1 + operand2; break;
case '-': answer = operand1 - operand2; break;
case '*': answer = operand1 * operand2; break;
case '/': answer = operand1 / operand2; break;
default: printf("invalid operator\n"); exit(1);
}

becomes

if (!strcmp(argv[2], "+"))
  answer = operand1 + operand2;
else if (!strcmp(argv[2], "-"))
  answer = operand1 - operand2;

etc.

I've programmed for a _long_ time. I won't say my first cut is ever horribly sloppy. And if the requirements had been specified a bit differently, I would have changed the way I wrote the code. Perhaps strtok() to parse the operands, rather than letting the shell parse them and passing me pointers to each one.

But this "moving target" has got to stop. You now have two sources to compare, three if you want to modify mine to eliminate that "geeky switch".

So get on about comparing them to point out all the duplicate code.

Nevertheless, I think it is now demonstrated, no matter how crap my code was, how inefficient, how it loses the geek competition, having seen it, and not caring for efficiency, time, memory, Alex's brilliant expression

psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.

advances massively the case that the trivial UCI code comparison is meaningless as a proof of any form of cut 'n paste copying.

You can say it, but it doesn't make it so. The term he is using is _not_ being applied anywhere near correctly. That is a "conceptual anchoring" effect, not a "programming technique anchoring.."

So get off that bandwagon, it isn't even going to leave the starting blocks.


chrisw · Post by **chrisw** » Sun Aug 31, 2008 9:50 pm

bob wrote:
Alexander Schmidt wrote:
RegicideX wrote:1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.
OK, show us some similar code in, lets say, Crafty and Fruit. Or Glaurung and Slowchess. Or TSCP and Pepito.

I wait, ty.
Nobody will bite. This is supposedly a frequent occurrence, but only in the two programs being discussed (Fruit and Rybka).

As Alex's expression, the psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head -- demonstrates, it is a perfectly frequent, natural and entirely legal occurrence when the second programmer has read, studied and absorbed the code of the first.

bob · Post by **bob** » Sun Aug 31, 2008 9:51 pm

chrisw wrote:
bob wrote:
chrisw wrote:
RegicideX wrote:

All that would change would be to add a small awk script to replace every occurrence of "one" by "1", etc... before doing the same expr call... If you'd like to see the whole thing just let me know...
We're definitely not writing operating systems here. I am also not doubting that one can be unfathomably clever in writing a program -- a look at some "Obfuscated C" competitions should prove that.

But "normal" code writing can produce similar code. Chris W. also makes a good point that there can be a psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.
I'll post something (either code or executable, I haven't decided yet) after the long weekend -- family comes before internet discussions.
Yes, described much better than my long winded picture.

psychological anchoring effect of seeing and studying a piece of code.

Well done. Perfect expression and description. Fits exactly for the trivial pieces of code of the UCI.
No No No. Do you even know what UCI is? Handling the input/output is nowhere near as trivial as reading in three values, two of them numbers. There are multiple keywords. Multiple orders, multiple (required) values per line that can be derived in different ways. Free-format output. Etc.
Yes Yes Yes.

Enough for one day

No doubt I will be able to read 20 gazillion fascinating Hyatt posts in the morning .... what fun

As opposed to 20 gazillion irrelevant CW posts? what fun, indeed...

chrisw · Post by **chrisw** » Sun Aug 31, 2008 9:54 pm

What's the point of comparing the code of someone so biased he won't even agree to state that Rybka3 is under no threat in Beijing?

How would anyone imagine you can participate in this experiment? Of course you'll produce code as different as it is possible to get.

Can you spell bias? Bob?

bob wrote:
chrisw wrote:ROTFL!!!

This is NOT about who is the geekiest programmer, Bob!

You will of course write something as totally different to my load of old crappy cobblers because you have to, to prove your point. The reason you were being difficult was that you know perfectly well that Alex's experiment would leave you antis with a bad result.
SO, in other words, no matter what test someone devises, even one highly favorable to your perspective, the test can never be fair?

My code is not "geeky". It is about the most straight-forward, brain-dead approach anyone would use. Convert words to numbers. Perform the operation. Convert number back to a word. Don't like the switch, which I use in Crafty in many places? OK:

switch (argv[2][0]) {
case '+': answer = operand1 + operand2; break;
case '-': answer = operand1 - operand2; break;
case '*': answer = operand1 * operand2; break;
case '/': answer = operand1 / operand2; break;
default: printf("invalid operator\n"); exit(1);
}

becomes

if (!strcmp(argv[2], "+"))
answer = operand1 + operand2;
else if (!strcmp(argv[2], "="))
answer = operand1 - operand2;

etc.

I've programmed for a _long_ time. I won't say my first cut is ever horribly sloppy. And if the requirements had been specified a bit differently, I would have changed the way I wrote the code. Perhaps strtok() to parse the operands, rather than letting the shell parse them and passing me pointers to each one.

But this "moving target" has got to stop. You now have two sources to compare, three if you want to modify mine to eliminate that "geeky switch".

So get on about comparing them to point out all the duplicate code.

Nevertheless, I think it is now demonstrated, no matter how crap my code was, how inefficient, how it loses the geek competition, having seen it, and not caring for efficiency, time, memory, Alex's brilliant expression

psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head.

advances massively the case that the trivial UCI code comparison is meaningless as a proof of any form of cut 'n paste copying.

You can say it, but it doesn't make it so. The term he is using is _not_ being applied anywhere near correctly. That is a "conceptual anchoring" effect, not a "programming technique anchoring.."

So get off that bandwagon, it isn't even going to leave the starting blocks.
bob wrote:
chrisw wrote:Hi Alex,

Well, unlike Bob, who seems to me to be just trying to be difficult, and in the interests of progress, I wrote some crappy code in C, appended below.
How am I trying to be difficult. You are making your usual assumptions. Nowhere did he mention "C". I did not quite catch the "one" vs "1". But here is my code to do the entire process:

"evaluate":
#!/bin/csh
set noglob
set v1 = `echo $1 | awk -f swap`
set v3 = `echo $3 | awk -f swap`
echo `expr $v1 $2 $v3 | awk -f unswap`

swap:
/zero/{print "0"}
/one/{print "1"}
/two/{print "2"}
/three/{print "3"}
/four/{print "4"}
/five/{print "5"}
/six/{print "6"}
/seven/{print "7"}
/eight/{print "8"}
/nine/{print "9"}

unswap:
/0/{print "zero"}
/1/{print "one"}
/2/{print "two"}
/3/{print "three"}
/4/{print "four"}
/5/{print "five"}
/6/{print "six"}
/7/{print "seven"}
/8/{print "eight"}
/9/{print "nine"}

output:

scrappy% ./evaluate one + two
three
scrappy% ./evaluate nine - four
five
scrappy% ./evaluate two * four
eight
scrappy% ./evaluate nine / three
three
Seems top me that if people are asked blind to produce some code for this probnlem, they may well write different stuff.

But, if they studied, took a quick look, at similar code that already did the job, they might think, well in my case below ..... oh, ok split the input string up into three strings, compare each of those strings with ascii text data to find a match, set some variables with the match results and perform the desrired operation. Oh, and he used strcmp, that's easy, I don't need to go check in my C-guide now ....

And then they write their code. No copy. No cut 'n paste, just see the basic outline of the idea and hack it out.

Et voila, betcha the code produced by another program was then similar, even though entirely written by the second programmer.

Why? Because the second programmer looked at the first programmers ideas, thought why bother reinventing the wheel for something so similar and something so trivial, worked out the ideas behind it and sat down and hacked out the code all by himself. Why even bother optimising? Don't need any speed here, who cares about memory usage. Wham bang done.
Code: Select all
			{
				char		str[] = "four times nine";
				char*		strptr;
				int			i;
				char		substring[3][10];
				char*		substringptr;
				char		numtext[9][8] = {"one","two","three","four","five","six","seven","eight","nine"};
				char		optext[3][8] = {"plus","minus","times"};
				int			x,y,z;
				int			result;

				strptr = &str[0];
				i = 0;
				do
				{
					substringptr = &substring[i][0];
					while ((*strptr != ' ') && (*strptr != '\0'))
					{
						*substringptr = *strptr;
						strptr++;
						substringptr++;
					}
					*substringptr = '\0';
					strptr++;
					i++;
				} while (i<3);

				// get x=first num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[0][0]) == 0)
					{
						x=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get y=second num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[2][0]) == 0)
					{
						y=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get operation
				i = 0;
				do
				{
					if (strcmp(&optext[i][0],&substring[1][0]) == 0)
					{
						z=i;
						break;
					}
					i++;
				} while (i<3);

				if (z==0) result = x+y;
				if (z==1) result = x-y;
				if (z==2) result = x*y;
				result = result;
			}
RegicideX wrote:It is clear by now that the only piece of evidence worth taking seriously in the Rybka discussion so far is the UCI code -- at least the only public evidence.

But "worth taking seriously" does not mean that it is anywhere close to providing proof or conclusive evidence. The problems are that

1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.

2) The procedures presented are relatively short.

3) The reconstructed code is by no means identical -- there are many dissimilarities.

When two procedures are both relatively short and have extremely similar purpose in most chess engines, the probability of observing similarities is relatively large.

Furthermore

4) The compiling process takes away a lot of the individuality of the program due to various optimization procedures.

5) The conjectural reconstruction of the code can have a bias in the direction of proving similarities. You need to look at all possible ways of reconstructing the code in order to show that the initial source is a clone.

In order to advance the discussion we can perform a simple experiment. A few programmers here can write a simple parser for making simple arithmetic operations.

That is, the user enters a string consisting of literal strings like "one plus two" and the parser should print out the result of the arithmetic operation, in this case 3. To make matters simple, only one digit numbers and only one operation should be entered. Thus all operations should be of the form "X operation Y" where X and Y are literal representations of the numbers from zero to nine and "operation" should be "plus" "minus" and "times." An error message for invalid input could be present.
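For reference, the specification above can be sketched in C roughly as follows. This is only one possible reading of the task, not any poster's submission: the function names `lookup` and `eval_words`, the buffer sizes, and the choice to return -1 for invalid input are all assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: index of word in table, or -1 if it is absent. */
static int lookup(const char *word, const char *const table[], int n) {
    for (int i = 0; i < n; i++)
        if (strcmp(word, table[i]) == 0)
            return i;
    return -1;
}

/* Evaluate a line of the form "X operation Y".
   Returns 0 and stores the value in *result, or -1 on invalid input. */
int eval_words(const char *line, int *result) {
    static const char *const nums[] = {"zero", "one", "two", "three", "four",
                                       "five", "six", "seven", "eight", "nine"};
    static const char *const ops[] = {"plus", "minus", "times"};
    char a[8], op[8], b[8], extra[2];

    /* %7s bounds each token to its buffer; a fourth token is an error. */
    if (sscanf(line, "%7s %7s %7s %1s", a, op, b, extra) != 3)
        return -1;

    int x = lookup(a, nums, 10);
    int y = lookup(b, nums, 10);
    int z = lookup(op, ops, 3);
    if (x < 0 || y < 0 || z < 0)
        return -1;

    switch (z) {
    case 0:  *result = x + y; break;
    case 1:  *result = x - y; break;
    default: *result = x * y; break;
    }
    return 0;
}
```

A caller would pass the whole input line, for example `eval_words("one plus two", &r)`, and print `r` on success or an error message when the return value is -1.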

After writing the program, it should be compiled and then submitted for disassembling. Then we should compare the recreated codes among themselves and see how much similarity there is. If we observe a lot of similarity, comparable to the Rybka/Fruit UCI parser similarity, then the anti-Rybka case falls flat -- at least as far as the UCI code goes. If no two programs have significant similarity then the UCI code evidence gains more weight.

Of course, this requires some work -- and while I'm willing to write the source code, I am conveniently lacking expertise in disassembling so I cannot participate there (which is the hardest part of this exercise).

But if we do have takers this would be one way to move the debate into more objective directions.
Seems to me that anyone looking at that code that you wrote would immediately think "no way, that's horribly long" and would not even look at the code in detail.

You made one bad assumption. "C". Nothing in his original post suggests C. Which means I would choose the most efficient tool I had at hand to make this work. It took 5 minutes total to write and debug the above. The only bug is that one needs to use "set noglob" in the shell they are using or the "*" on the command line will get turned into a filename "glob".

So far we have two approaches. I wonder if anyone else will bite. If you want this written in C, which was not originally stated as a requirement, I can probably do that in about 10 minutes. Actually, for the sake of argumen... this took me exactly 5 minutes to write:
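#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strcmp */
int main(int argc, char *argv[]) {
  char *words[10] = {"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"};
  int i, answer, operand1 = -1, operand2 = -1;
  /* expect exactly three arguments: operand operator operand */
  if (argc != 4) {
    printf("usage: %s operand operator operand\n", argv[0]);
    exit(1);
  }
  /* translate each operand word into its digit */
  for (i=0; i<10; i++) {
    if (!strcmp(argv[1], words[i]))  operand1 = i;
    if (!strcmp(argv[3], words[i]))  operand2 = i;
  }
  if (operand1 < 0 || operand2 < 0) {
    printf("invalid operand\n");
    exit(1);
  }
  switch (argv[2][0]) {
    case '+':  answer = operand1 + operand2; break;
    case '-':  answer = operand1 - operand2; break;
    case '*':  answer = operand1 * operand2; break;
    case '/':
      if (operand2 == 0) { printf("division by zero\n"); exit(1); }
      answer = operand1 / operand2; break;
    default: printf("invalid operator\n"); exit(1);
  }
  /* translate the answer back into a word if it is a single digit */
  for (i=0; i<10; i++)
    if (i == answer) break;
  if (i < 10)
    printf("%s\n",words[i]);
  else
    printf("answer is > 9, no output produced\n");
  return 0;
}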
here is the output:

scrappy% ./xpr three + five
eight
scrappy% ./xpr seven - three
four
scrappy% ./xpr two * four
eight
scrappy% ./xpr nine / three
three
scrappy% ./xpr four * four
answer is > 9, no output produced
scrappy% ./xpr two % three
invalid operator

Feel free to "compare" since I wasn't about to take the time to look at that mess you wrote... This might be cleaned up a bit to become simpler, still. BTW another 2 minutes and I can make it output results up to ninetynine.
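The extension to results up to ninetynine mentioned above could look roughly like this. It is a sketch under assumptions, not the poster's code: the function name `number_to_words`, the caller-supplied buffer, and the unhyphenated spelling (following "ninetynine" in the post) are all illustrative choices.

```c
#include <stdio.h>

/* Hypothetical helper: write the English name of n (0..99) into buf.
   Returns 0 on success, -1 if n is out of range. */
int number_to_words(int n, char *buf, size_t size) {
    static const char *const small[] = {
        "zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"};
    static const char *const tens[] = {
        "", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"};

    if (n < 0 || n > 99)
        return -1;
    if (n < 20)
        snprintf(buf, size, "%s", small[n]);        /* irregular names */
    else if (n % 10 == 0)
        snprintf(buf, size, "%s", tens[n / 10]);    /* exact multiples of ten */
    else
        snprintf(buf, size, "%s%s", tens[n / 10], small[n % 10]);
    return 0;
}
```

With a helper like this, the final lookup loop in the program above could be replaced by a single call, so that "four * four" would yield "sixteen" instead of the "answer is > 9" message. Negative results would still need separate handling.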
