A Simple Experiment for Advancing the Discussion

bob · Post by **bob** » Sun Aug 31, 2008 10:24 pm

chrisw wrote:
bob wrote:It appears to me that the only "psychological anchoring" going on is that you now have that expression etched into your head and are going to regurgitate it from here on, even though it does _not_ apply to specific programming details in any way.

But don't let a few facts stop you, you haven't yet...
Tell you what, Bob. I'll try to use it as many times as you fill sentences with the word "bullshit". I prefer useful language that says something, not rude "bullshit", if you get my meaning.

I just try to "call it as I see it." And not steal words from other posters, much less copy code...

chrisw · Post by **chrisw** » Sun Aug 31, 2008 10:26 pm

ROTFL!!

You can't participate in an experiment about writing code to see similarities/differences when you've already seen the first piece of code, Bob.

Can you spell bias?

Can you parse "experimenter affecting his own results"?

It's now a thought experiment. Can you spell Einstein?

Can you spell science?

bob wrote:
chrisw wrote:
Code:
			{
				char		str[] = "four times nine";
				char*		strptr;
				int			i;
				char		substring[3][10];
				char*		substringptr;
				char		numtext[9][8] = {"one","two","three","four","five","six","seven","eight","nine"};
				char		optext[3][8] = {"plus","minus","times"};
				int			x,y,z;
				int			result;

				strptr = &str[0];
				i = 0;
				do
				{
					substringptr = &substring[i][0];
					while ((*strptr != ' ') && (*strptr != NULL))
					{
						*substringptr = *strptr;
						strptr++;
						substringptr++;
					}
					*substringptr = NULL;
					strptr++;
					i++;
				} while (i<3);
I thought I would analyze things to show just how _different_ the two approaches presented so far are, and remember that there is nothing in a chess engine that is this simple.

The above parses the input string, which was hard-coded into the program, into three separate strings in an array containing three strings.

I don't have any equivalent code as I chose to let the command-line parsing of any shell take care of that and I simply used the classic argv[] facility in ANSI C to access the three strings as argv[1], argv[2] and argv[3]. Big difference already.
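For readers comparing the two ways of obtaining the three words, here is a minimal sketch of the string-splitting step written with sscanf instead of a hand-rolled pointer loop. The helper name split3 is hypothetical and appears in neither poster's code; the 10-byte word buffers are an assumption copied from the substring[3][10] declaration in the quoted code. It is one possible third variant, not a reconstruction of either program.

```c
#include <stdio.h>

/* Hypothetical helper (not from either post): split "X operation Y"
   into three words. Each buffer holds at most 9 characters plus the
   terminator, mirroring substring[3][10] in the quoted code. */
int split3(const char *input, char out[3][10]) {
  /* %9s skips leading whitespace, then stops at the next whitespace
     or after 9 characters, whichever comes first. sscanf returns the
     number of fields it filled, so anything other than 3 is a failure. */
  return sscanf(input, "%9s %9s %9s", out[0], out[1], out[2]) == 3;
}
```

A caller would pass "four times nine" and read the words back from out[0], out[1] and out[2]. A word longer than 9 characters would be cut and its remainder read as the next word, so this sketch is only suitable for the short vocabulary used in the experiment.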
Code:
				// get x=first num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[0][0]) == 0)
					{
						x=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get y=second num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[2][0]) == 0)
					{
						y=i+1;
						break;
					}
					i++;
				} while (i<9);
That rather convoluted bit of programming looks up each operand "word" to convert it to a number. My approach:
Code:
  char *words[10] = {"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"};
  int i, answer, operand1, operand2;
  for (i=0; i<10; i++) {
    if (!strcmp(argv[1], words[i]))  operand1 = i;
    if (!strcmp(argv[3], words[i]))  operand2 = i;
  }
Does the same thing, much simpler, shorter, easier to understand. And nothing like the code it is equivalent to.
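Both lookups quoted above leave the operand variable unassigned when the word is not in the table (for example "ten"). A minimal sketch of a lookup that reports that case, assuming a hypothetical helper named word_to_digit that is not part of either poster's code:

```c
#include <string.h>

/* Hypothetical helper: map "zero".."nine" to 0..9.
   Returns -1 when the word is not in the table, so the caller can
   print an error instead of computing with an unset operand. */
int word_to_digit(const char *word) {
  static const char *words[10] = {"zero", "one", "two", "three", "four",
                                  "five", "six", "seven", "eight", "nine"};
  int i;
  for (i = 0; i < 10; i++)
    if (strcmp(word, words[i]) == 0)
      return i;
  return -1; /* unrecognized word */
}
```

The table is the same ten-word array as in the post above; only the -1 return path is new.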
Code:
		// get operation
				i = 0;
				do
				{
					if (strcmp(&optext[i][0],&substring[1][0]) == 0)
					{
						z=i;
						break;
					}
					i++;
				} while (i<9);

				if (z==0) result = x+y;
				if (z==1) result = x-y;
				if (z==2) result = x*y;
				result = result;
That determines what the operator is by matching against an array. Since I understood the original request to use "+" and such, I used a simple switch:
Code:
	  switch (argv[2][0]) {
    case '+':  answer = operand1 + operand2; break;
    case '-':  answer = operand1 - operand2; break;
    case '*':  answer = operand1 * operand2; break;
    case '/':  answer = operand1 / operand2; break;
    default: printf("invalid operator\n"); exit(1);
  }
to carry out the operation and complain if an invalid operation was given.
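The original request named the operations as the words "plus", "minus" and "times", while the switch above expects symbols, and with the '/' case an input such as "nine / zero" would divide by zero. A minimal sketch covering both points, assuming two hypothetical helpers (op_symbol and apply_op) that do not appear in the thread:

```c
#include <string.h>

/* Hypothetical helper: accept either the symbol or the word form of an
   operator and return the symbol, or 0 if the token is not recognized. */
char op_symbol(const char *tok) {
  if (!strcmp(tok, "+") || !strcmp(tok, "plus"))  return '+';
  if (!strcmp(tok, "-") || !strcmp(tok, "minus")) return '-';
  if (!strcmp(tok, "*") || !strcmp(tok, "times")) return '*';
  if (!strcmp(tok, "/"))                          return '/';
  return 0;
}

/* Hypothetical helper: store a op b in *out and return 1,
   or return 0 for an unknown operator or a division by zero. */
int apply_op(char op, int a, int b, int *out) {
  switch (op) {
    case '+': *out = a + b; return 1;
    case '-': *out = a - b; return 1;
    case '*': *out = a * b; return 1;
    case '/':
      if (b == 0) return 0; /* refuse to divide by zero */
      *out = a / b;
      return 1;
    default:
      return 0;             /* invalid operator */
  }
}
```

Returning a status instead of calling exit(1) inside the switch is a design choice of this sketch, so that the caller decides how to report the error.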

And somehow, in doing this, the last bit of CW's code was lost (I suppose) in that I can't find what was used to print the final answer. In looking back, it appears he did not do this anyway. Here is mine:
Code:
  for (i=0; i<10; i++)
    if (i == answer) break;
  if (i < 10)
    printf("%s\n",words[i]);
  else
    printf("answer is > 9, no output produced\n");
To show how quickly this was written, the above loop is not needed. That code should be:

if (answer < 10)
printf("%s", words[answer]);
else
...
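The shortened check above tests only the upper bound, so a negative result such as "three - five" would index the array at -2. A minimal sketch with both bounds checked, assuming a hypothetical helper named digit_to_word that is not part of the original program:

```c
#include <stddef.h>

/* Hypothetical helper: map 0..9 back to "zero".."nine".
   Returns NULL for any value outside that range, including negatives. */
const char *digit_to_word(int n) {
  static const char *words[10] = {"zero", "one", "two", "three", "four",
                                  "five", "six", "seven", "eight", "nine"};
  if (n < 0 || n > 9)
    return NULL; /* no single-word answer available */
  return words[n];
}
```

The caller would print the returned word, or a message such as "answer is out of range" when the result is NULL.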
Code:
		}
RegicideX wrote:It is clear by now that the only piece of evidence worth taking seriously in the Rybka discussion so far is the UCI code -- at least the only public evidence.

But "worth taking seriously" does not mean that it is anywhere close to providing proof or conclusive evidence. The problems are that

1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.

2) The procedures presented are relatively short.

3) The reconstructed code is by no means identical -- there are many dissimilarities.

When two procedures are both relatively short and have extremely similar purpose in most chess engines, the probability of observing similarities is relatively large.

Furthermore

4) The compiling process takes away a lot of the individuality of the program due to various optimization procedures.

5) The conjectural reconstruction of the code can have a bias in the direction of proving similarities. You need to look at all possible ways of reconstructing the code in order to show that the initial source is a clone.

In order to advance the discussion we can perform a simple experiment. A few programmers here can write a simple parser for making simple arithmetic operations.

That is, the user enters a string consisting of literal strings like "one plus two" and the parser should print out the result of the arithmetic operation, in this case 3. To make matters simple, only one-digit numbers and only one operation should be entered. Thus all operations should be of the form "X operation Y" where X and Y are literal representations of the numbers from zero to nine and "operation" should be "plus", "minus" or "times." An error message for invalid input could be present.

After writing the program, it should be compiled and then submitted for disassembling. Then we should compare the recreated codes among themselves and see how much similarity there is. If we observe a lot of similarity, comparable to the Rybka/Fruit UCI parser similarity, then the anti-Rybka case falls flat -- at least as far as the UCI code goes. If no two programs have significant similarity then the UCI code evidence gains more weight.

Of course, this requires some work -- and while I'm willing to write the source code, I am conveniently lacking expertise in disassembling so I cannot participate there (which is the hardest part of this exercise).

But if we do have takers this would be one way to move the debate into more objective directions.
So, two possible instances of a program that does a very simple operation. And about the only thing they have in common is the strings "one", "two" stuck in an array. And even that was done differently source-wise.

So who _really_ thinks that independent programmers are going to produce the same code for complex functions? It just doesn't happen.

chrisw · Post by **chrisw** » Sun Aug 31, 2008 10:29 pm

bob wrote:
chrisw wrote:
bob wrote:It appears to me that the only "psychological anchoring" going on is that you now have that expression etched into your head and are going to regurgitate it from here on, even though it does _not_ apply to specific programming details in any way.

But don't let a few facts stop you, you haven't yet...
Tell you what, Bob. I'll try to use it as many times as you fill sentences with the word "bullshit". I prefer useful language that says something, not rude "bullshit", if you get my meaning.
I just try to "call it as I see it." And not steal words from other posters, much less copy code...

If that's what you see, where's your head?

bob · Post by **bob** » Sun Aug 31, 2008 10:29 pm

chrisw wrote:
bob wrote:
chrisw wrote:
bob wrote:
Alexander Schmidt wrote:
RegicideX wrote:1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.
OK, show us some similar code in, let's say, Crafty and Fruit. Or Glaurung and Slowchess. Or TSCP and Pepito.

I wait, ty.
Nobody will bite. This is supposedly a frequent occurrence, but only in the two programs being discussed (Fruit and Rybka).
As Alex's expression demonstrates, the psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head -- is a perfectly frequent, natural and entirely legal occurrence when the second programmer has read, studied and absorbed the code of the first.
The only "anchoring effect" is that you now have "anchored" next to yet another bullshit explanation. You don't anchor to "specific algorithm design and code". That is so far beyond bullshit it is an insult to bullshit. High-level conceptual overviews? Of course. But not low-level implementations, until perhaps you have actually implemented one. Then you might tend to do it the same way next time, if you are a beginner. If you are sophisticated you are always looking for _better_ ways, not "familiar" ways.
Yup. The essential importance of optimisation to geekiest potential of the UCI code. How important it is. Takes so much computational time, so much memory. Not.

Let Vas spend his efforts bringing UCI code up to Bob standards. Such an important use of time. Not.

I can tell you one thing for sure. If that is your _real_ attitude, and you worked for me, you would be gone.

UCI is unimportant? And when it changes, as we had in xboard protocol version 2, and eventually version 3? I want _clean_ code to go back and look at since it has to be modified. I don't want a program that is 10X larger than it needs to be, with convoluted coding that is hard to follow. That approach is so anti-software-engineering it is hard to believe it came from someone that was a commercial chess engine programmer. I would _hate_ to look at your old code. You can look at mine at any time since it is public.

So I guess I have learned something today. Don't give any thought to writing a program before starting. Just grab the keyboard and hope you type something that can be modified to accomplish whatever task was given. Don't consider understandability. Don't consider efficiency. Just write. And debug. And debug. And debug. And debug.

My program worked the first time. One compiler warning when I forgot to include stdio.h to provide the printf prototype. Perhaps you should apply at Microsoft. They seem to appreciate that kind of programming. I prefer to associate with the linux kernel guys and such, who write code more compatible with my standards...

bob · Post by **bob** » Sun Aug 31, 2008 10:32 pm

That's what I expected. No matter what happens, it won't prove a thing. If 10 people participate, it won't be enough. It won't be enough until we get enough monkeys in the room so that two programs actually match byte by byte, I would assume.

I'm not holding my breath.

And I am convinced I wasted a few minutes today writing code that was completely pointless (as I knew it was, but I tried anyway).

bob · Post by **bob** » Sun Aug 31, 2008 10:33 pm

chrisw wrote:
bob wrote:
chrisw wrote:
bob wrote:It appears to me that the only "psychological anchoring" going on is that you now have that expression etched into your head and are going to regurgitate it from here on, even though it does _not_ apply to specific programming details in any way.

But don't let a few facts stop you, you haven't yet...
Tell you what, Bob. I'll try to use it as many times as you fill sentences with the word "bullshit". I prefer useful language that says something, not rude "bullshit", if you get my meaning.
I just try to "call it as I see it." And not steal words from other posters, much less copy code...
If that's what you see, where's your head?

If you would get your head out of the bodily orifice you have it stuck in, you would be able to see _my_ head...

chrisw · Post by **chrisw** » Sun Aug 31, 2008 10:34 pm

bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
bob wrote:
Alexander Schmidt wrote:
RegicideX wrote:1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.
OK, show us some similar code in, let's say, Crafty and Fruit. Or Glaurung and Slowchess. Or TSCP and Pepito.

I wait, ty.
Nobody will bite. This is supposedly a frequent occurrence, but only in the two programs being discussed (Fruit and Rybka).
As Alex's expression demonstrates, the psychological "anchoring effect" of seeing and studying a piece of code -- the code structure sticks in one's head -- is a perfectly frequent, natural and entirely legal occurrence when the second programmer has read, studied and absorbed the code of the first.
The only "anchoring effect" is that you now have "anchored" next to yet another bullshit explanation. You don't anchor to "specific algorithm design and code". That is so far beyond bullshit it is an insult to bullshit. High-level conceptual overviews? Of course. But not low-level implementations, until perhaps you have actually implemented one. Then you might tend to do it the same way next time, if you are a beginner. If you are sophisticated you are always looking for _better_ ways, not "familiar" ways.
Yup. The essential importance of optimisation to geekiest potential of the UCI code. How important it is. Takes so much computational time, so much memory. Not.

Let Vas spend his efforts bringing UCI code up to Bob standards. Such an important use of time. Not.
I can tell you one thing for sure. If that is your _real_ attitude, and you worked for me, you would be gone.

Perhaps you should apply at Microsoft. They seem to appreciate that kind of programming. I prefer to associate with the linux kernel guys and such, who write code more compatible with my standards...

Bob,

I didn't work for anyone for well over thirty years. More likely it would have been a question of you working for me, if not for the fact, in my ever so humble opinion, that you're unemployable.

Best wishes and happy pension regards,

Uri Blass · Post by **Uri Blass** » Sun Aug 31, 2008 10:36 pm

bob wrote:
chrisw wrote:Hi Alex,

Well, unlike Bob, who seems to me to be just trying to be difficult, and in the interests of progress, I wrote some crappy code in C, appended below.
How am I trying to be difficult? You are making your usual assumptions. Nowhere did he mention "C". I did not quite catch the "one" vs "1". But here is my code to do the entire process:

"evaluate":
#!/bin/csh
set noglob
set v1 = `echo $1 | awk -f swap`
set v3 = `echo $3 | awk -f swap`
echo `expr $v1 $2 $v3 | awk -f unswap`

swap:
/zero/{print "0"}
/one/{print "1"}
/two/{print "2"}
/three/{print "3"}
/four/{print "4"}
/five/{print "5"}
/six/{print "6"}
/seven/{print "7"}
/eight/{print "8"}
/nine/{print "9"}

unswap:
/0/{print "zero"}
/1/{print "one"}
/2/{print "two"}
/3/{print "three"}
/4/{print "four"}
/5/{print "five"}
/6/{print "six"}
/7/{print "seven"}
/8/{print "eight"}
/9/{print "nine"}

output:

scrappy% ./evaluate one + two
three
scrappy% ./evaluate nine - four
five
scrappy% ./evaluate two * four
eight
scrappy% ./evaluate nine / three
three
Seems to me that if people are asked blind to produce some code for this problem, they may well write different stuff.

But, if they studied, took a quick look, at similar code that already did the job, they might think, well in my case below ..... oh, ok split the input string up into three strings, compare each of those strings with ascii text data to find a match, set some variables with the match results and perform the desired operation. Oh, and he used strcmp, that's easy, I don't need to go check in my C-guide now ....

And then they write their code. No copy. No cut 'n paste, just see the basic outline of the idea and hack it out.

Et voila, betcha the code produced by another program was then similar, even though entirely written by the second programmer.

Why? Because the second programmer looked at the first programmer's ideas, thought why bother reinventing the wheel for something so similar and something so trivial, worked out the ideas behind it and sat down and hacked out the code all by himself. Why even bother optimising? Don't need any speed here, who cares about memory usage. Wham bang done.
Code:
			{
				char		str[] = "four times nine";
				char*		strptr;
				int			i;
				char		substring[3][10];
				char*		substringptr;
				char		numtext[9][8] = {"one","two","three","four","five","six","seven","eight","nine"};
				char		optext[3][8] = {"plus","minus","times"};
				int			x,y,z;
				int			result;

				strptr = &str[0];
				i = 0;
				do
				{
					substringptr = &substring[i][0];
					while ((*strptr != ' ') && (*strptr != NULL))
					{
						*substringptr = *strptr;
						strptr++;
						substringptr++;
					}
					*substringptr = NULL;
					strptr++;
					i++;
				} while (i<3);

				// get x=first num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[0][0]) == 0)
					{
						x=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get y=second num
				i = 0;
				do
				{
					if (strcmp(&numtext[i][0],&substring[2][0]) == 0)
					{
						y=i+1;
						break;
					}
					i++;
				} while (i<9);

				// get operation
				i = 0;
				do
				{
					if (strcmp(&optext[i][0],&substring[1][0]) == 0)
					{
						z=i;
						break;
					}
					i++;
				} while (i<9);

				if (z==0) result = x+y;
				if (z==1) result = x-y;
				if (z==2) result = x*y;
				result = result;
			}
RegicideX wrote:It is clear by now that the only piece of evidence worth taking seriously in the Rybka discussion so far is the UCI code -- at least the only public evidence.

But "worth taking seriously" does not mean that it is anywhere close to providing proof or conclusive evidence. The problems are that

1) The functions being compared are very similar in purpose and they are similar in purpose in most if not all chess programs.

2) The procedures presented are relatively short.

3) The reconstructed code is by no means identical -- there are many dissimilarities.

When two procedures are both relatively short and have extremely similar purpose in most chess engines, the probability of observing similarities is relatively large.

Furthermore

4) The compiling process takes away a lot of the individuality of the program due to various optimization procedures.

5) The conjectural reconstruction of the code can have a bias in the direction of proving similarities. You need to look at all possible ways of reconstructing the code in order to show that the initial source is a clone.

In order to advance the discussion we can perform a simple experiment. A few programmers here can write a simple parser for making simple arithmetic operations.

That is, the user enters a string consisting of literal strings like "one plus two" and the parser should print out the result of the arithmetic operation, in this case 3. To make matters simple, only one-digit numbers and only one operation should be entered. Thus all operations should be of the form "X operation Y" where X and Y are literal representations of the numbers from zero to nine and "operation" should be "plus", "minus" or "times." An error message for invalid input could be present.

After writing the program, it should be compiled and then submitted for disassembling. Then we should compare the recreated codes among themselves and see how much similarity there is. If we observe a lot of similarity, comparable to the Rybka/Fruit UCI parser similarity, then the anti-Rybka case falls flat -- at least as far as the UCI code goes. If no two programs have significant similarity then the UCI code evidence gains more weight.

Of course, this requires some work -- and while I'm willing to write the source code, I am conveniently lacking expertise in disassembling so I cannot participate there (which is the hardest part of this exercise).

But if we do have takers this would be one way to move the debate into more objective directions.
Seems to me that anyone looking at that code that you wrote would immediately think "no way, that's horribly long" and would not even look at the code in detail.

You made one bad assumption. "C". Nothing in his original post suggests C. Which means I would choose the most efficient tool I had at hand to make this work. It took 5 minutes total to write and debug the above. The only bug is that one needs to use "set noglob" in the shell they are using or the "*" on the command line will get turned into a filename "glob".

So far we have two approaches. I wonder if anyone else will bite. If you want this written in C, which was not originally stated as a requirement, I can probably do that in about 10 minutes. Actually, for the sake of argumen... this took me exactly 5 minutes to write:
Code:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[]) {
  char *words[10] = {"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"};
  int i, answer, operand1, operand2;
  for (i=0; i<10; i++) {
    if (!strcmp(argv[1], words[i]))  operand1 = i;
    if (!strcmp(argv[3], words[i]))  operand2 = i;
  }
  switch (argv[2][0]) {
    case '+':  answer = operand1 + operand2; break;
    case '-':  answer = operand1 - operand2; break;
    case '*':  answer = operand1 * operand2; break;
    case '/':  answer = operand1 / operand2; break;
    default: printf("invalid operator\n"); exit(1);
  }
  for (i=0; i<10; i++)
    if (i == answer) break;
  if (i < 10)
    printf("%s\n",words[i]);
  else
    printf("answer is > 9, no output produced\n");
}
here is the output:

scrappy% ./xpr three + five
eight
scrappy% ./xpr seven - three
four
scrappy% ./xpr two * four
eight
scrappy% ./xpr nine / three
three
scrappy% ./xpr four * four
answer is > 9, no output produced
scrappy% ./xpr two % three
invalid operator

Feel free to "compare" since I wasn't about to take the time to look at that mess you wrote... This might be cleaned up a bit to become simpler, still. BTW another 2 minutes and I can make it output results up to ninetynine.

If I try to run your code on my machine I get the following warning

warning C4013: 'strcmp' undefined; assuming extern returning int

I can add #include <string.h> and avoid that warning, but I do not know how to run your program without error, probably because I do not know how to run main from the command line.

I have never used argc and argv as parameters of main and I do not understand how they are used.
I remember reading that main gets its arguments from the command line, but I believe the last time I ran a program from the command line was some years ago, so I do not remember how. I wonder if I need to do something like that instead of running the project normally.

Uri
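On the question of how argc and argv are used: in standard C, main may be declared as int main(int argc, char *argv[]), where argc is the number of command-line words including the program name and argv[0] through argv[argc-1] are those words as strings. A minimal sketch of the checks a program like xpr could make before touching argv[1] to argv[3]; the helper name check_args is hypothetical, and it takes the same two parameters as main so its body could be placed directly in main:

```c
#include <stdio.h>

/* Hypothetical helper with the same parameters as main().
   Returns 0 when exactly three arguments follow the program name,
   1 otherwise. */
int check_args(int argc, char *argv[]) {
  int i;
  /* argv[0] is the program name, so "xpr three + five" gives argc == 4. */
  if (argc != 4) {
    fprintf(stderr, "usage: %s <word> <operator> <word>\n",
            argc > 0 ? argv[0] : "xpr");
    return 1;
  }
  /* Show how the shell split the command line into separate strings. */
  for (i = 0; i < argc; i++)
    printf("argv[%d] = %s\n", i, argv[i]);
  return 0;
}
```

From a command prompt the compiled program would be started as in the transcript above, for example xpr three + five. Most IDEs also have a project setting for command-line arguments, which has the same effect when running the project normally. In a Unix shell the * operator may need quoting or "set noglob", as noted in the quoted post.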

chrisw · Post by **chrisw** » Sun Aug 31, 2008 10:37 pm

bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
bob wrote:It appears to me that the only "psychological anchoring" going on is that you now have that expression etched into your head and are going to regurgitate it from here on, even though it does _not_ apply to specific programming details in any way.

But don't let a few facts stop you, you haven't yet...
Tell you what, Bob. I'll try to use it as many times as you fill sentences with the word "bullshit". I prefer useful language that says something, not rude "bullshit", if you get my meaning.
I just try to "call it as I see it." And not steal words from other posters, much less copy code...
If that's what you see, where's your head?
If you would get your head out of the bodily orifice you have it stuck in, you would be able to see _my_ head...

Sorry, Bob. It was my joke first - you're too late. You didn't see it, well now you know why

BubbaTough · Post by **BubbaTough** » Sun Aug 31, 2008 10:38 pm

I think "psychological anchoring" is a valid concept...if 10 people study a parser and then are asked to write one, similarities are much more likely. I have no position on "how similar" things will end up naturally.

I also know I would not have a function called SEE if I had not been influenced by others...even if I had developed the exact same concept I would have definitely picked another name. Nor would I use terms like alpha or beta (in fact I don't, I use minScore and maxScore I think). I think my first chess program also had a1 = 0 and h8 = 63 instead of the other way around, but I switched in the next program I wrote because everyone else seemed to do it differently and it got confusing. Lots of little things like that can be influential in legitimate ways. I think Chris has perhaps been tempted into an overly strong way of stating certain things, which makes some folks (including me sometimes) want to argue against everything he says even when it is valid.

-Sam
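As an illustration of the square-numbering point above, here is a minimal sketch of two conventions. The function names are hypothetical, and reading "the other way around" as the fully mirrored numbering (a1 = 63, h8 = 0) is an assumption; the post does not say which alternative was meant.

```c
/* Files a..h are 0..7 and ranks 1..8 are 0..7 in both functions. */

/* Convention with a1 = 0 and h8 = 63. */
int square_a1_first(int file, int rank) {
  return rank * 8 + file;
}

/* Assumed mirrored convention with a1 = 63 and h8 = 0. */
int square_h8_first(int file, int rank) {
  return 63 - square_a1_first(file, rank);
}
```

Either mapping works on its own; the confusion described in the post arises only when code or tables written for one convention are read with the other in mind.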
