setjmp() - another one

tmokonen
Posts: 1362
Joined: Sun Mar 12, 2006 6:46 pm
Location: Kelowna
Full name: Tony Mokonen

Re: setjmp() - another one

Post by tmokonen »

bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed its no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Lance Perkins has stated in this thread that he uses it in Thinker. Some other chess programs that use setjmp include BCE, Witz, Rattate Chess Bologna, RomiChess, Hanzo The Razor, and the funny little concept program Toledo. That's 9 so far :) I would suspect there are other closed-source programs that use it too, but who has the time to disassemble all of them and find out?

I once looked at TSCP, noticed this funny setjmp call, looked it up in my old tattered copy of K&R, and tried it out for myself in my weak little program Tony's Chess. I found it wanting and removed it later. Somehow, I don't feel guilty about it, and I wouldn't refer to myself as a "cut and paster" just because I saw setjmp in use in another program and wanted to try it out for myself.

The use of setjmp might be a small piece in the large "Rybka, is it or isn't it?" puzzle, but in and of itself I cannot see how the use of a standard library function is suspicious.
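For readers who have not seen the idiom under discussion, here is a minimal sketch of using setjmp()/longjmp() to bail out of a recursive search. It is not taken from TSCP, Fruit, or any other engine; the names run_search, search, abort_env, and node_limit are hypothetical, and a node budget stands in for the time check a real engine would use.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* All names here are hypothetical; this is not code from any real engine. */
static jmp_buf abort_env;   /* context saved by setjmp(), restored by longjmp() */
static long nodes_visited;  /* nodes searched so far */
static long node_limit;     /* stand-in for a time-limit check */

/* Toy recursive "search" over a uniform tree; the score is just the depth reached. */
static int search(int depth, int branching)
{
    if (++nodes_visited > node_limit)
        longjmp(abort_env, 1);  /* unwind every recursion level in one jump */
    if (depth == 0)
        return 0;
    int best = 0;
    for (int i = 0; i < branching; i++) {
        int score = 1 + search(depth - 1, branching);
        if (score > best)
            best = score;
    }
    return best;
}

/* Returns 1 and stores the score if the search completed, 0 if it was aborted. */
int run_search(int depth, int branching, long limit, int *result)
{
    assert(result != NULL);
    nodes_visited = 0;
    node_limit = limit;
    if (setjmp(abort_env) != 0)
        return 0;               /* we arrive here via longjmp() */
    *result = search(depth, branching);
    return 1;
}
```

With these definitions, run_search(3, 2, 100, &r) completes and stores 3 in r, while a limit of 5 aborts the search and leaves r untouched. The jump skips whatever cleanup the intermediate stack frames would normally have done, which is the kind of side effect critics of the technique point to.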
chrisw

Re: setjmp() - another one

Post by chrisw »

bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:And some people don't like their evidence being challenged ;-)
Then why exactly did I so openly welcome Gerd's comments in my other thread??
It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?
There might be a few more. Any chess programmer who knows his stuff will say that it is rare.
To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.
It never was in the "beyond reasonable doubt" camp. You can't just take each point one by one in a case like this. Each point could be taken as a coincidence, or of course, "use of ideas". It is exactly as Christophe said: If you take each line of code in isolation, then it is reasonable to assume that it could be original. But when these start adding up, that method just doesn't cut it.
Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?
I want a reasonable defense. You have to look at these things holistically. For each point that I posted, there might be a few other engines with the same details. But as you add each one up, you see that only two engines share all of these details.
setjmp() is not even a suspicious detail. It's history. Sorry.
Sorry, but you are wrong.
Perhaps the problem is that you just don't have the tools. You need both sets of source code. That you don't have. So you're trying to reconstitute Rybka source code by reversing the compiler process - that's made extra difficult, if not impossible, because lots of information gets thrown away at compile time. So you have to rely on interpretation, creativity and little bits of code, and claim setjmp() be rare and and and.

As Vas said in email today: "maybe at some point algorithms will be developed to quantify executable similarities. Until then, we'll probably just have to live with this sort of stuff".

How long will Vas have to live with this sort of stuff?

He has stated very clearly that Rybka is original work, has he not?
You need to complete that sentence however.

"He has stated very clearly that Rybka is original work, as has _every_ other person accused of cloning, until convincing proof was found."

That doesn't mean he did or didn't copy anything. It just means that most people that do something wrong tend to deny it until absolute proof is shown. Happens in the legal system all the time. There are always appeals and appeals being filed, even when a 6 year old could see the person was guilty.
Yes, and people who do right also tend to deny that they did wrong.

So your point is what exactly? He can't win whichever way?

Sounds like the poor witch again. Doesn't it?
chrisw

Re: setjmp() - another one

Post by chrisw »

tmokonen wrote:
bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed its no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Lance Perkins has stated in this thread that he uses it in Thinker. Some other chess programs that use setjmp include BCE, Witz, Rattate Chess Bologna, RomiChess, Hanzo The Razor, and the funny little concept program Toledo. That's 9 so far :) I would suspect there's other closed source programs that use it, too, but who has the time to disassemble all of them and find out?

I once looked at TSCP, noticed this funny setjmp call, looked it up in my old tattered copy of K&R, and tried it out for myself in my weak little program Tony's Chess. I found it wanting and removed it later. Somehow, I don't feel guilty about it, and I wouldn't refer to myself as a "cut and paster" just because I saw setjmp in use in another program and wanted to try it out for myself.

The use of setjmp might be a small piece in the large "Rybka, is it or isn't it?" puzzle, but in and of itself I cannot see how the use of a standard library function is suspicious.
Suspicious? Try this ..
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: setjmp() - another one

Post by bob »

chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:And some people don't like their evidence being challenged ;-)
Then why exactly did I so openly welcome Gerd's comments in my other thread??
It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?
There might be a few more. Any chess programmer who knows his stuff will say that it is rare.
To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.
It never was in the "beyond reasonable doubt" camp. You can't just take each point one by one in a case like this. Each point could be taken as a coincidence, or of course, "use of ideas". It is exactly as Christophe said: If you take each line of code in isolation, then it is reasonable to assume that it could be original. But when these start adding up, that method just doesn't cut it.
Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?
I want a reasonable defense. You have to look at these things holistically. For each point that I posted, there might be a few other engines with the same details. But as you add each one up, you see that only two engines share all of these details.
setjmp() is not even a suspicious detail. It's history. Sorry.
Sorry, but you are wrong.
Perhaps the problem is that you just don't have the tools. You need both sets of source code. That you don't have. So you're trying to reconstitute Rybka source code by reversing the compiler process - that's made extra difficult, if not impossible, because lots of information gets thrown away at compile time. So you have to rely on interpretation, creativity and little bits of code, and claim setjmp() be rare and and and.

As Vas said in email today: "maybe at some point algorithms will be developed to quantify executable similarities. Until then, we'll probably just have to live with this sort of stuff".

How long will Vas have to live with this sort of stuff?

He has stated very clearly that Rybka is original work, has he not?
You need to complete that sentence however.

"He has stated very clearly that Rybka is original work, as has _every_ other person accused of cloning, until convincing proof was found."

That doesn't mean he did or didn't copy anything. It just means that most people that do something wrong tend to deny it until absolute proof is shown. Happens in the legal system all the time. There are always appeals and appeals being filed, even when a 6 year old could see the person was guilty.
Yes, and people who do right also tend to deny that they did wrong also.

So your point is what exactly? He can't win whichever way?

Sounds like the poor witch again. Doesn't it?
No, it just shows that your statement above has absolutely no purpose in the current discussion, since if he is innocent, he will proclaim so, and if he is guilty, he will still proclaim innocence. So we get nothing from the statement no matter what, which makes it useless. The old "everything I say is false" circular argument that says nothing at all, as an example.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: setjmp() - another one

Post by michiguel »

bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed its no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.
To Bob:
According to your calculations, 2 use setjmp() and 5 don't. In another message the Thinker author declares that it uses it, so 3 vs. 5. Why are you saying 3 out of hundreds? You checked only a handful but concluded that all the other hundreds don't have it. I find that very misleading and tricky.

To everybody:
I really do not care about one side or another and I am drawn to read some of this out of curiosity, but I find that some of the "debate tactics" used in this discussion belong to a political campaign rather than a technical forum. I cannot believe several things that I'm reading. I guess I will try to stop reading so I will not poison my very nice opinion about many smart people of this forum. It hurts!

IMHO, everybody should stop until more data is provided, for the sake of saving some respect. Everybody has already made their point with what was shown.

Miguel

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: setjmp() - another one

Post by bob »

tmokonen wrote:
bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed its no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Lance Perkins has stated in this thread that he uses it in Thinker. Some other chess programs that use setjmp include BCE, Witz, Rattate Chess Bologna, RomiChess, Hanzo The Razor, and the funny little concept program Toledo. That's 9 so far :) I would suspect there's other closed source programs that use it, too, but who has the time to disassemble all of them and find out?

I once looked at TSCP, noticed this funny setjmp call, looked it up in my old tattered copy of K&R, and tried it out for myself in my weak little program Tony's Chess. I found it wanting and removed it later. Somehow, I don't feel guilty about it, and I wouldn't refer to myself as a "cut and paster" just because I saw setjmp in use in another program and wanted to try it out for myself.

The use of setjmp might be a small piece in the large "Rybka, is it or isn't it?" puzzle, but in and of itself I cannot see how the use of a standard library function is suspicious.
Here's the reasoning in a nutshell:

(1) we have seen several blocks of code that are identical between program A and program B.

(2) we have a large body of other programs as well. C, D, ..., numbering 500 or so.

(3) Someone wants to say that there is a good chance that A and B were written independently yet they have identical blocks of code, that could be pure chance.

(4) I have said that claim is nonsense. There has been lots of research and discussion about software plagiarism over the years. There are semantic analysis tools designed to catch duplicate programs which have had variable names, comments, etc altered to disguise the plagiarism.

(5) now we find a rarely used construct in the original and what is suspected of being a copy. So that argument (3) above becomes far weaker, since (a) it is unlikely that duplicate code would be written; (b) the duplicate code contains a function that is rarely used because of potential side-effects and difficult-to-find bugs. And that strengthens argument (4) since the probability of duplicate code is low and is multiplied by the probability of using a rarely used approach, to even further reduce the probability that the duplication was by accident.

That was the point, and the only point.
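The multiplication in (5) can be written out explicitly. This is only a sketch, and it rests on an assumption not stated above: that producing matching code by chance and choosing setjmp() are independent events. The labels D and S are introduced here purely for illustration.

```latex
% D: programs A and B, written independently, contain matching blocks of code
% S: the matching code also uses the rarely used setjmp()/longjmp() construct
P(D \cap S) = P(D)\,P(S \mid D)
% under the independence assumption, P(S \mid D) = P(S), so
P(D \cap S) = P(D)\,P(S) \le P(D), \qquad \text{since } P(S) \le 1
```

The smaller P(S) is, the further the joint probability drops below P(D). If the two events are not independent, for example because both authors read the same open-source program, the second line no longer holds.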

Some are trying the old legal cross-examination trick all attorneys use. I have an attorney friend, and I have been called as an expert witness twice in the past 20 years. The cross-examination goes something like this: I will give the defense question, my answer as limited by the attorney, and in parentheses what I would have said could I have said more.

Note this is hypothetical, not real, but illustrates the point:

Someone was robbed at gunpoint and beaten. I saw them open the front door of the victim's house as I walked by, but not knowing a crime was in progress, I kept going.

Defense (I had already testified that I saw the person open the front door): "Did you see my client rob or beat the victim? Please answer yes or no."
Me: "No." (But I did see him enter the house right before the attack happened.)
Defense: "Did you see my client inside the victim's house, yes or no?"
Me: "No." (I saw him entering the house, but I did not stick around to see him inside; he was opening the door as I passed by.)

You get the idea? The aim is to discredit what I said, not directly, but by having me testify again while answering only yes/no questions that make the defendant look less guilty, in the hope that the jury remembers my last testimony rather than my original testimony.

That is what is happening here. We are arguing about a single piece of evidence of modest weight, ad nauseam, where some hope to have it dismissed completely because, by itself, it is nowhere near enough to produce a conclusion. And they are trying to do this for each piece of evidence as it is presented, whereas a jury has to look at the entire body of evidence: each small piece adds to the puzzle, and enough small pieces lead to a reasonable conclusion.

But we can't get past the single piece of information issue here... You smell smoke in your house. Someone must be smoking, that doesn't mean anything. Seems a little warm? Someone must have turned up the heat, or the outside temp is unusually high, doesn't mean anything. Hear a crackling? Kids must be in the kitchen eating chips, doesn't mean anything. Hear a siren approaching? Someone must have gotten hurt down the street, doesn't mean anything.

Dude, your house is on fire. All the evidence together would lead to that conclusion. If you look at all of it. If you dismiss each piece as it is noticed, you are going to get charbroiled.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: setjmp() - another one

Post by bob »

michiguel wrote:
bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed its no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.
To Bob:
According to your calculations, 2 use setjmp() and 5 don't. In another message the Thinker author declares that it uses it, so 3 vs. 5. Why are you saying 3 out of hundreds? You checked only a handful but concluded that all the other hundreds don't have it. I find that very misleading and tricky.
It was not intended to be "tricky". You can find long discussions on setjmp()/longjmp() on the network. I have looked at a few others as well; some are executable-only, some are not. I have 20 total programs to play with. Out of those 20, I have exactly 1 with setjmp() in it. Not all are in source. If even 1/20 holds up, then the probability of accidentally duplicating blocks of code, where both have a 1/20 usage feature in them, becomes even smaller.

That was the only point I tried to make. An unusual programming construct simply further reduces the already impossibly low odds of producing duplicate code independently.
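As a rough check on how much a count of 1 in 20 can support, here is the standard binomial estimate. It assumes the 20 programs are a random sample of all engines, which nothing in this thread establishes, so treat it as an illustration only.

```latex
\hat{p} = \frac{1}{20} = 0.05, \qquad
\mathrm{SE}(\hat{p}) = \sqrt{\frac{\hat{p}\,(1-\hat{p})}{n}}
                     = \sqrt{\frac{0.05 \times 0.95}{20}} \approx 0.049
```

The standard error is about as large as the estimate itself, so a sample of 20 is consistent with setjmp() being rare but cannot pin the rate down with any precision.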



To everybody:
I really do not care about one side or another and I am drawn to read some of this out of curiosity, but I find that some of the "debate tactics" used in this discussion belong to a political campaign rather than a technical forum. I cannot believe several things that I'm reading. I guess I will try to stop reading so I will not poison my very nice opinion about many smart people of this forum. It hurts!

IMHO, everybody should stop until more data is provided, for the sake of saving some respect. Everybody has already made their point with what was shown.

Miguel

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Zach Wegner
Posts: 1922
Joined: Thu Mar 09, 2006 12:51 am
Location: Earth

Re: setjmp() - another one

Post by Zach Wegner »

I forgot to respond to this.
chrisw wrote:As to Vas responding to your questions. Well you banned me from sending them to him. Am I unbanned now? Gimme question list and I'll fire it off.
By no means are you banned, I only asked that Vas answer directly on this forum or his. There's no reason to have a "middle man" to filter these answers through.

An initial questions list has already been provided in the thread "Questions for Vas".
chrisw

Re: setjmp() - another one

Post by chrisw »

bob wrote:
tmokonen wrote:
bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed its no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'ld expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Lance Perkins has stated in this thread that he uses it in Thinker. Some other chess programs that use setjmp include BCE, Witz, Rattate Chess Bologna, RomiChess, Hanzo The Razor, and the funny little concept program Toledo. That's 9 so far :) I would suspect there are other closed-source programs that use it, too, but who has the time to disassemble all of them and find out?

I once looked at TSCP, noticed this funny setjmp call, looked it up in my old tattered copy of K&R, and tried it out for myself in my weak little program Tony's Chess. I found it wanting and removed it later. Somehow, I don't feel guilty about it, and I wouldn't refer to myself as a "cut and paster" just because I saw setjmp in use in another program and wanted to try it out for myself.

The use of setjmp might be a small piece in the large "Rybka, is it or isn't it?" puzzle, but in and of itself I cannot see how the use of a standard library function is suspicious.
Here's the reasoning in a nutshell:

(1) we have seen several blocks of code that are identical between program A and program B.

(2) we have a large body of other programs as well. C, D, ..., numbering 500 or so.

(3) Someone wants to say that there is a good chance that A and B were written independently yet they have identical blocks of code, that could be pure chance.

(4) I have said that claim is nonsense. There has been lots of research and discussion about software plagiarism over the years. There are semantic analysis tools designed to catch duplicate programs which have had variable names, comments, etc altered to disguise the plagiarism.

(5) now we find a rarely used construct in the original and what is suspected of being a copy. So that argument (3) above becomes far weaker, since (a) it is unlikely that duplicate code would be written; (b) the duplicate code contains a function that is rarely used because of potential side-effects and difficult-to-find bugs. And that strengthens argument (4) since the probability of duplicate code is low and is multiplied by the probability of using a rarely used approach, to even further reduce the probability that the duplication was by accident.

That was the point, and the only point.

Some are trying the old legal cross-examination trick all attorneys use. I have an attorney friend, and I have been called as an expert witness twice in the past 20 years. The cross-examination goes something like this: I will give the defense question, my answer as limited by the attorney, and in parentheses what I would have said could I have said more.

Note this is hypothetical, not real, but illustrates the point:

Someone was robbed at gunpoint and beaten. I saw them open the front door of the victim's house as I walked by, but not knowing a crime was in progress, I kept going.
Ahem. Excuse me. But Chief Technical Investigator Zach says:
where did anyone say that we are so sure that [we] have a watertight case against Vas

When did I say anything in that post about how strong my case was?

I haven't seen what I would consider proof.
How does that match up with your ...
(1) we have seen several blocks of code that are identical between program A and program B.

(2) we have a large body of other programs as well. C, D, ..., numbering 500 or so.

(3) Someone wants to say that there is a good chance that A and B were written independently yet they have identical blocks of code, that could be pure chance.
or even with
Someone was robbed at gunpoint and beaten.
Hmmm?

Time to pack up and go home, Bob?
CThinker
Posts: 388
Joined: Wed Mar 08, 2006 10:08 pm

Re: setjmp() - another one

Post by CThinker »

bob wrote: I'm not going to list all the known issues, but in classical programming and software engineering, returning is the proper way to handle this. If you have a global board state as many do, who is going to restore that? Or are you going to the trouble of copying the board state from ply to ply which is a horrible performance hit?
In modern-day programming, when you encounter an unexpected condition, you "throw" an exception. For a chess engine, running out of time or receiving user input is just an unexpected condition. Whatever is in the call stack is not useful anymore. Your last result is what you will give out.

And no, why would anyone copy board state from ply to ply? How about this:

Code:

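// C++
void Iterate(TBoard &Board)
{
    TBoard SavedBoard = Board;      // one snapshot, taken at the root only
    try {
        for (int depth = 1; depth < MaxDepth; depth++)
            SearchRoot(Board, depth, ...);
    }
    catch (...) {
        // search was aborted (out of time or user input); nothing to do here
    }
    Board = SavedBoard;             // discard any half-made moves
}

// C
#include <setjmp.h>                 // jumpbuffer is a jmp_buf visible to the search code
void Iterate(TBoard *pBoard)
{
    TBoard SavedBoard = *pBoard;    // one snapshot, taken at the root only
    if (setjmp(jumpbuffer) == 0) {
        for (int depth = 1; depth < MaxDepth; depth++)
            SearchRoot(pBoard, depth, ...);
    }
    // reached both on normal completion and after a longjmp() out of the search
    *pBoard = SavedBoard;           // discard any half-made moves
}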
You still see a lot of code out there where if there are 10 levels of calls, each one checks for the error condition. So, if the 10th level fails to allocate memory, each returning call will check that and eventually the root call notes the error.

That's just too archaic for me. Why should half of the code be just for error checking?
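
For anyone who wants to try the setjmp() variant above in isolation, here is a minimal self-contained sketch. It is not Thinker's code: the one-field TBoard, the two-branch Search(), and the node budget that stands in for "out of time" are all made up for illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Hypothetical stand-in for a real board: only tracks how many moves are "made". */
typedef struct { int ply; } TBoard;

static jmp_buf jumpbuffer;
static long nodes_left;              /* stands in for the clock running out */

static void Search(TBoard *b, int depth)
{
    if (--nodes_left < 0)
        longjmp(jumpbuffer, 1);      /* abandon the whole call stack at once */
    if (depth == 0)
        return;
    for (int i = 0; i < 2; i++) {    /* pretend every position has two moves */
        b->ply++;                    /* "make move" on the shared board */
        Search(b, depth - 1);
        b->ply--;                    /* "unmake move"; skipped when we longjmp */
    }
}

/* Returns the last depth that completed before the budget ran out. */
static int Iterate(TBoard *pBoard, int max_depth, long node_budget)
{
    assert(pBoard != NULL);
    TBoard saved = *pBoard;          /* single copy at the root, not per ply */
    volatile int completed = 0;      /* volatile: changed after setjmp, read after longjmp */
    nodes_left = node_budget;
    if (setjmp(jumpbuffer) == 0) {
        for (int depth = 1; depth <= max_depth; depth++) {
            Search(pBoard, depth);
            completed = depth;
        }
    }
    *pBoard = saved;                 /* undo whatever the aborted search left behind */
    return completed;
}
```

Calling Iterate(&board, 20, 1000) on a board with ply 0 should stop partway through depth 8 and hand back the board unchanged. The one detail that is easy to miss is the volatile on completed: a non-volatile local modified between setjmp() and longjmp() has an indeterminate value afterwards.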
bob wrote: You do realize that there is a huge difference between code that is shorter, and code that is easier to understand and debug, of course?
Yes I do. And in that example of Crafty vs Thinker code, the longer Crafty code is actually harder to debug and harder to understand (all those checks that have nothing to do with the real logic get in the way). What if you added a new call to Search(), and then you forgot to add the termination check?
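
To make the comparison concrete, here is a minimal sketch of the return-based style being argued about. It is not Crafty's code: the one-field TBoard, the abort_search flag, and the node budget standing in for the clock are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a real board: only tracks how many moves are "made". */
typedef struct { int ply; } TBoard;

static long nodes_left;              /* stands in for the clock running out */
static bool abort_search;

/* Returns a node count; the value is meaningless once abort_search is set. */
static int Search(TBoard *b, int depth)
{
    if (--nodes_left < 0) {
        abort_search = true;
        return 0;
    }
    if (depth == 0)
        return 1;
    int total = 1;
    for (int i = 0; i < 2; i++) {    /* pretend every position has two moves */
        b->ply++;                    /* make move */
        int sub = Search(b, depth - 1);
        b->ply--;                    /* unmake always runs, so the board stays valid */
        if (abort_search)
            return 0;                /* this check is needed after every Search() call */
        total += sub;
    }
    return total;
}

/* Returns the last depth that completed before the budget ran out. */
static int Iterate(TBoard *pBoard, int max_depth, long node_budget)
{
    assert(pBoard != NULL);
    int completed = 0;
    nodes_left = node_budget;
    abort_search = false;
    for (int depth = 1; depth <= max_depth; depth++) {
        Search(pBoard, depth);
        if (abort_search)
            break;                   /* discard the partial iteration */
        completed = depth;
    }
    return completed;
}
```

The trade-off shows up in both directions here: the board is restored by the ordinary unmake path with no root copy, but a new Search() call site without the abort_search check would keep searching and use a meaningless return value.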