setjmp() - another one

bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: setjmp() - another one

Post by bob »

chrisw wrote:
Zach Wegner wrote:
chrisw wrote:And some people don't like their evidence being challenged ;-)
Then why exactly did I so openly welcome Gerd's comments in my other thread??
It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?
There might be a few more. Any chess programmer who knows his stuff will say that it is rare.
To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.
It never was in the "beyond reasonable doubt" camp. You can't just take each point one by one in a case like this. Each point could be taken as a coincidence, or of course, "use of ideas". It is exactly as Christophe said: If you take each line of code in isolation, then it is reasonable to assume that it could be original. But when these start adding up, that method just doesn't cut it.
Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'd expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?
I want a reasonable defense. You have to look at these things holistically. For each point that I posted, there might be a few other engines with the same details. But as you add each one up, you see that only two engines share all of these details.
setjmp() is not even a suspicious detail. It's history. Sorry.
Sorry, but you are wrong.
Perhaps the problem is that you just don't have the tools. You need both sets of source code. That you don't have. So you're trying to reconstitute Rybka source code by reversing the compiler process - that's made extra difficult, if not impossible, because lots of information gets thrown away at compile time. So you have to rely on interpretation, creativity and little bits of code, and claim setjmp() be rare and and and.

As Vas said in email today: "maybe at some point algorithms will be developed to quantify executable similarities. Until then, we'll probably just have to live with this sort of stuff".

How long will Vas have to live with this sort of stuff?

He has stated very clearly that Rybka is original work, has he not?
You need to complete that sentence however.

"He has stated very clearly that Rybka is original work, as has _every_ other person accused of cloning, until convincing proof was found."

That doesn't mean he did or didn't copy anything. It just means that most people that do something wrong tend to deny it until absolute proof is shown. Happens in the legal system all the time. There are always appeals and appeals being filed, even when a 6 year old could see the person was guilty.
chrisw

Re: setjmp() - another one

Post by chrisw »

bob wrote:
Your logic says this: You can't use fingerprints, because there is certainly a possibility that two people can have the same fingerprint, since there is not an infinite variety of ways for the loops and whorls to form. You can't use DNA because we _know_ there are a finite number of DNA protein combinations, and a finite number means duplicates _must_ be possible. You can't use photographic evidence because we know there are twins, doctored photos, etc. You can't use eyewitness testimony because humans make visual identification mistakes all the time
Hahaha!! Very funny Bob.

Don't even try and suggest you're using fingerprints, DNA or photographic evidence.

You have a couple of hammers and one rather ancient ducking stool. That's it.
bob

Re: setjmp() - another one

Post by bob »

chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed it's no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'd expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
chrisw

Re: setjmp() - another one

Post by chrisw »

bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed it's no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'd expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Bob,

If the best you can do is argue over the relative rarity or otherwise of setjmp(), frankly you should pack up now and go home.
Uri Blass
Posts: 10896
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: setjmp() - another one

Post by Uri Blass »

Tony wrote:
Uri Blass wrote:
Guetti wrote:
chrisw wrote:Bob argued that the existence in the Rybka code of setjmp() was "interesting" because this also existed in Fruit and nowhere else.

Uri pointed out that Tom Kerrigan's public program TSCP also used setjmp() and that some other programs were likely/possibly developed, legally, off TSCP as basis.

I'm an engine programmer and always had user interface programmers working in support, so I got very lazy and understand very little about DOS, windows, C support functions and so on. setjmp() knowledge is no way a speciality of mine.

However, casting my mind back many years, I'm fairly sure that the Ren Wu Chess program which was also worked on by Ren at Oxford Softworks used setjmp() to unwind the search on a timeout. CSTal, by contrast, did a proper search unwind.

There are two ways to exit Search() on a timeout or user intervention. The 'correct' way, I suppose, is to unwind the Search back to the start using unmove, simultaneously unstacking the variables.
The brutal and simple way is simply to jump straight out, resetting the stack pointer. I guess this is setjmp().

I'll be very surprised if numbers of programs, especially those designed years ago without SMP in mind, didn't use the brutal setjmp() technique to break the search.

Bottom line: setjmp() is not unique and its use doesn't imply anything, certainly it cannot imply copied code.
Leaving the setjmp() relevant or not argument aside for a moment, come on, you state that you believe that an engine used setjmp()? Isn't that a bit vague? You always demand hard facts and source and pretty aligned code from Zach and Christoph, so where are the facts of Ren Wu chess?
Would you believe Zach if he wrote that he believes the eval of Rybka is identical to Fruit, without further comment?
You always want to see facts, so please, before you draw a conclusion, give us some facts.

When I look at the (far from complete) list of chess engines released in recent years at http://wbec-ridderkerk.nl/html/enginesindex.htm, I wonder how many of these engines use setjmp()? 10 of 200? More?
I think it is still a good indication, not a proof.
I do not know but movei used setjmp() at the beginning like tscp
and I simply learned from the code of tscp.

I got rid of setjmp later because I read people said it is not good
so it is not clear how many engines use setjmp.
old movei uses it
new movei does not use it and you can download both old movei and new movei from wbec site.

Uri
Let me get this clear.

You did not look up what setjmp() does. You saw it in TSCP, looked how it was used, and then used it the same way in Movei.

Now Uri, tell me.

Do you call that copying of ideas or copying code?

Tony
This is not correct.
I looked at what setjmp() does

In your hypothetical situation it is copying code and not copying ideas, but your hypothetical case did not happen.

Uri
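
For readers who have not met the technique being argued over, here is a minimal sketch in C of the "brutal" exit described in the quoted post: a recursive search that leaves every level at once with longjmp() when a budget runs out. Everything in it is an assumption made for illustration. The names run_search, search, node_limit and the node-count "timeout" are hypothetical and are not taken from TSCP, Fruit, Movei or any other engine.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* All names are hypothetical; none come from a real engine. */
static jmp_buf search_abort;   /* jump target, set before the search starts */
static long nodes_searched;    /* node counter standing in for a clock check */
static long node_limit;        /* the "timeout", expressed as a node budget */

/* Toy recursive search: two children per node, no chess logic at all. */
static int search(int depth)
{
    assert(depth >= 0);
    if (++nodes_searched > node_limit)
        longjmp(search_abort, 1);   /* abandon every recursion level at once */
    if (depth == 0)
        return 0;
    int best = 0;
    for (int child = 0; child < 2; child++) {
        int score = search(depth - 1);
        if (score > best)
            best = score;
    }
    return best;
}

/* Returns true if the search completed, false if it was cut off. */
bool run_search(int depth, long limit)
{
    nodes_searched = 0;
    node_limit = limit;
    if (setjmp(search_abort) != 0)
        return false;               /* we arrived here through longjmp() */
    (void)search(depth);
    return true;
}
```

A depth-3 tree in this toy has 15 nodes, so run_search(3, 15) finishes while run_search(3, 14) is cut off on the last node. Note that setjmp() only marks the return point; it is longjmp() that performs the jump.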
bob

Re: setjmp() - another one

Post by bob »

I don't have any idea what you mean. I will stand by the statements I have made, whatever happens. If Rybka is proven to not be a copy, I will certainly accept that. But until I see a logical explanation for identical blocks of code, I can't imagine what that explanation might be.

My input so far: duplicate blocks of code do not occur naturally and by chance, in any software project that contains more than a few dozen lines of code, and even in those it is _very_ rare.

setjmp() is a very odd way of dealing with search termination. It has always been considered bad programming practice, it is difficult to understand, it is difficult to predict the side-effects it will cause, and it is used in only one program I have access to, namely fruit. It is rare. Using it does not mean it was copied, but it is just another piece of evidence that adds to the rest.

Those have been my issues from the beginning. I want to see whether this is true or not, because I have played in events with Rybka and would be no happier to determine it is a copy than I was for any of the crafty copies I personally found and exposed. Any more than the Tour de France participants are happy to discover past and present competitors are guilty of obtaining unfair advantage by doping or HGH use.
bob

Re: setjmp() - another one

Post by bob »

CThinker wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed it's no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
I don't think there is anything suspicious with using setjmp(). I also use it in Thinker. I used to use try/catch, just because it frees up all the objects that I have created, but after I did away with dynamic object allocation, I settled for the lightweight setjmp.

I actually have a different view from that of Bob. I think that not using setjmp is the lousier way of coding something like a chess engine. After each call to Search/Research/QSearch, you need to check for search termination. That is a lot of code. Way too much for my taste.
I'm not going to list all the known issues, but in classical programming and software engineering, returning is the proper way to handle this. If you have a global board state as many do, who is going to restore that? Or are you going to the trouble of copying the board state from ply to ply which is a horrible performance hit?


I just did a quick look of the Crafty code, and it tests 'abort_search' 8 times inside the Search() function. That's 8 identical checks sprinkled all over a single function.

Contrast that with the Thinker code where I only check once at the start of Search(), and then do a longjmp.

Crafty code (note how all that checking has really nothing to do with chess search logic, but now it makes it difficult to read the real logic).

Code:

Search()
{
    if (terminate search) abort_search = true;
    if (do null move) {
        Search();
        if (abort_search) return 0;
    }
    if (do IID) {
        Search();
        if (abort_search) return 0;
        if (re-search) {
            Search();
            if (abort_search) return 0;
        }
    }
    ... // 5 more calls to Search() follow
}
Thinker code:

Code:

Search()
{
    if (terminate search) longjmp();
    if (do null move) {
        Search();
    }
    if (do IID) {
        Search();
        if (re-search) {
            Search();
        }
    }
    ... // 5 more calls to Search() follow
}
You do realize that there is a huge difference between code that is shorter, and code that is easier to understand and debug, of course?
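
The question raised in this post, who restores a global board state after the jump, can be made concrete with a small C sketch. It is a toy under stated assumptions, not code from Crafty or Thinker: the "board" is just a counter of moves made and not yet unmade, and the names moves_on_board, search_flag, search_jump and the two leftover_after_* drivers are hypothetical.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical toy state; not taken from Crafty, Thinker, or any engine. */
static int  moves_on_board;    /* global "board": moves made but not unmade */
static long nodes, node_limit; /* node budget standing in for a time check */
static bool abort_search;
static jmp_buf bailout;

static void make_move(void)   { moves_on_board++; }
static void unmake_move(void) { assert(moves_on_board > 0); moves_on_board--; }

/* Return-based abort: a check after each call, but every level unmakes. */
static int search_flag(int depth)
{
    if (++nodes > node_limit) { abort_search = true; return 0; }
    if (depth == 0) return 0;
    for (int i = 0; i < 2; i++) {
        make_move();
        (void)search_flag(depth - 1);
        unmake_move();                /* still runs while aborting */
        if (abort_search) return 0;
    }
    return 0;
}

/* longjmp-based abort: one check, but pending unmake_move() calls are skipped. */
static int search_jump(int depth)
{
    if (++nodes > node_limit) longjmp(bailout, 1);
    if (depth == 0) return 0;
    for (int i = 0; i < 2; i++) {
        make_move();
        (void)search_jump(depth - 1);
        unmake_move();
    }
    return 0;
}

/* Each driver reports how many moves remain on the global board afterwards. */
int leftover_after_flag(int depth, long limit)
{
    moves_on_board = 0; nodes = 0; node_limit = limit; abort_search = false;
    (void)search_flag(depth);
    return moves_on_board;
}

int leftover_after_jump(int depth, long limit)
{
    moves_on_board = 0; nodes = 0; node_limit = limit;
    if (setjmp(bailout) == 0)
        (void)search_jump(depth);
    return moves_on_board;
}
```

With a budget of 3 nodes at depth 4, the flag version leaves the board clean, while the jump version leaves three moves on it, because the jump happens three plies down. An engine that jumps out therefore has to repair the state some other way, for example by restoring a saved copy of the root position; whether any particular engine does so is not something this sketch can show.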
bob

Re: setjmp() - another one

Post by bob »

Graham Banks wrote:
Tony wrote: Let me get this clear.

You did not look up what setjmp() does. You saw it in TSCP, looked how it was used, and then used it the same way in Movei.

Now Uri, tell me.

Do you call that copying of ideas or copying code?

Tony
Hmmm - so now you're after Uri as well. How many other engine authors are going to be brought before the kangaroo court?

As for private engines and their roots, how is anybody to know?
The point is, it doesn't matter. Copied is wrong. Are some going to get away with it? Certainly. Does that mean then that everybody should be given a free pass? I personally don't think so.
bob

Re: setjmp() - another one

Post by bob »

chrisw wrote:
bob wrote:
Your logic says this: You can't use fingerprints, because there is certainly a possibility that two people can have the same fingerprint, since there is not an infinite variety of ways for the loops and whorls to form. You can't use DNA because we _know_ there are a finite number of DNA protein combinations, and a finite number means duplicates _must_ be possible. You can't use photographic evidence because we know there are twins, doctored photos, etc. You can't use eyewitness testimony because humans make visual identification mistakes all the time
Hahaha!! Very funny Bob.

Don't even try and suggest you're using fingerprints, DNA or photographic evidence.

You have a couple of hammers and one rather ancient ducking stool. That's it.
You see what you want to see. Right now I see several things that do not appear to be explainable by simple random chance. And each day, I see something new that falls into that same bucket. Perhaps we will see things removed from the bucket as they are logically explained. But not so far, at least.
bob

Re: setjmp() - another one

Post by bob »

chrisw wrote:
bob wrote:
chrisw wrote:
bob wrote:
chrisw wrote:
Zach Wegner wrote:
chrisw wrote:Entirely agreed it's no good solution, but that's not the point. You said it was rare/unique whatever. It isn't. And therefore it's another nail in the probably used the same code coffin. It isn't.
You don't think it is rare? What, five total engines, one which has since been removed, out of how many hundreds??

Bob sums it up perfectly: "I never claimed it to be the ultimate proof. I simply said it was another suspicious detail because it is such a lousy way of writing a program."

Some people will always refuse to accept evidence when it is presented.
And some people don't like their evidence being challenged ;-)

It's spurious to say 5 out of 500 because no exhaustive search has been done on the 500. It's 5 only for the unscientific reason that two programmers happened to remember some past detail.

As to how many in total? Difficult. Some by random because they were old engines, SMP wasn't a problem and jumping out of the Search is a lazy but effective method, if politically incorrect. Some because they read TSCP. Some not because they read Crafty. What does Gnu do?

To pillory a programmer and his work it is necessary to produce evidence that stands beyond all reasonable doubt. This use of setjmp() is no longer in the beyond all reasonable doubt camp. Sorry about that, but that's the way it can go in adversarial investigation processes like this one.

Each point you bring will be challenged. Some will fall and some will stand, presumably. Do you want it any other way? Just blind acceptance because you're getting frustrated otherwise? If it were you, Zach, you'd expect a vigorous defence and counter to each and every allegation and piece of evidence. Would you not?

setjmp() is not even a suspicious detail. It's history. Sorry.
It is suspicious to those of us that understand software development. Plagiarism involves copying good pieces of code most often, but when it is about copying a bad piece of code, it is just stronger evidence that copying was done.
Six users of setjmp() now, another "bad" programmer using "bad" techniques crawled out of the woodwork this afternoon.

Perhaps there are several ways to skin a cat, Bob? Your way isn't necessarily the "unique", "real programmer" and most "interesting" one? ;-)

Could it be so?
I missed #6. Movei was copied from TSCP so it does not count. It appears that Rybka/Strelka were copied from fruit. So that is two legitimate users. Which program did I miss? I did post earlier that gnu does not use setjmp. I also checked arasan, glaurung and a couple of others I have on my box. Nothing. I use fruit 2.1 on my cluster testing, it has it as we have heard and I verified it as well.

You mentioned a program you "thought" used it, but I have been counting concrete cases. And so far, there appear to be two legitimate users of setjmp()/longjmp(), fruit and TSCP.

When the total number reaches say 10, out of probably at least 500 chess engines, then that will be 1 in 50 which is still rare, but becoming "less" rare. But we are nowhere near that yet.
Bob,

If the best you can do is argue over the relative rarity or otherwise of setjmp(), frankly you should pack up now and go home.
I didn't start the thread. I said setjmp() is a rarely used programming construct, and it is. Then everyone starts going round and round the mulberry bush chasing their tails.

I just noted that I saw the suspect on the same street, at about the same time as the crime was committed. I didn't even begin to suggest that was enough evidence to connect him to the murder. But now we can at least place him around the crime scene. Other evidence is needed to complete the case. I'm waiting until I see more. But every tiny clue is still important. If something happens in only 3 out of 500 (or even 20 out of 500) programs, we have just taken that "duplicate code written by chance" and cut the odds of that significantly...