Fabien's open letter to the community

Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Fabien's open letter to the community

Post by Laskos »

Adam Hair wrote:
I don't think anybody doubts that Fruit has been a large
influence.

Average linkage between groups is more robust than complete linkage.

I have made two graphs, using Systat and your data, with average
linkage and Pearson correlation. I changed the diagonal from 100% to 75%. That was the
cause of the difference in scale between your graphs and mine.

Image

As you can see, the graph is basically identical to yours. I am just doing
this so that we both know that Systat and SPSS will produce the same
results.

For the second graph, I removed Houdini, Strelka, Ivanhoe, Rybka 4,
and Naum 4.2.

Image

I did this to point out, as you also did, that some care has to be given
to which engines are included. The clusters can change with the inclusion
and exclusion of engines. My belief is, in order to avoid bias as much as
possible, several versions from each engine family should be included.
And as many engine families as possible should be included. Then I
think the clustering analysis can give us a true picture.
Wow, your graphs are almost identical to mine, so we can cross-check our results. I will stick to "Average Linkage between Groups" and the "Pearson correlation (bivariate correlations)" measure, as these give me the more stable results across the whole range, from main branches to individual engines.
I added Crafty 20.14 and some engines claimed to be related to it, and, as expected, they do indeed seem related :). Now the Crafty family is represented as well. My Crafty test was a pain: with a very slow WB2UCI interface, it took me 5 hours to test Crafty at 100 ms.
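
For readers unfamiliar with the linkage options discussed above, here is a minimal C sketch of how average linkage and complete linkage score the distance between two groups. It is only an illustration under stated assumptions: the function names and the flat row-major matrix layout are hypothetical, the dissimilarities are assumed to be something like 1 minus a similarity or correlation value, and nothing here is claimed to match what Systat or SPSS do internally.

```c
#include <assert.h>
#include <stddef.h>

/* d is an n x n dissimilarity matrix stored row-major (hypothetical layout).
 * a and b hold the engine indices belonging to the two groups. */

/* Average linkage: mean of all pairwise dissimilarities between the groups. */
double average_linkage(const double *d, size_t n,
                       const int *a, size_t na,
                       const int *b, size_t nb)
{
    assert(na > 0 && nb > 0);
    double sum = 0.0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            sum += d[(size_t)a[i] * n + (size_t)b[j]];
    return sum / (double)(na * nb);
}

/* Complete linkage: the single largest pairwise dissimilarity. */
double complete_linkage(const double *d, size_t n,
                        const int *a, size_t na,
                        const int *b, size_t nb)
{
    assert(na > 0 && nb > 0);
    double worst = d[(size_t)a[0] * n + (size_t)b[0]];
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++) {
            double v = d[(size_t)a[i] * n + (size_t)b[j]];
            if (v > worst)
                worst = v;
        }
    return worst;
}
```

Complete linkage is decided by one extreme pair, so adding or removing a single engine can move a whole branch, while average linkage spreads that influence over every pair. That is one plausible reading of why the average method looks more stable here.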

Image

A and B are two groups of very unrelated engines. Group A might be related to some degree to Fruit. Group B seems not related at all to Fruit (if the Fruit hypothesis stands). The Crafty, Fruit 2.1, and Rybka 3 families are named.

Kai
tmokonen
Posts: 1296
Joined: Sun Mar 12, 2006 6:46 pm
Location: Kelowna
Full name: Tony Mokonen

Re: Fabien's open letter to the community

Post by tmokonen »

Well, don't I have egg on my face. The original Strelka was version 1.0 beta. 1.8 was the first UCI version. Version 2.0 was the first version with source code.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob »

Osipov Jury wrote:About copying, rewriting of code and so on.

In the eval function of Ivanhoe we can find:

Code:

  U = (POSITION->OccupiedBW >> 8) & wBitboardP;
  while (U)
    {
      b = BSF (U);
      MobValue -= PawnAntiMobility;
      BitClear (b, U);
    }

Robert Houdart replaced this with:

Code:

  U = (POSITION->OccupiedBW >> 8) & wBitboardP;
  MobValue -= popcnt(U) * PawnAntiMobility;

Is it plagiarism?
Students are quite good at this kind of obfuscation. One can do the above to improve speed (one popcnt vs N BSF instructions in a loop) or to make the code appear to be different (both do the same thing, but look significantly different, particularly if you change the variable names as well)...

So it could be plagiarism, if this is the work of two different programmers. Or it could just be a simple derivative work where the second was just a better programmer or else paid more attention to speed. I still occasionally find faster ways to do things in Crafty, even though it is already a very fast program after years of optimizations.
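
To make the equivalence concrete, here is a minimal C sketch of the two forms side by side. The names `penalty_loop`, `penalty_popcnt`, `check_equivalence` and the constant value are hypothetical stand-ins, not taken from Ivanhoe or Houdini, and `__builtin_ctzll` / `__builtin_popcountll` assume GCC or Clang.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the engine's PawnAntiMobility constant. */
enum { PAWN_ANTI_MOBILITY = 3 };

/* Loop form: subtract the penalty once per set bit, clearing each bit in turn. */
int penalty_loop(uint64_t blocked)
{
    int mob_value = 0;
    while (blocked) {
        int b = __builtin_ctzll(blocked);  /* index of lowest set bit, like BSF */
        mob_value -= PAWN_ANTI_MOBILITY;
        blocked &= ~(1ULL << b);           /* like BitClear(b, U) */
    }
    return mob_value;
}

/* Popcount form: count the set bits once and multiply. */
int penalty_popcnt(uint64_t blocked)
{
    return -__builtin_popcountll(blocked) * PAWN_ANTI_MOBILITY;
}

/* Both forms must agree for any bitboard. */
void check_equivalence(uint64_t blocked)
{
    assert(penalty_loop(blocked) == penalty_popcnt(blocked));
}
```

The two functions return the same value for every input, so the change says nothing by itself about who wrote it. It only shows that the second version does the work in one instruction instead of a loop.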
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob »

Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:
Don wrote:
Houdini wrote:
Don wrote:I think you are correct. I personally AM a fan of rewrites but I think there are many chess authors that don't ever rewrite, or have only done so once in years.
Rewriting for the sake of rewriting doesn't serve any purpose at all.
Software with a good architecture will survive many years and many changes.

Robert
I don't rewrite just for the sake of it.

Don
Typical reasons for rewriting:
1. Change of language (e.g. C to C++)
2. Change of underlying data structures (e.g. mailbox to bitboard)
3. Increased programming ability (every decade, my programming skills increase quite a bit, so the code I wrote 20 years ago will benefit considerably from a rewrite).
4. Lost code (yes, it does happen -- I have sent code back to chess authors who lost their original code and who had also sent their code to me on several occasions, for instance).
I don't agree with those.

(1) can be solved via translation. A good program can be translated to a new language without rewriting a thing...
Basic to C++?
Pascal to Java?
Fortran to C?
I have seen the latter two. And Zortech used to sell the Fortran to C translator. I probably still have a copy in my office. I used it to make the original translation of the CB source to C back in 1994. The code was not very clean looking, because it was just a one-for-one translation to C. And since FORTRAN didn't have pointers in the f77 standard, the resulting C code did not use pointers. So it needed a lot of cleaning up to become "decent C". But the output of the translator would compile and run and produce identical results to the original FORTRAN program.

I think there is even a GNU fortran to C translator. I have not used it, but have noticed it in installing several different Linux distros. Something like f2c or something. But no info on how well it works...


(2) I did the mailbox-to-bitboard translation with Crafty. It only affects a part of the code. Search is unchanged. Move ordering is unchanged. Several other things are completely board representation independent.
Crafty is a giant machine with nearly 40K lines of code. At ten lines per hour and $100/hour that would translate to $400,000 worth of work. It would be a titanic effort, therefore, to rewrite crafty. Smaller programs of a few hundred lines would be far more likely candidates for such an effort.
I can't speak for everyone, but I produce a lot more than 10 lines per hour. Don't forget, half of those 40K + lines of code are simply comments that don't need translation, and also help in understanding... I would suspect it would take a year of solid work to rewrite Crafty from scratch, assuming I could access the comments but not the source instructions, to get it back to something close to the current state. Might take longer to get the speed back to where it is...


(3) I don't think that is very common. One does not rewrite _everything_ just because they are a better programmer now than 10 years ago. You might rewrite _parts_. But not the whole thing. That is a huge waste of time and effort.
I have done it, but typically with small projects. TSCP needs a complete rewrite, for instance.
(4) Never had that happen to the current version of code since I always keep duplicates and backups scattered around.
It is the result of carelessness when code is lost, but it does happen. I suspect that you have lost at least one snapshot of a released version of Cray Blitz or Crafty (at least in the early stages) because with hobby projects that are not producing revenue we sometimes are not quite as careful as when a big pile of money is at stake.
I lost about 80% of everything in 1995. Not the current version, as it is always kept in multiple locations such as my office box, my laptop, and a central NFS file server on top of that. But old versions I kept only on my office box, and a disk crash, followed by the later discovery that all backups were unreadable, cost me a lot of old stuff. I now have everything stored in three places, with a simple mechanism to scatter the duplicates to the right places after a change is made...
playjunior
Posts: 338
Joined: Fri Jun 22, 2007 12:53 am

Re: Fabien's open letter to the community

Post by playjunior »

Bob, a big difference between 1995 and now is that now you can encrypt, zip and email the source to yourself and it will take you 3 minutes. It is that simple.

I read a long time ago that, when Strelka came out, someone asked Vasik whether it might be possible that his code got stolen; he said no, because he keeps it on a dedicated computer with no internet connection.

A person who takes such precautions would surely take a minute or two to take care he has a backup copy.
Sven
Posts: 4052
Joined: Thu May 15, 2008 9:57 pm
Location: Berlin, Germany
Full name: Sven Schüle

Re: Fabien's open letter to the community

Post by Sven »

playjunior wrote:Bob, a big difference between 1995 and now is that now you can encrypt, zip and email the source to yourself and it will take you 3 minutes. It is that simple.

I read a long time ago that, when Strelka came out, someone asked Vasik whether it might be possible that his code got stolen; he said no, because he keeps it on a dedicated computer with no internet connection.

A person who takes such precautions would surely take a minute or two to take care he has a backup copy.
But not necessarily of every single version you create. And with some bad luck you fail to make that backup copy for a while.

Sven
playjunior
Posts: 338
Joined: Fri Jun 22, 2007 12:53 am

Re: Fabien's open letter to the community

Post by playjunior »

Sven Schüle wrote:
playjunior wrote:Bob, a big difference between 1995 and now is that now you can encrypt, zip and email the source to yourself and it will take you 3 minutes. It is that simple.

I read a long time ago that, when Strelka came out, someone asked Vasik whether it might be possible that his code got stolen; he said no, because he keeps it on a dedicated computer with no internet connection.

A person who takes such precautions would surely take a minute or two to take care he has a backup copy.
But not necessarily of every single version you create. And with some bad luck you fail to make that backup copy for a while.

Sven
Neh. They had Rybka 3 beta available for some time, they were testing/tuning it, for what, months? Not a single backup copy in that timeframe? Who does that?

Edit: And then you make the release version and start selling it without bothering to take a backup.
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: Fabien's open letter to the community

Post by Gian-Carlo Pascutto »

playjunior wrote: Neh. They had Rybka 3 beta available for some time, they were testing/tuning it, for what, months? Not a single backup copy in that timeframe? Who does that?
They might very well have the source for some of the betas. I think what was said was that the exact one that became the release wasn't saved. Sloppy, but by no means impossible.
playjunior
Posts: 338
Joined: Fri Jun 22, 2007 12:53 am

Re: Fabien's open letter to the community

Post by playjunior »

Gian-Carlo Pascutto wrote:
playjunior wrote: Neh. They had Rybka 3 beta available for some time, they were testing/tuning it, for what, months? Not a single backup copy in that timeframe? Who does that?
They might very well have the source for some of the betas. I think what was said was that the exact one that became the release wasn't saved. Sloppy, but by no means impossible.
I don't remember the story exactly, but weren't they saying they don't have anything resembling Rybka 3 source left? Can someone refresh us?

And wasn't it 'lost' long after the release?

I just refuse to believe something like this can happen. To Vas, who used a dedicated computer so that no one has access to the code, who obfuscated search info so that the competitors could not guess what is going on under the hood. People who are so concerned about their code/security/whatever back up every day. Automated. And then do it by hand from time to time to be sure.

It just does not compute.
Gian-Carlo Pascutto
Posts: 1243
Joined: Sat Dec 13, 2008 7:00 pm

Re: Fabien's open letter to the community

Post by Gian-Carlo Pascutto »

playjunior wrote: People who are so concerned about their code/security/whatever backup every day. Automated. And then do it by hand from time to time to be sure.
I have to admit I have no argument against clairvoyance.