Fabien's open letter to the community

Gian-Carlo Pascutto
Posts: 1101
Joined: Sat Dec 13, 2008 6:00 pm

Re: Fabien's open letter to the community

Post by Gian-Carlo Pascutto » Mon Jan 31, 2011 9:50 am

alpha123 wrote: I'm quite surprised he doesn't use some form of version control. Even I do for some things, and I'm 14....
Sjeng existed somewhere starting from 1998-1999, and didn't get put under version control until just after DS 2.5, which was apparently near the end of 2007. So I'm not surprised some other programmers don't (yet) use it :) There is no pressing need for single-developer projects, and I hardly found real benefits in using it until I started to use git (was subversion before).

It's easy to lose important data even when it's supposedly backed up. I lost my old email archive once, which was and still is very painful. And I know Bob lost the old crafty sources for many years before they turned up on some ancient tape. (IIRC, Bob can probably correct the story if he wants)

So I have no problem believing that Vas really lost the exact source used for Rybka 3.

tmokonen
Posts: 924
Joined: Sun Mar 12, 2006 5:46 pm
Location: Vancouver

Re: Fabien's open letter to the community

Post by tmokonen » Mon Jan 31, 2011 10:20 am

The source for later versions of Strelka has been published. The source for the original Winboard version of Strelka, 1.8, has never been published.

bob
Posts: 20355
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob » Mon Jan 31, 2011 4:05 pm

mhull wrote:
Graham Banks wrote:To me, an FSF decision would be good enough to give finality, which I was really hoping was what we all wanted.
But all you needed was an accusation against non-Rybka products. But for the Rybka product itself, an accusation isn't good enough, even if it's from Fabien. It is your apparent double standard that drives people up the wall and generates so much heat (and entire alternate chess forums). People don't like unfair standards, especially from people who put themselves in a position of objectivity, such as running a rating list or being a moderator.

People are wrapping their heads in duct tape trying to keep it from exploding every time you say something like this.
Hey! that is _my_ line, if I recall. :) (Borrowed from talk show host Glenn Beck)

bob
Posts: 20355
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob » Mon Jan 31, 2011 4:10 pm

Gian-Carlo Pascutto wrote:
alpha123 wrote: I'm quite surprised he doesn't use some form of version control. Even I do for some things, and I'm 14....
Sjeng existed somewhere starting from 1998-1999, and didn't get put under version control until just after DS 2.5, which was apparently near the end of 2007. So I'm not surprised some other programmers don't (yet) use it :) There is no pressing need for single-developer projects, and I hardly found real benefits in using it until I started to use git (was subversion before).

It's easy to lose important data even when it's supposedly backed up. I lost my old email archive once, which was and still is very painful. And I know Bob lost the old crafty sources for many years before they turned up on some ancient tape. (IIRC, Bob can probably correct the story if he wants)

So I have no problem believing that Vas really lost the exact source used for Rybka 3.
Most never turned up. Several different people had apparently kept an old version here and there. And they emailed them to me. But at least 90% of old versions were lost. Including the last 3-4 years of Cray Blitz source that was on the same machine. I did find a 1990 or so copy on an old tape.

I really missed all the old log files that we lost from every tournament we played in starting in 1976. Very painful to see a lot of history disappear like that for us...

bob
Posts: 20355
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob » Mon Jan 31, 2011 4:11 pm

Milos wrote:
michiguel wrote:The claim I saw once (indirectly from another person, not from the horse's mouth) was that he lost version 3.00.

Accidents happened and it is possible to lose one "specific" version. Particularly if only one guy works on the code.
Ok, an IQ/objectivity test for you:
How strongly do you believe this was the case with Rybka 3 source on the scale from 0 to 100%?
I'm just asking for your sincere personal opinion. Even though refusing to answer the question is an option, doing so would only mean you are too embarrassed to admit.
I would say certainly possible, but not very likely. Hits too close to home to say impossible, however.

Laskos
Posts: 8244
Joined: Wed Jul 26, 2006 8:21 pm

Re: Fabien's open letter to the community

Post by Laskos » Mon Jan 31, 2011 11:44 pm

Adam Hair wrote:
I don't think anybody doubts that Fruit has been a large
influence.

Average linkage between groups is more robust than complete linkage.

I have made two graphs, using Systat and your data, as well as Average
and Pearson. I changed the diagonal from 100% to 75%. That was the
cause of the difference in scale between your graphs and mine.

Image

As you can see, the graph is basically identical to yours. I am just doing
this so that we both know that Systat and SPSS will produce the same
results.

For the second graph, I removed Houdini, Strelka, Ivanhoe, Rybka 4,
and Naum 4.2.

Image

I did this to point out, as you also did, that some care has to be given
to which engines are included. The clusters can change with the inclusion
and exclusion of engines. My belief is, in order to avoid bias as much as
possible, several versions from each engine family should be included.
And as many engine families as possible should be included. Then I
think the clustering analysis can give us a true picture.
Wow, your graphs are almost identical to mine, therefore we can cross-check our results. I will stick to "Average Linkage between Groups" and "Pearson correlation (bivariate correlations) measure", as these give me the more stable results over all the range, from main branches to individual engines.
I added Crafty 20.14 and some engines claimed to be related to it, and, as expected, they do indeed seem related :). Now the Crafty family is represented as well. My Crafty test was a pain: with a very slow WB2UCI interface, it took me 5 hours to test Crafty at 100 ms.

Image

A and B are two groups of very unrelated engines. Group A might be related in some degree to Fruit. Group B seems not related at all to Fruit (if the Fruit hypothesis stands). Crafty, Fruit 2.1, and Rybka 3 families are named.
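
For readers unfamiliar with the two linkage rules compared above, here is a minimal sketch of their definitions. It is not the SPSS or Systat procedure used for the graphs. The function names and the row-by-row matrix layout are assumptions made for illustration, and the distance matrix is assumed to be computed beforehand (for example as 1 minus the Pearson correlation between two engines).

```c
#include <stddef.h>

/* Average linkage: mean of all pairwise distances between the members
   of cluster a and the members of cluster b.
   d is an n-by-n distance matrix stored row by row (assumed layout). */
double average_linkage(const double *d, size_t n,
                       const size_t *a, size_t na,
                       const size_t *b, size_t nb)
{
    double sum = 0.0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            sum += d[a[i] * n + b[j]];
    return sum / (double)(na * nb);
}

/* Complete linkage: the largest pairwise distance between the clusters. */
double complete_linkage(const double *d, size_t n,
                        const size_t *a, size_t na,
                        const size_t *b, size_t nb)
{
    double worst = 0.0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            if (d[a[i] * n + b[j]] > worst)
                worst = d[a[i] * n + b[j]];
    return worst;
}
```

Under complete linkage a single distant pair decides the cluster distance, while under average linkage the same pair only shifts the mean. That is one plausible reading of why the average rule gives the more stable trees reported here.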

Kai

tmokonen
Posts: 924
Joined: Sun Mar 12, 2006 5:46 pm
Location: Vancouver

Re: Fabien's open letter to the community

Post by tmokonen » Tue Feb 01, 2011 1:25 am

Well, don't I have egg on my face. The original Strelka was version 1.0 beta. 1.8 was the first UCI version. Version 2.0 was the first version with source code.

bob
Posts: 20355
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob » Tue Feb 01, 2011 1:27 am

Osipov Jury wrote:About copying, rewriting of code and so on.

In eval-function of Ivanhoe we can find:

Code:

  U = (POSITION->OccupiedBW >> 8) & wBitboardP;
  while (U)
    {
      b = BSF (U);
      MobValue -= PawnAntiMobility;
      BitClear (b, U);
    }

Robert Houdart replaced this with:

Code:

  U = (POSITION->OccupiedBW >> 8) & wBitboardP;
  MobValue -= popcnt(U) * PawnAntiMobility;

Is it plagiarism?
Students are quite good at this kind of obfuscation. One can do the above to improve speed (one popcnt vs. N BSF instructions in a loop) or to make the code appear to be different (both do the same thing, but look significantly different, particularly if you change the variable names as well).

So it could be plagiarism, if this is the work of two different programmers. Or it could just be a simple derivative work where the second was just a better programmer or else paid more attention to speed. I still occasionally find faster ways to do things in Crafty, even though it is already a very fast program after years of optimizations.
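
The claim that both fragments compute the same value can be checked with a small standalone sketch. Everything named here is hypothetical: PAWN_ANTI_MOBILITY is a made-up constant, and penalty_loop, popcount64 and penalty_popcnt are illustrative helpers, not code from Ivanhoe or Houdini. A portable bit trick stands in for the BSF and popcnt instructions.

```c
#include <stdint.h>

/* Hypothetical per-pawn penalty; the real engine value is not given here. */
enum { PAWN_ANTI_MOBILITY = 3 };

/* Loop form: subtract the penalty once for every set bit.
   Clearing the lowest set bit plays the role of BSF + BitClear. */
int penalty_loop(uint64_t u)
{
    int value = 0;
    while (u) {
        value -= PAWN_ANTI_MOBILITY;
        u &= u - 1; /* clear the lowest set bit */
    }
    return value;
}

/* Portable population count, standing in for a hardware popcnt. */
int popcount64(uint64_t u)
{
    int n = 0;
    while (u) {
        u &= u - 1;
        n++;
    }
    return n;
}

/* Popcount form: one multiplication instead of a loop of subtractions. */
int penalty_popcnt(uint64_t u)
{
    return -popcount64(u) * PAWN_ANTI_MOBILITY;
}
```

For any bitboard the two functions return the same number, since the loop body runs exactly once per set bit. The sketch says nothing about who wrote which version, only that the rewrite is behavior-preserving.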

bob
Posts: 20355
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Fabien's open letter to the community

Post by bob » Tue Feb 01, 2011 1:48 am

Dann Corbit wrote:
bob wrote:
Dann Corbit wrote:
Don wrote:
Houdini wrote:
Don wrote:I think you are correct. I personally AM a fan of rewrites but I think there are many chess authors that don't ever rewrite, or have only done so once in years.
Rewriting for the sake of rewriting doesn't serve any purpose at all.
Software with a good architecture will survive many years and many changes.

Robert
I don't rewrite just for the sake of it.

Don
Typical reasons for rewriting:
1. Change of language (e.g. C to C++)
2. Change of underlying data structures (e.g. mailbox to bitboard)
3. Increased programming ability (every decade, my programming skills increase quite a bit, so the code I wrote 20 years ago will benefit considerably from a rewrite).
4. Lost code (yes, it does happen -- I have sent code back to chess authors who lost their original code and who had also sent their code to me on several occasions, for instance).
I don't agree with those.

(1) can be solved via translation. A good program can be translated to a new language without rewriting a thing...
Basic to C++?
Pascal to Java?
Fortran to C?
I have seen the latter two. And Zortech used to sell the Fortran to C translator. I probably still have a copy in my office. I used it to make the original translation of the CB source to C back in 1994. The code was not very clean looking, because it was just a one-for-one translation to C. And since FORTRAN didn't have pointers in the f77 standard, the resulting C code did not use pointers. So it needed a lot of cleaning up to become "decent C". But the output of the translator would compile and run and produce identical results to the original FORTRAN program.

I think there is even a GNU Fortran to C translator. I have not used it, but have noticed it when installing several different Linux distros. It is called f2c or something like that. But I have no info on how well it works...


(2) I did the mailbox-to-bitboard translation with Crafty. It only affects a part of the code. Search is unchanged. Move ordering is unchanged. Several other things are completely board representation independent.
Crafty is a giant machine with nearly 40K lines of code. At ten lines per hour and $100/hour that would translate to $400,000 worth of work. It would be a titanic effort, therefore, to rewrite crafty. Smaller programs of a few hundred lines would be far more likely candidates for such an effort.
I can't speak for everyone, but I produce a lot more than 10 lines per hour. Don't forget, half of those 40K + lines of code are simply comments that don't need translation, and also help in understanding... I would suspect it would take a year of solid work to rewrite Crafty from scratch, assuming I could access the comments but not the source instructions, to get it back to something close to the current state. Might take longer to get the speed back to where it is...


(3) I don't think that is common. One does not rewrite _everything_ just because they are a better programmer now than 10 years ago. You might rewrite _parts_. But not the whole thing. That is a huge waste of time and effort.
I have done it, but typically with small projects. TSCP needs a complete rewrite, for instance.
(4) Never had that happen to the current version of code since I always keep duplicates and backups scattered around.
It is the result of carelessness when code is lost, but it does happen. I suspect that you have lost at least one snapshot of a released version of Cray Blitz or Crafty (at least in the early stages), because with hobby projects that are not producing revenue we are sometimes not quite as careful as when a big pile of money is at stake.
I lost about 80% of everything in 1995. Not the current version, as it is always kept in multiple locations such as my office box, my laptop, and a central NFS file server on top of that. But I kept old versions only on my office box, and a disk crash, followed by the discovery that all backups were unreadable, lost a lot of old stuff. I now have everything stored in three places, with a simple mechanism to scatter the duplicates to the right places after a change is made...

playjunior
Posts: 338
Joined: Thu Jun 21, 2007 10:53 pm

Re: Fabien's open letter to the community

Post by playjunior » Tue Feb 01, 2011 8:13 am

Bob, a big difference between 1995 and now is that now you can encrypt, zip and email the source to yourself and it will take you 3 minutes. It is that simple.

I read a long time ago that when Strelka came out, someone asked Vasik whether his code might have been stolen. He said no, because he keeps it on a dedicated computer with no internet connection.

A person who takes such precautions would surely take a minute or two to make sure he has a backup copy.
