Dirt wrote:Milos wrote:This is all very fine but where you draw the line for copyright?
If I take Knuth's code and translate it to C (and change variable names in the process) do I then hold a copyright? Yes, no? What if change change data types so I compare pointers instead of numbers. Is that enough to claim my own copyright? What if I swap both key and value instead of just value? Is it now sufficient? How can anyone tell when it is?
If original code contained a bug (for example doesn't make swaps in last iteration), and my also contains it, but I made all the changes (translation, changing data types, changing swapping, changing sort direction, changing sort index bounds) does this invalidates my copyright?
You think a court judge would have consistent opinion on that?
Personally, I think the bubble sort is too short for copyright of the object code. It's been implemented thousands of times, and most simple variations have all been tried. In other words, I'd put in the public domain unless there was something very unusual about it. In the source code, though, if you used non-simple variable names it might be copyrightable.
Every byte of Crafty is in the public domain, but not every sequence of bytes. For instance, I'll bet there is an 'A' in Crafty but I can use that snippet all I like: AAA. No violation. At some point the expression of an idea, whether in source, object or general plan, becomes so unlikely to think of independently that it becomes copyrightable. Borderline cases take a judge to decide, and he will listen to experts. At some point it's a crapshoot, though.
Also, the "type" of bug is important.
In the Crafty/Rybka case, two specific bugs have been mentioned.
1. In the Edwards tablebase format, for the kp vs kp database (we did not have any 5 piece files with 2 pawns or more) he did not consider en passant captures when he generated them. If you probed in a position where an ep capture is either possible or will be possible in the future, you would get a wrong answer. I added code in Crafty to check for (a) pawns are on adjacent files, else no EP capture is possible; (b) pawns have not "passed each other"; (c) one of the two pawns is on its original square so it can advance two squares at once to create the EP possibility. If all of those were true, the EGTB probes into kpkp were disabled. Eugene came along much later and did this correctly. In fact, his EGTBs pre-dated Rybka by several years. And his handled EP captures properly so the tests above were not needed. At the point we finished testing his code and settled on the compression blocksize, I replaced the Edwards tables with his and the Edwards tables were no longer available from my ftp site. Later, along came Rybka, supposedly written from scratch, and initially using the Nalimov EGTBs since they were all that were available, and Rybka 1.6.1 clearly has egtb.cpp included. But it also has the code above to handle EP in Edwards format. How did it get there? Why would he write that on his own when that tablebase format was long since discarded and _everybody_ used either Nalimov or Thompson, or else wrote their own (Rybka 1.6.1 used Nalimov). Probability of that happening? zero.
2. I have always had a function in Crafty, modelled after the Slate/Atkin "clean-up" idea that forces mate when pawns are gone and one side has mating material. Simple idea is to drive the king to the edge of the board and then let the search find the forced mate. And over time, I made it more "inclusive" so that it would work in some unexpected cases. Which had exceptions. Originally, I always called this function and let it make the decision "can I handle this position or not?" If not, it returned 99999, Evaluate() realized that this meant that EvaluateMate() was refusing to handle this position for some reason, and it did a normal eval. Later, to avoid the procedure call overhead, I moved the test up to where I called EvaluateMate() so that I didn't call it if it didn't apply. But the check for the 99999 return was left in Evaluate(). Since EvaluateMate() never returned 99999, that branch was predicted 100% of the time and it didn't hurt, but it was certainly dead code. Which also appears at the _same_ point in Rybka's Evaluate(). What is the probability of Rybka having an EvaluateMate() that is _identical_ to Crafty's, and then having that "ms = EvaluateMate(); if (ms == 99999) break;" code at the point where it was called? When there is _zero_ code in EvaluateMate() that can return 99999 (or any other large odd number, returning 99998 would actually break the program completely since in Crafty +infinity = 32767... The probability of that being "original code" is zero.
So it is not the presence of "any bug" in two implementations that is the red flag. It is the presence of the same bug in both, where the first occurrence can be explained by the author (me) and it is clear that nobody else would have written exactly the same code, with the same return values, and then tested for a return value that was impossible for the code to produce...
There are others in our report...