Uri Blass wrote:bob wrote:Uri Blass wrote:bob wrote:Uri Blass wrote:bob wrote:geots wrote:Dirt wrote:geots wrote:Bob
He's not really a primary player in this at this point. It should be left to Zach and the others who are gathering evidence to present it.
After the stuff I read that he said today, I was shocked. I want to see what he says after a one-on-one with Vas, sort of a Rybka-Crafty matchup, so to speak.
What have I said that is shocking? That you will not "accidentally" produce a couple of hundred lines of identical code here, a couple of hundred lines there? That is all I have stated from the beginning. _IF_ there is duplicate code to any significant extent (not single lines, but blocks of code) then there is no way it is an "accident". If that is shocking, not much I can say. Anybody that deals with large numbers of programming assignments on a regular basis will say the same.
This is no accident but also proves nothing.
It is no accident because programmers do not start from scratch; they start from known ideas that they have read about.
If people wrote chess programs without having read any ideas first, you could expect more differences. But if people learn about bitboards and learn tricks for writing a faster firstone function, then you cannot blame them for not being original and not writing slower code.
My FirstOne() function has been copied by many. It is simply a way of accessing the single instruction BSF. I have repeatedly said that I am not talking about a single line of code, but _blocks_ of identical lines. That is a difference. The examples shown have been 200 lines and up. That will _not_ happen "innocently".
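For readers unfamiliar with the operation, here is a minimal Python sketch of what a bit-scan-forward style helper computes: the index of the lowest set bit in a 64-bit bitboard. The name `first_one` and the least-significant-bit convention are assumptions for illustration only; this is not Crafty's FirstOne() source, which wraps a machine instruction.

```python
def first_one(bitboard: int) -> int:
    """Return the index of the least significant set bit (hypothetical helper).

    This mirrors what the x86 BSF instruction reports for a non-zero operand.
    """
    if bitboard == 0:
        raise ValueError("bitboard is empty; BSF is undefined for zero")
    # bitboard & -bitboard isolates the lowest set bit;
    # bit_length() - 1 converts that single bit into its index.
    return (bitboard & -bitboard).bit_length() - 1
```

Because the whole function reduces to one well-known bit trick, independent authors can easily arrive at near-identical versions of it, which is why it is treated in the thread as a single-line case rather than a block of code.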
firstone is only one example, so it does not seem obvious to me that a few hundred lines of equivalent code are proof of copy and paste without looking at the code (I need to look at the lines to decide).
Here is an example involving different bitboard programs. It does not apply directly to Fruit and Rybka, because Fruit is not a bitboard program.
Based on looking at the Strelka code, one bitboard dictates many others, so if one bitboard is the same, many bitboards are going to be the same.
If you use A1=0, B1=1, ..., H1=7, A2=8, ..., H8=63, you can expect the arrays of bitboards for the squares that the king controls to be the same, and you have 64 numbers (the squares the king controls from A1, the squares the king controls from B1, ...).
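To make the claim above concrete, here is a minimal Python sketch, assuming the A1=0 ... H8=63 numbering described in the post. The names `king_attacks` and `KING_ATTACKS` are hypothetical and are not taken from Strelka, Rybka, or any other engine. Once the square numbering is fixed, the 64 values are fully determined.

```python
def king_attacks(square: int) -> int:
    """Bitboard of squares a king controls from `square` (A1=0, H8=63)."""
    file, rank = square % 8, square // 8
    attacks = 0
    for df in (-1, 0, 1):
        for dr in (-1, 0, 1):
            if df == 0 and dr == 0:
                continue  # the king does not attack its own square
            f, r = file + df, rank + dr
            if 0 <= f < 8 and 0 <= r < 8:  # stay on the board
                attacks |= 1 << (r * 8 + f)
    return attacks

# One entry per square: 64 numbers, identical for any program
# that uses the same square-to-bit mapping.
KING_ATTACKS = [king_attacks(sq) for sq in range(64)]
```

For example, the A1 entry has exactly the bits for B1, A2 and B2 set, whatever code was used to produce it.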
In writing a book, you use the letters a-z also. We are not talking about copying individual characters, or individual numbering schemes, of which there are a finite (and small) number of alternatives (four for numbering squares using bitboards, for example). So let's get back to the topic at hand: duplicate blocks of code in the engine. We are not talking about silly examples like an array of numbers that equate specific chess squares to specific bits. We are talking about parts of an engine, such as the search, the evaluation, communication with the outside world, etc. Simple functions that convert a square number to a two-character algebraic coordinate and such are not what is being discussed.
The same goes for the squares that the knight controls, the squares that the pawns control, bitboards that give you information about blocking squares, and so on.
All these constants can easily be hundreds of lines of code, so you get hundreds of lines that are basically the same.
Nobody is discussing such arrays of constants. This has been dismissed in discussions about things like endgame tables, where everybody is using the same exact everything. Arrays used in evaluation are a different thing because those are creative, rather than just enumerations.
You can have different functions to generate them, but the programmer may simply use a constant array, and in that case the constant array is the same.
Uri
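The point about different generators yielding the same constants can be sketched in Python as follows, assuming the A1=0 ... H8=63 numbering. Both function names are hypothetical and neither is taken from any engine under discussion. The two functions are written quite differently, yet the resulting 64-entry knight tables are identical.

```python
# File masks used to discard shifts that wrap around the board edge.
NOT_A = 0xFEFEFEFEFEFEFEFE
NOT_AB = 0xFCFCFCFCFCFCFCFC
NOT_H = 0x7F7F7F7F7F7F7F7F
NOT_GH = 0x3F3F3F3F3F3F3F3F

def knight_attacks_by_offsets(square: int) -> int:
    """Generate knight attacks by walking (file, rank) offsets."""
    file, rank = square % 8, square // 8
    attacks = 0
    for df, dr in ((1, 2), (2, 1), (2, -1), (1, -2),
                   (-1, -2), (-2, -1), (-2, 1), (-1, 2)):
        f, r = file + df, rank + dr
        if 0 <= f < 8 and 0 <= r < 8:
            attacks |= 1 << (r * 8 + f)
    return attacks

def knight_attacks_by_shifts(square: int) -> int:
    """Generate the same attacks with shifts and 64-bit file masks."""
    b = 1 << square
    return (((b << 17) & NOT_A) | ((b << 15) & NOT_H) |
            ((b << 10) & NOT_AB) | ((b << 6) & NOT_GH) |
            ((b >> 17) & NOT_H) | ((b >> 15) & NOT_A) |
            ((b >> 10) & NOT_GH) | ((b >> 6) & NOT_AB))

TABLE_A = [knight_attacks_by_offsets(sq) for sq in range(64)]
TABLE_B = [knight_attacks_by_shifts(sq) for sq in range(64)]
```

Comparing `TABLE_A` and `TABLE_B` shows that identical constant arrays say little about how, or by whom, they were produced. This covers only enumerated constants, not search or evaluation code.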
And nobody cares about that kind of similarity. Everyone uses 1, 2, 3, etc as well. We are looking one level higher than that.
My point is that we do not know what the common parts between Rybka and Fruit are, and it is possible that there are some common parts of Strelka and Fruit that are not in Rybka.
The bitboard example was only meant to show that the number of similar lines proves nothing.
I also think that you can expect greater similarity between good programmers, because there are less ways to write a strong chess program than there are ways to write a weak chess program, especially when programmers start from some known ideas and do not start from nothing.
Uri
I'd like to see the discussion remain "on track". The question being discussed is "is Rybka derived from Fruit?" If there is any part of the code from Fruit in Rybka, the answer is "yes". Vas added Strelka to the mix by his claim that Strelka was identical to the first version of Rybka.
As far as the "less ways" goes, that is simply wrong. There are an _infinite_ number of ways to write a strong chess program without having a single identical line of code between any two examples. Having large chunks of duplicate code is not going to happen unless the programs are related, as is being discussed.
I wish we could stop this nonsense of "just a few ways". Lincoln's famous address is not very long. How many different ways could you write text to say the same thing? At least a million? And that pales in comparison to the amount of text in a chess engine's instructions. This concept is complete and utter nonsense. And anybody with _any_ significant programming experience realizes that.
It is obvious that Rybka is not identical to Strelka, because they do not generate exactly the same output, only similar output.
And who cares? Have you seen anyone say "Rybka and Fruit are identical?" Or have you seen "Rybka has a lot of similarities with Fruit and might have been derived from it?" Nobody says they have identical output, play identical moves or anything else. Again, the discussion gets side-tracked from the original point.
Vas said:
"Strelka contains Rybka code. Whether Strelka also contains Fruit code, I don't know and don't really care."
See the following link for the quote of Vas's words:
http://rybkaforum.net/cgi-bin/rybkaforu ... 2#pid99683
It means that you need to prove identical code between Fruit and Rybka without relying on Strelka.
"Identical blocks of code". That is enough to violate GPL. But even then your statement is false. If Strelka contains Rybka code, and Strelka contains the _same_ Fruit code, then once again, the GPL has been violated. I have never seen the likes of trying to twist every last word to have some different meaning than the original author intended.
I think that it was not proved, and I do not agree that any identical code proves that Vas is guilty. You cannot decide based only on the size of the identical code; you need to look at the relevant lines.
No you don't. Fortunately, you don't get to define the terms of the GPL. They are already set in stone, having been drawn up by lawyers and tested in the courts. It doesn't matter _what_ was copied. If anything is copied, it violates the GPL. Do I care about that? Not really. I am more interested in the parts that play chess. But Christophe was pointedly talking about GPL and a potential violation. Any copied code satisfies that condition.
I agree that you cannot expect 2 good programs to be the same, but it is not illogical to expect more similarities between good original programs than between bad original programs.
So? The probability of multiple identical blocks of code in two good programs is vanishingly small. Does it matter that the probability for two bad programs is even smaller? Small is small in this context.
The idea (the numbers may be different) is that there may be 10^10000 ways to write a bad chess program and 10^400 ways to write a good chess program.
A second-year computer science student can prove that there are an infinite number of ways to write a computer chess program (or any program for that matter). It is a consequence of understanding the basic building blocks of grammar such as regular expressions.
My opinion is that programs that are the same or almost the same are not likely to appear, but programs that have a few components that are the same or almost the same are more likely to appear among good programs.
Uri
OK. The probability of two bad programs having the same blocks of code is (say) one in 100,000,000,000. The probability of two good programs having the same blocks of code is one in 10,000,000,000. Happy now? Both are essentially a probability of zero.