A Common Sense Proposal to all Vas & Rybka Doubters

bob · Post by **bob** » Tue Aug 26, 2008 8:14 pm

Uri Blass wrote:
bob wrote:
Uri Blass wrote:
bob wrote:
Uri Blass wrote:
bob wrote:
geots wrote:
Dirt wrote:
geots wrote:Bob
He's not really a primary player in this at this point. It should be left to Zach and the others who are gathering evidence to present it.

After the stuff I read that he said today, I was shocked. I want to see what he says after a one-on-one with Vas, sort of a Rybka-Crafty matchup, so to speak.
What have I said that is shocking? That you will not "accidentally" produce a couple of hundred lines of identical code here, a couple of hundred lines there? That is all I have stated from the beginning. _IF_ there is duplicate code to any significant extent (not single lines, but blocks of code) then there is no way it is an "accident". If that is shocking, not much I can say. Anybody that deals with large numbers of programming assignments on a regular basis will say the same.
This is no accident but also proves nothing.
No accident because programmers do not start from scratch but start from known ideas that they read.

If the task were to write chess programs starting from no ideas that people had read, you could expect more differences. But if people learn about bitboards and learn tricks for writing a faster firstone function, then you cannot blame them for using those tricks instead of being original and writing slower code.

My FirstOne() function has been copied by many. It is simply a way of accessing the single instruction BSF. I have repeatedly said that I am not talking about a single line of code. But _blocks_ of identical lines. That is a difference. The examples shown have been 200 lines and up. That will _not_ happen "innocently".
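
For readers unfamiliar with the function being argued over, here is a minimal sketch of what a FirstOne()-style routine computes, written in Python rather than the C or assembly an engine would use. The name `first_one`, the choice to return the index of the lowest set bit (which is what the x86 BSF instruction reports), and the `-1` result for an empty bitboard are all assumptions for illustration. This is not Crafty's actual code, and Crafty's bit numbering may differ.

```python
def first_one(bitboard: int) -> int:
    """Return the index of the lowest set bit, or -1 for an empty bitboard."""
    if bitboard == 0:
        # BSF leaves its destination undefined for a zero input;
        # returning -1 is a convention chosen for this sketch only.
        return -1
    # bitboard & -bitboard isolates the lowest set bit;
    # bit_length() - 1 converts that single bit into its index.
    return (bitboard & -bitboard).bit_length() - 1
```

However it is spelled, the routine is a thin wrapper around one hardware operation, which is why two independent versions of it can hardly differ.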

firstone is only one example, so it does not seem obvious to me that a few hundred lines of equivalent code are proof of copy and paste without looking at the code (I need to look at the lines to decide).

Here is an example involving different bitboard programs. It does not apply to fruit and rybka, because fruit is not a bitboard program.

Based on looking at the code of strelka: one bitboard convention dictates many bitboards, so if one bitboard is the same, many bitboards are going to be the same.
If you use A1=0, B1=1, ..., H1=7, A2=8, ..., H8=63, you can expect the arrays of bitboards for the squares that the king controls to be the same, and that is 64 numbers (squares that the king controls from A1, squares that the king controls from B1, ...).
In writing a book, you use the letters a-z also. We are not talking about copying individual characters. Or individual numbering schemes, of which there are a finite (and small) number of alternatives (four for numbering squares using bitboards, for example). So let's get back to the topic at hand. Duplicate blocks of code in the engine. We are not talking about silly examples like an array of numbers that equate specific chess squares to specific bits. We are talking about parts of an engine, such as the search, the evaluation, communication with the outside world, etc. Simple functions that convert a square number to a two-character algebraic coordinate and such are not what is being discussed.

The same goes for the squares that the knight controls, the squares that the pawns control, bitboards that give you information about blocking squares, and so on.
All these constants can easily be hundreds of lines of code, so you get hundreds of lines that are basically the same.
Nobody is discussing such arrays of constants. This has been dismissed in discussions about things like endgame tables, where everybody is using the same exact everything. Arrays used in evaluation are a different thing because those are creative, rather than just enumerations.

You can have different functions to generate them, but the programmer may simply use a constant array, and in this case the constant array is the same.
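
A minimal sketch of the point about constant arrays, in Python rather than an engine's C. It assumes the A1=0, B1=1, ..., H8=63 numbering mentioned above; the function names and the two generation strategies are hypothetical and come from no engine in this thread. Two differently written generators are forced to produce the same 64 king-attack constants once the square numbering is fixed.

```python
NOT_FILE_A = 0xFEFEFEFEFEFEFEFE  # every square except the a-file
NOT_FILE_H = 0x7F7F7F7F7F7F7F7F  # every square except the h-file
MASK64 = 0xFFFFFFFFFFFFFFFF      # keep results within 64 bits


def king_attacks_by_offsets(square: int) -> int:
    """Walk the eight neighbouring (file, rank) offsets one by one."""
    file, rank = square % 8, square // 8
    attacks = 0
    for df in (-1, 0, 1):
        for dr in (-1, 0, 1):
            if df == 0 and dr == 0:
                continue  # the king does not attack its own square
            f, r = file + df, rank + dr
            if 0 <= f < 8 and 0 <= r < 8:
                attacks |= 1 << (r * 8 + f)
    return attacks


def king_attacks_by_shifts(square: int) -> int:
    """Use whole-board shifts with file masks to stop wrap-around."""
    king = 1 << square
    row = king | ((king << 1) & NOT_FILE_A) | ((king >> 1) & NOT_FILE_H)
    spread = (row | (row << 8) | (row >> 8)) & MASK64
    return spread & ~king


TABLE_BY_OFFSETS = [king_attacks_by_offsets(s) for s in range(64)]
TABLE_BY_SHIFTS = [king_attacks_by_shifts(s) for s in range(64)]
```

The generating code differs, yet the resulting tables are equal element by element, so identical constant tables on their own say little about shared authorship.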

Uri
And nobody cares about that kind of similarity. Everyone uses 1, 2, 3, etc as well. We are looking one level higher than that.
My point is that we do not know what the common parts between rybka and fruit are, and it is possible that there are some common parts of strelka and fruit that are not in rybka.

The example of bitboards was only to show that the number of similar lines proves nothing.

I also think that you can expect greater similarity between good programmers, because there are fewer ways to write a strong chess program than there are ways to write a weak one, especially when programmers start from known ideas and do not start from nothing.

Uri
I'd like to see the discussion remain "on track". The question being discussed is "is Rybka derived from Fruit?" If there is any part of the code from Fruit in Rybka, the answer is "yes". Vas added strelka to the mix by his claim that strelka was identical to the first version of Rybka.

As far as the "less ways" that is simply wrong. There are an _infinite_ number of ways to write a strong chess program without having a single identical line of code between any two examples. Having large chunks of duplicate code is not going to happen unless the programs are related as is being discussed.

I wish we could stop this nonsense of "just a few ways". Lincoln's famous address is not very long. How many different ways could you write text to say the same thing? At least a million? And that pales to the amount of text in a chess engine's instructions. This concept is complete and utter nonsense. And anybody with _any_ significant programming experience realizes that.
It is obvious that Rybka is not identical to strelka, because they do not generate exactly the same output, only similar output.

And who cares? Have you seen anyone say "Rybka and Fruit are identical?" Or have you seen "Rybka has a lot of similarities with Fruit and might have been derived from it?" Nobody says they have identical output, play identical moves or anything else. Again, the discussion gets side-tracked from the original point.

Vas said:

"Strelka contains Rybka code. Whether Strelka also contains Fruit code, I don't know and don't really care."

See the following link for the quote of Vas's words:
http://rybkaforum.net/cgi-bin/rybkaforu ... 2#pid99683

It means that you need to prove identical code between fruit and rybka without strelka.

"Identical blocks of code". That is enough to violate GPL. But even then your statement is false. If Strelka contains Rybka code, and Strelka contains the _same_ Fruit code, then once again, the GPL has been violated. I have never seen the likes of trying to twist every last word to have some different meaning than the original author intended.

I think that it was not proved, and I do not agree that any identical code proves that Vas is guilty. You cannot decide based only on the size of the identical code; you need to look at the relevant lines.

No you don't. Fortunately, you don't get to define the terms of the GPL. They are already set in stone, having been drawn up by lawyers and tested in the courts. It doesn't matter _what_ was copied. If anything is copied, it violates the GPL. Do I care about that? Not really. I am more interested in the parts that play chess. But Christophe was pointedly talking about GPL and a potential violation. Any copied code satisfies that condition.

I agree that you cannot expect 2 good programs to be the same, but it is not illogical to expect more similarities between good original programs than between bad original programs.

So? The probability of multiple identical blocks of code in two good programs is vanishingly small. Does it matter that the probability for two bad programs is even smaller? Small is small in this context.

The idea (numbers may be different) is that there may be 10^10000 ways to write a bad chess program and 10^400 ways to write a good chess program.

A second-year computer science student can prove that there are an infinite number of ways to write a computer chess program (or any program for that matter). It is a consequence of understanding the basic building blocks of grammar such as regular expressions.
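
A minimal sketch of the "infinite number of ways" argument, under stated assumptions: the `double` function and the `+ 0` padding are hypothetical and chosen only because they are easy to check. The padding follows a regular pattern, `( + 0)*`, so there is one textually distinct program for every natural number `n`, and all of them behave identically.

```python
def make_variant(n: int) -> str:
    """Source text for a doubling function padded with n no-op '+ 0' terms."""
    return "def double(x):\n    return x * 2" + " + 0" * n + "\n"


def run_variant(source: str, argument: int) -> int:
    """Compile one variant in a fresh namespace and call it."""
    namespace = {}
    exec(source, namespace)
    return namespace["double"](argument)


# Five of the unboundedly many variants; each is different text.
variants = [make_variant(n) for n in range(5)]
results = [run_variant(source, 21) for source in variants]
```

The sketch only shows that the count of equivalent programs is unbounded. It does not address how many of those variants a competent programmer would plausibly write, which is the point Uri raises.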

My opinion is that programs that are the same or almost the same are not likely to appear, but programs that have a few components that are the same or almost the same are more likely to appear among good programs.

Uri

OK. The probability of two bad programs having the same blocks of code is (say) one in 100,000,000,000. The probability of two good programs having the same blocks of code is one in 10,000,000,000. Happy now? Both are essentially a probability of zero.

geots · Post by **geots** » Tue Aug 26, 2008 8:16 pm

bob wrote:
chrisw wrote:
Rolf wrote:
chrisw wrote:
bob wrote:
PauloSoare wrote:Well said, Uri. And a magic word: "bitboards".

Paulo Soares
Perhaps "well said" but also "completely irrelevant". We all count using digits 0-9. We write using characters a-z and A-Z. We are not talking about copying an array that converts a square number (0-63) into an algebraic coordinate (a1-h8). We are talking about blocks of code (instructions, executable, etc) that are duplicates. All this other stuff is just nonsense.
Yes, you're talking about it and talking about it and talking about it.

Since you're talking about it so much, you must obviously have many examples of these duplicated blocks of instructional and executable code.

Where are they?
Chris, the reason the allegations from Bob are fishy is this: a scientist like him can't prove with a million code bits that something, namely his conclusions, is correct. Science means you can, in principle, disprove certain false conclusions with a single bit, but you can't prove that something is true. From that angle alone, it's all nonsense here. What Bob and his helpers would need is something to deal with from Vas's side. But he can't react to these insults, evil allegations, and false factual assumptions.
He's already thought of that one. Hence the talk "proof beyond reasonable doubt" and other legal language.

Bob therefore wants weight of evidence. Trouble is he hasn't come up with one factual evidential thing, yet. Apparently CT and Zach have got it, shown it to Bob, but not anyone else. Apparently it takes time to produce. However, this begs the question, why don't they reveal the blocks of duplicate code he says he's seen so far?
They have not shown me anything in private. Everything I have seen was shown _here_. Yes, much of it is buried in threads that are too long and contain 99% noise.

BTW the "beyond a reasonable doubt" was purely a correction to your quote. "innocent until proven guilty" is wrong. In a criminal trial, it is "innocent until proven guilty beyond a reasonable doubt." In a civil trial, it is "innocent until proven guilty by a preponderance of the evidence." It wasn't something I made up.

Jesus God, Bob- Chris presses you on exactly where the blocks of code you refer to are and the best you can come up with is: "Oh, they are in some threads somewhere buried somewhere" And we are supposed to deduce from that anything you say on this issue is credible. Jesus!

chrisw · Post by **chrisw** » Tue Aug 26, 2008 8:18 pm

bob wrote:OK. now we have three. Fruit and Rybka that are high-quality programs, and one "bare-bones this is how it is done" program. That is still less than 1%. setjmp() is a horrible approach to doing anything.

two, three, none, ten, schmen - so what?

Very weak ground.

If you think setjmp() in Fruit, Rybka, TSCP and who knows what else goes even one soupcon to establishing anything, let alone anything beyond a reasonable doubt, think again.

Zach Wegner · Post by **Zach Wegner** » Tue Aug 26, 2008 8:22 pm

chrisw wrote:setjmp() post was post #1, reproduced below.

Incorrect. Post #2.

No, I don't see it. Where does he say what you suggest, going thru line by line ... identical to here ... Rybka different ... sync again ...?

http://talkchess.com/forum/viewtopic.php?p=209411

Of course, I don't expect that this would make any difference in yours or George's ramblings.

geots · Post by **geots** » Tue Aug 26, 2008 8:33 pm

bob wrote:
Rolf wrote:
chrisw wrote:
bob wrote:
PauloSoare wrote:Well said, Uri. And a magic word: "bitboards".

Paulo Soares
Perhaps "well said" but also "completely irrelevant". We all count using digits 0-9. We write using characters a-z and A-Z. We are not talking about copying an array that converts a square number (0-63) into an algebraic coordinate (a1-h8). We are talking about blocks of code (instructions, executable, etc) that are duplicates. All this other stuff is just nonsense.
Yes, you're talking about it and talking about it and talking about it.

Since you're talking about it so much, you must obviously have many examples of these duplicated blocks of instructional and executable code.

Where are they?
Chris, the reason the allegations from Bob are fishy is this: a scientist like him can't prove with a million code bits that something, namely his conclusions, is correct. Science means you can, in principle, disprove certain false conclusions with a single bit, but you can't prove that something is true. From that angle alone, it's all nonsense here. What Bob and his helpers would need is something to deal with from Vas's side. But he can't react to these insults, evil allegations, and false factual assumptions.
Any chance you can stop trying to distort the truth and respond to what is being written? I have no "helpers". I joined the discussion when I started to read nonsense about "it is quite likely that you would find identical blocks of code here and there in a program as large as a chess engine." That is patently false. Even in specific pieces of code, such as move ordering (which Chris mentioned). I suggested we take my move ordering code, and then that someone suggest some open-source program and I will post a line-by-line comparison to see if two pieces of code that accomplish the same function have any common lines. That is easy enough to do. If fruit/strelka/rybka share common blocks of code, then it must be extremely common and most programs should exhibit the same property, correct? So name a program and let's test the hypothesis. Shoot. Name 3 open source programs. I have fruit, glaurung 1/2, arasan 9/10, gnuchess 4 and 5 (5 is probably a better choice since it is bitboard as is crafty).

Just pick one or more and let's compare. If it is that common, we must be able to find a match, right? Or is that too much "science" for you and you would prefer to hand-wave and name-call instead?

Just name the programs to compare and let's go. This is far easier than Zach's and Christophe's task, because here we have real source code, they are having to disassemble/reconstruct the C source, which is time-consuming.
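
The line-by-line comparison proposed above could be sketched roughly as follows, in Python using the standard `difflib` module. The helper names, the `min_lines` threshold, and the two toy snippets are hypothetical and taken from no engine. Exact line matching after whitespace stripping is a crude measure that renamed variables or reordered statements would defeat, so this illustrates the experiment and is not a substitute for the disassembly work mentioned in the thread.

```python
from difflib import SequenceMatcher
from typing import List


def normalize(source: str) -> List[str]:
    """Strip indentation and drop blank lines so layout does not matter."""
    return [line.strip() for line in source.splitlines() if line.strip()]


def shared_blocks(source_a: str, source_b: str, min_lines: int = 3) -> List[List[str]]:
    """Return runs of at least min_lines consecutive identical lines."""
    a, b = normalize(source_a), normalize(source_b)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [
        a[block.a:block.a + block.size]
        for block in matcher.get_matching_blocks()
        if block.size >= min_lines
    ]


# Toy inputs standing in for two engines' source files.
engine_a = """
int score = 0;
for (i = 0; i < n; i++) {
    score += value[i];
}
return score;
"""
engine_b = """
int total = 0;
for (i = 0; i < n; i++) {
    score += value[i];
}
return total;
"""

blocks = shared_blocks(engine_a, engine_b)
```

Running this over real engine sources and counting blocks of, say, 20 lines or more would be one way to test whether long identical runs are as common as claimed.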

Bob- give up and run for cover. You said similar crap with Ruffian a while back- so similar it is eerie. Graham's links were mysteriously removed overnight. I don't have to have a crystal ball to see the house collapsing around you. Best to go ahead and put a fork in you- you are done.

Graham Banks · Post by **Graham Banks** » Tue Aug 26, 2008 8:45 pm

geots wrote:Graham's links were mysteriously removed overnight.

You mean these ones which were passed on to me by somebody who's been around a while?

http://www.stmintz.com/ccc/index.php?id=253512
http://www.stmintz.com/ccc/index.php?id=253022

http://www.stmintz.com/ccc/index.php?id=242504

http://www.stmintz.com/ccc/index.php?id=257057
http://www.stmintz.com/ccc/index.php?id=322171
http://www.stmintz.com/ccc/index.php?id=273304

Can't see why they would be removed?

bob · Post by **bob** » Tue Aug 26, 2008 8:49 pm

a. Where does it show....

b. <disassembly snipped>

chrisw · Post by **chrisw** » Tue Aug 26, 2008 8:50 pm

Zach Wegner wrote:
chrisw wrote:setjmp() post was post #1, reproduced below.
Incorrect. Post #2.

No, I don't see it. Where does he say what you suggest, going thru line by line ... identical to here ... Rybka different ... sync again ...?
http://talkchess.com/forum/viewtopic.php?p=209411

Of course, I don't expect that this would make any difference in yours or George's ramblings.

Ah, ok, that post - was all mixed up in the 'collaborator' thread and nobody answered it, I think. I ignored it at the time because it was a disassembly Strelka against Fruit and I'm not interested in Strelka.

However, you say the Strelka code is identical to the Rybka code? So if we compare this Strelka section with the Fruit section, it's the same as comparing Rybka - Fruit?

First sight of the listings, they don't look very similar on Bob's zoom-in method. One is written in the old C style that I use(d), with for loops and stuff, and the other in the new style I never got round to. You appear to describe functional similarities in ordering of concepts and so on, but the code is not the same. Is it?

btw, I do try and avoid rambling, and if wrong, I fess up - just like you

geots · Post by **geots** » Tue Aug 26, 2008 8:52 pm

PauloSoare wrote:What do you think about this post of Uri, Zach?

http://www.talkchess.com/forum/viewtopi ... 56&t=23258

Paulo Soares

Paulo- I want to thank you for having the guts to come on here and take on the "old line establishment". Most don't have the backbone.

bob · Post by **bob** » Tue Aug 26, 2008 8:53 pm

geots wrote:
bob wrote:
chrisw wrote:
Rolf wrote:
chrisw wrote:
bob wrote:
PauloSoare wrote:Well said, Uri. And a magic word: "bitboards".

Paulo Soares
Perhaps "well said" but also "completely irrelevant". We all count using digits 0-9. We write using characters a-z and A-Z. We are not talking about copying an array that converts a square number (0-63) into an algebraic coordinate (a1-h8). We are talking about blocks of code (instructions, executable, etc) that are duplicates. All this other stuff is just nonsense.
Yes, you're talking about it and talking about it and talking about it.

Since you're talking about it so much, you must obviously have many examples of these duplicated blocks of instructional and executable code.

Where are they?
Chris, the reason the allegations from Bob are fishy is this: a scientist like him can't prove with a million code bits that something, namely his conclusions, is correct. Science means you can, in principle, disprove certain false conclusions with a single bit, but you can't prove that something is true. From that angle alone, it's all nonsense here. What Bob and his helpers would need is something to deal with from Vas's side. But he can't react to these insults, evil allegations, and false factual assumptions.
He's already thought of that one. Hence the talk "proof beyond reasonable doubt" and other legal language.

Bob therefore wants weight of evidence. Trouble is he hasn't come up with one factual evidential thing, yet. Apparently CT and Zach have got it, shown it to Bob, but not anyone else. Apparently it takes time to produce. However, this begs the question, why don't they reveal the blocks of duplicate code he says he's seen so far?
They have not shown me anything in private. Everything I have seen was shown _here_. Yes, much of it is buried in threads that are too long and contain 99% noise.

BTW the "beyond a reasonable doubt" was purely a correction to your quote. "innocent until proven guilty" is wrong. In a criminal trial, it is "innocent until proven guilty beyond a reasonable doubt." In a civil trial, it is "innocent until proven guilty by a preponderance of the evidence." It wasn't something I made up.

Jesus God, Bob- Chris presses you on exactly where the blocks of code you refer to are and the best you can come up with is: "Oh, they are in some threads somewhere buried somewhere" And we are supposed to deduce from that anything you say on this issue is credible. Jesus!

That is correct. If you are going to chime in, then I consider it _your_ responsibility to read the thread. Do I have to hold your hand, and walk back through all this crap a second time around? And then a third when the next person joins in? If you want to contribute, it is _your_ responsibility to know what has been said so far, otherwise this is a hopeless task.

So "jesus god" yes you should just start at the beginning, close your mouth, put down your keyboard, read each post, and then ask questions after having seen the discussions and opinions...

I didn't post the code. I'm not going back to find it for either of you. I am not going to wipe either of your butts either; there are some things you just have to do for yourself if you want them done.
