Is any competent one here?? Correct the RYBKA libels!

Discussion of chess software programming and technical issues.

Moderators: bob, hgm, Harvey Williamson

Romy
Posts: 72
Joined: Thu Mar 10, 2011 9:39 pm
Location: Bucharest (Romania)

Is any competent one here?? Correct the RYBKA libels!

Post by Romy » Thu Mar 17, 2011 12:53 am

Is anyone competent here? If so, please correct the misconceptions among the less competent and the incompetent that are fueling the campaign against Mr Rajlich. Not all programmers work only at a high level; some must have insight?

I think it began as a simple mistake, which grew into mob hysteria, so that even very intelligent, balanced, reasonable people have been drawn in.

Here follow the technical points.

It is a demonstrable fact that compilation with the best compilers is a many-to-one process, not a one-to-one process.

This means that many different (even very different!) sources can compile to the same object, because those sources are functionally congruent, which can be hard for a human to see. (Even when they are not congruent, a compiler bug can have the same effect.)

The more advanced the compiler, the more pronounced the effect.

The more complex the source, the more scope there is for this to apply.

Conclusions:

Decompilation produces a source, not the source.

Compilation is, for any substantial code chunk, irreversible.

This is because compilation is many-to-one, so potentially decompilation is one-to-many.

So widely different source chunks, put through the same compiler with the same libraries, can produce exactly identical machine-code chunks.

Does the computer-chess community disgrace itself by failing to recognise this, and permit the matter to continue further? A baboon troop turned kangaroo court will only be temporary; eventually a real court can cause a lot of trouble.

EXAMPLE FOR THOSE WHO ARE NOT COMPETENT TO UNDERSTAND.

I have five sources in C, about 75 lines each and quite complex, with no connection to chess.

Call them P,Q,R,S,T.

Sorry, I will not give them to you yet. That waits for the court.

I have a single, undisclosed but verifiable, advanced compiler, plus some extra libraries.

Three of the sources will compile to identical object code. The other two will not. This can be verified by a third party. It is enough to destroy any case against Mr Rajlich, as it proves there is no reversibility possible.

But I go even further!

I challenge a panel of human experts to examine P, Q, R, S and T separately, without compiler assistance, and identify within one day which three are congruent in output function and which are not. Each has a 1-in-10 chance of getting it right by mere fluke, so it must be a panel of three working separately. Then, at 1 in 1000, I am happy.

Thank you.

Romy
Posts: 72
Joined: Thu Mar 10, 2011 9:39 pm
Location: Bucharest (Romania)

Re: Is any competent one here?? Correct the RYBKA libels!

Post by Romy » Thu Mar 17, 2011 12:54 am

The above post is because the understanding demonstrated in
http://www.talkchess.com/forum/viewtopic.php?t=38432
in the general forum is insufficient.

bob
Posts: 20916
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Is any competent one here?? Correct the RYBKA libels!

Post by bob » Thu Mar 17, 2011 1:26 am

Romy wrote:Is anyone competent here? If so, please correct the misconceptions among the less competent and the incompetent that are fueling the campaign against Mr Rajlich. Not all programmers work only at a high level; some must have insight?

I think it began as a simple mistake, which grew into mob hysteria, so that even very intelligent, balanced, reasonable people have been drawn in.

Here follow the technical points.

It is a demonstrable fact that compilation with the best compilers is a many-to-one process, not a one-to-one process.
However, compiling from C to assembly is a many-to-one conversion as an assembly conversion, and a many-to-one conversion semantically. What is the probability that two different programmers write two different pieces of code, yet the two pieces turn out to be semantically identical? That is, that they do the same operations on the same data, so that if you run either piece of code with the same inputs, the same physical operations are performed on that data to produce the same output?

Near zero, that's the probability.

What is the chance that programmer A writes a piece of code with a really odd bug in it, a bug programmer A can explain _exactly_: why and when it was written and how it was left in by mistake; that programmer B writes a completely different source which, when compiled, produces the same semantic function as programmer A's code; and that in that code one finds the _same_ bug?

It is total bullshit to suggest that someone can't compare a high-level-language program to an assembly-language program and prove beyond any doubt that they are semantically equivalent. And two different programmers will not write two significant pieces of code that _are_ semantically equivalent. So this pig won't fly, and it is pointless to continue this particular debate.

You simply don't know what you are talking about, or else you are trying to be cute and just divert attention from the actual examination that is in progress.


This means that many different (even very different!) sources can compile to the same object, because those sources are functionally congruent, which can be hard for a human to see. (Even when they are not congruent, a compiler bug can have the same effect.)
What is the probability of two different programmers independently writing two different 100-line blocks of C code and having them produce the same optimized assembly language (and, by inference, identical semantics)?

Close enough to zero to call it so.

What about repeating this several times in different pieces of the same 50K line program? Closer to zero.

What about both writing code that looks different, but is semantically equivalent, with the _same_ bug or unnecessary code? We are now approaching 1 / infinity. Mathematicians call this "zero" exactly.


The more advanced the compiler, the more pronounced the effect.

The more complex the source, the more scope there is for this to apply.
And the lower the probability they produce the same assembly code, aka semantic equivalence. You are digging a hole too deep to climb out of here.

Conclusions:

Decompilation produces a source, not the source.

Compilation is, for any substantial code chunk, irreversible.



If it were only about the source, perhaps. But it is about semantic equivalence, which is _very_ precisely defined.


This is because compilation is many-to-one, so potentially decompilation is one-to-many.

So widely different source chunks, put through the same compiler with the same libraries, can produce exactly identical machine-code chunks.


"Theoretically possible". Practically impossible. And when you factor in identical bugs or unnecessary code in both programs, totally impossible.
But keep digging.



Does the computer-chess community disgrace itself by failing to recognise this, and permit the matter to continue further? A baboon troop turned kangaroo court will only be temporary; eventually a real court can cause a lot of trouble.

EXAMPLE FOR THOSE WHO ARE NOT COMPETENT TO UNDERSTAND.

I have five sources in C, about 75 lines each and quite complex, with no connection to chess.

Call them P,Q,R,S,T.

Sorry, I will not give them to you yet. That waits for the court.

I have a single, undisclosed but verifiable, advanced compiler, plus some extra libraries.

Three of the sources will compile to identical object code. The other two will not. This can be verified by a third party. It is enough to destroy any case against Mr Rajlich, as it proves there is no reversibility possible.



Still bullshit, but don't let me stop you. Once you dig deep enough we won't have to deal with you any longer.

But I go even further!

I challenge a panel of human experts to examine P, Q, R, S and T separately, without compiler assistance, and identify within one day which three are congruent in output function and which are not. Each has a 1-in-10 chance of getting it right by mere fluke, so it must be a panel of three working separately. Then, at 1 in 1000, I am happy.

Thank you.


Why "within one day"? More bullshit? This has been going on for a couple of years. It is not a fast process. But it is a straightforward process.

If you give me the 5 compiled versions, and enough time, I'll take the challenge by myself, and if I am successful, will you go away for all time?

Your infantile conditions (one day, only one person working on it, etc.) are lame attempts to make the task harder. We don't have just "one person", and we haven't spent just "one day". I've never seen such a stupid post in my entire life: one that pretends to be competent and offers arguments that do not address the process being carried out in any shape or fashion, yet tries to convince the casual reader that (a) you know what you are talking about; (b) our conclusions are flawed; and (c) computer scientists support your claim. The only thing a computer scientist will support is that, yes, going from asm to C is a one-to-many mapping. But going from C to asm is not. That's the flaw in your ointment. The asm expresses the semantics of the C code. It makes copying obvious to the casual observer, once it is laid out.

Romy
Posts: 72
Joined: Thu Mar 10, 2011 9:39 pm
Location: Bucharest (Romania)

Re: Is any competent one here?? Correct the RYBKA libels!

Post by Romy » Thu Mar 17, 2011 1:50 am

bob wrote:divert attention from the actual examination that is in progress.
Contaminated examination due to connections of participants?
What is the probability of two different programmers independently writing two different 100-line blocks of C code and having them produce the same optimized assembly language (and, by inference, identical semantics)? Close enough to zero to call it so.
Respectfully, that is nonsense.

It depends wholly on the brief given to the programmers. If one is writing a program to evaluate mobility and the other to count beans, then the probability is effectively zero. But if both are writing a null search (extended type 2a), the probability is hugely increased.
"Theoretically possible". Practically impossible.
Stored for use.
Once you dig deep enough we won't have to deal with you any longer.
Are censorship and banning already necessary?
Usually there are some preliminaries first, such as your losing the argument?
Why "within one day"?
Because you could hand-compile them, given a month. But I agree: I will give you a week.
If you give me the 5 compiled versions, and enough time, I'll take the challenge by myself, and if I am successful, will you go away for all time?
Again, you are not concentrating!
The 5 versions P,Q... are SOURCE, not COMPILED!

The sources will look and smell different. Even very different. Perhaps the ones that look most different will compile to identical objects!

The point was that three of them will not only produce identical output when compiled with any compliant compiler, but, if put through one special compiler, will also compile to identical object code. Your job will be to find which 3 of the 5.

And I need a panel of 3, because 1 in 10 makes a fluke possible.
The asm expresses the semantics of the C code. It makes copying obvious to the casual observer, once it is laid out.
Pardon, but you greatly underestimate compiler sophistication. The asm may be auto-optimised to the point of unrecognisability; if SMP is involved, even more so.

In the days of Cray and HiTech it was different: a compiler was little more than an assembler. But RYBKA is from 2005-6, not 1616 or 1986.

Romy
Posts: 72
Joined: Thu Mar 10, 2011 9:39 pm
Location: Bucharest (Romania)

Re: Is any competent one here?? Correct the RYBKA libels!

Post by Romy » Thu Mar 17, 2011 3:10 am

bob wrote:
Romy wrote:It is a demonstrable fact that compilation with the best compilers is a many-to-one process. Not a one-to-one process.
a computer scientist will support is that yes, going from asm to C is a 1 to many mapping.
Thank you for this admission.

It did not come out earlier among the learned commentators. Better that they see it now, while they still have access to the brake and the gears, than after the line of irreversibility is crossed.
But going from a C to ASM is not.
Well, with a given compiler and given settings, it is not. Otherwise, you are wrong.
That's the flaw in your ointment.
It is flaw or fly in someone else's balm, but it is irrelevant to mine.
The asm expresses the semantics of the C code.
Aha.
Are you unclear about the meanings of syntax and grammar as they apply to the C compilation process?

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 3:03 pm
Location: British Columbia, Canada

Re: Is any competent one here?? Correct the RYBKA libels!

Post by wgarvin » Thu Mar 17, 2011 3:29 am

Don't feed the trolls...

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 3:03 pm
Location: British Columbia, Canada

Re: Is any competent one here?? Correct the RYBKA libels!

Post by wgarvin » Thu Mar 17, 2011 4:26 am

Actually I'm bored, so I'll reply to some of the points above.
Romy wrote:It is a demonstrable fact that compilation with the best compilers is a many-to-one process, not a one-to-one process.

This means that many different (even very different!) sources can compile to the same object, because those sources are functionally congruent, which can be hard for a human to see. (Even when they are not congruent, a compiler bug can have the same effect.)
Sort of true, but irrelevant. Nobody is trying to decompile anything into some "original" version of the source code. The goal is only to compare the semantics of compiled code with open source programs to see whether obvious copying has occurred.
Romy wrote:Decompilation produces a source, not the source.
True, but irrelevant.
Romy wrote:Compilation is, for any substantial code chunk, irreversible.
This is incorrect. It's tedious, and in some cases it may be difficult, but any code produced by a compiler can (with enough effort) be turned back into source code which is semantically identical to the compiled code. I think the source code of Strelka 2.0 is a good example of this.

But anyway, if you just want to compare the semantics of a compiled program to something, decompiling it is not necessary -- you can work out the semantics of the program just by studying a disassembly of it.
Romy wrote:This is because compilation is many-to-one, so potentially decompilation is one-to-many.
True, but irrelevant.
Romy wrote:So widely different source chunks, put through the same compiler with the same libraries, can produce exactly identical machine-code chunks.
Only if those "widely different" source chunks have essentially the same run-time semantics. Now that might occur by chance for small pieces of code, or pieces of code whose inputs and outputs are constrained by functional requirements. But most code in a chess engine is not like that, and the programmer has plenty of freedom to express his creativity in how he writes it. The evaluation function, for example. The chance of two programmers independently writing their own two evaluation functions which end up performing almost exactly the same computation, and have almost the exact same set of evaluation features, computed in almost the exact same order, is practically zero.
Romy wrote:Does the computer-chess community disgrace itself by failing to recognise this, and permit the matter to continue further? A baboon troop turned kangaroo court will only be temporary; eventually a real court can cause a lot of trouble.

EXAMPLE FOR THOSE WHO ARE NOT COMPETENT TO UNDERSTAND.

I have five sources in C, about 75 lines each and quite complex, with no connection to chess.

Call them P,Q,R,S,T.

Sorry, I will not give them to you yet. That waits for the court.

I have a single, undisclosed but verifiable, advanced compiler, plus some extra libraries.

Three of the sources will compile to identical object code. The other two will not. This can be verified by a third party. It is enough to destroy any case against Mr Rajlich, as it proves there is no reversibility possible.

But I go even further!

I challenge a panel of human experts to examine P, Q, R, S and T separately, without compiler assistance, and identify within one day which three are congruent in output function and which are not. Each has a 1-in-10 chance of getting it right by mere fluke, so it must be a panel of three working separately. Then, at 1 in 1000, I am happy.

Thank you.
Nice straw man challenge thing, there. Has nothing to do with computer chess, nothing to do with the investigations.

Could we knock it over in only one day?? Or would it just be an irrelevant waste of time? Wait, don't tell me, I can guess.

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 3:03 pm
Location: British Columbia, Canada

Re: Is any competent one here?? Correct the RYBKA libels!

Post by wgarvin » Thu Mar 17, 2011 5:08 am

Romy wrote:
bob wrote:The asm expresses the semantics of the C code. It makes copying obvious to the casual observer, once it is laid out.
Pardon, but you greatly underestimate compiler sophistication. The asm may be auto-optimised to the point of unrecognisability; if SMP is involved, even more so.

In the days of Cray and HiTech it was different: a compiler was little more than an assembler. But RYBKA is from 2005-6, not 1616 or 1986.
All modern optimizing compilers are quite "sophisticated" (although the ones used to compile Rybka back in 2005-6 were not as good as they are today), but this "auto-optimised to the degree of unrecognisability" is nonsense.

Anyone who understands how compiler optimizers work and knows how to read assembly should be able to compare a short segment of source code with a short listing of assembly instructions and draw a conclusion about whether they do the same calculation or not.

The effects of most compiler optimizations are relatively simple to understand, even if the implementation inside the compiler is complicated and difficult. Every optimizing compiler folds constants and does CSE, strength reduction, inlining, and loop optimizations (peeling, unrolling, hoisting invariants, etc.). Modern ones use SSA form and do more aggressive things like partial-redundancy elimination and pointer alias analysis. It all sounds complex until you realize that the compiled program still has to compute the same results that you asked it to compute in your source code. It can move the computations around a bit, and do them in a smarter way, but generally it still has to do them.

If you write some short programs and compile them and look at the instructions the compiler actually produces, you'll get a good feel for what the compiler actually can and can't do to your code. And anyone can learn what those optimizations do without having to learn how they actually do it.

"constant folding": it replaces things like (1 + 3 + 5) with the (9) at compile time. Most compilers do this even in debug builds because it makes the compilation process faster.
"CSE": Common subexpression elimination. If it can figure out that you asked it to do the same computation twice, it will just compute it once and use the result in both places.
"strength reduction": things like (a * 4) get replaced with (a << 2) if that generates faster code. div by constant gets converted into mul by constant, etc.
"loop-invariant code motion": it finds calculations inside the loop body that would just produce the same result every time, and hoists them above the loop so they only have to be done once.
"loop induction": it can replace the variable(s) or address expression(s) that change by a fixed amount on each iteration of a loop (such as a counter, or some array being accessed inside the loop) with some other expression which is cheaper for it to compute. If you write for (int i=0; i<size; i++) and then in the body you index an array of 8-byte structures, it might use (i*8) instead. It might even re-write it like for (t = -(i*8); t != 0; t += 8) so it can take advantage of super-cheap t != 0 test. etc.
"partial redundancy elimination": if a calculation is made in a basic block which is common to more than one possible code path, and then the result is used on one of these paths but not on all of them, it might decide to move the computation so that its only performed on the paths where its needed.

Anyway, the point is: compiler output is only surprising if you don't know anything about optimizing compilers. Or, I guess, if you expect it to optimize something that seems obvious to you and it fails to do so...

Romy
Posts: 72
Joined: Thu Mar 10, 2011 9:39 pm
Location: Bucharest (Romania)

Re: Is any competent one here?? Correct the RYBKA libels!

Post by Romy » Thu Mar 17, 2011 12:46 pm

wgarvin wrote: much, three-times
Thank you.

For your concession that there is as much "evidence" of wrongdoing against Mr Rajlich as there is against the author(s) of the compiler(s) he used.

Please reply only after you have comprehended.

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 3:03 pm
Location: British Columbia, Canada

Re: Is any competent one here?? Correct the RYBKA libels!

Post by wgarvin » Thu Mar 17, 2011 2:23 pm

Romy wrote:
wgarvin wrote: much, three-times
Thank you.

For your concession that there is as much "evidence" of wrongdoing against Mr Rajlich as there is against the author(s) of the compiler(s) he used.

Please reply only after you have comprehended.
Huh? What are you talking about, and why are you putting words in my mouth? You ignored the substance of the replies in order to keep pushing your specific agenda.

You registered on the board only a week ago and you've made almost fifty posts about it already. There's a name for this: it's called "trolling".

Locked