Similarity Report 2019

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

chrisw
Posts: 2186
Joined: Tue Apr 03, 2012 2:28 pm

Re: Similarity Report 2019

Post by chrisw » Mon Sep 30, 2019 3:34 pm

hgm wrote:
Mon Sep 30, 2019 2:11 pm
Ovyron wrote:
Sun Sep 29, 2019 11:20 pm
I guess I'm just being nitpicky about the description of the subforum there, if you remove "source code", it reads "Discussion of the origination and/or derivation of computer chess programs", then a similarity report would fit there.
I admit that the addition "source code" in the description is strange, because it seems redundant. Programs always consist of source code and object code; no one programs anymore by poking hex machine code into memory. And the object code is never derived from anything other than the corresponding source code. So if you discuss where the source code is coming from, you will also be discussing where the object code (beyond the trivial compilation step) is coming from.
Seems this and the last time you (pre-emptively?) moved a post to another forum, the noise generated by the move operation dwarfs any noise there would have been if not moved. I don’t really care if you move things or not, barring the hiding/censorship that’s involved in this case, but I do care that my thoughtful posts get all mixed up with noise.

How do you propose that the Chess Programming Wiki, which usually links to, for example, my interesting posts or Ed Schroeder’s interesting posts (as it did many times in the Rybka case), should link to them when they are in the hidden Engine Origins forum? The reason they are linked to in the Rybka case is that they are all on the unhidden and open Rybka forum, btw.
This censorship process is not just censoring the present, but also the present as it will be looked back on from some point in the future. In other words, your action is affecting and distorting the historical record. Maybe that’s the point.

hgm
Posts: 23772
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Similarity Report 2019

Post by hgm » Mon Sep 30, 2019 3:59 pm

The amount of noise doesn't really matter, because it is all in the same thread. Which I created anyway to refer to the moved thread. By diverting the noise here, the original thread stays free of it.

Not sure what 'last time' you are referring to. I haven't moved any other threads recently (or perhaps at all).

bob
Posts: 20639
Joined: Mon Feb 27, 2006 6:30 pm
Location: Birmingham, AL

Re: Similarity Report 2019

Post by bob » Mon Sep 30, 2019 7:06 pm

chrisw wrote:
Mon Sep 30, 2019 8:24 am
dannyb wrote:
Sun Sep 29, 2019 8:48 pm
chrisw wrote:
Sun Sep 29, 2019 9:55 am
The data speaks for itself, both positively and negatively.
Exactly, and since the data tell about the origins of the engines, this thread belongs in the Engine Origins subforum, especially since this is not a certified tool of any kind.
I'd be intrigued to know what a "certified tool" is and where the "certifying authority" is located.

I can certify, however, that we applied the scientific method:

The Similarity Report used an entirely unbiased engine selection process and an unbiased, established EPD test suite, and produced, via a transparent and verifiable d=1 engine move selection process, a correlation matrix showing the percentages of same move selection across the 135 engines tested.

The testing data is known, the engines are known, the procedure is known, the process is repeatable and verifiable.
and no one knows exactly how to interpret the results. Someone said the line should be drawn at 60% and yet Crafty and Fruit have a 60% sim result.
The graphic presentation of Force Directed network graphs as a 'time' sequence based on similarity, plus colour coding by Elo, allows the observer to get his/her own feel for the results data, and gets away from the arbitrary drawing of lines in the sand that have been used in the past to classify as derivative or clone or whatever.

Anyone that can read the source code can see how totally unrelated they are.
The problem with source code comparison, compared to similarity comparison by results, is several fold:
1. it is an inherently experimenter-biased process.
2. It can't compare everything, so there is little big-picture comparison (which we achieve via Simex), and bias in the choice of comparator engines.
3. It is highly subjective.
4. It is heavily biased to protect experienced programmers against the less experienced. Experienced programmers using ideas from other programs will, by merit of their experience, be coding in their own style which will likely look nothing like the style of the used-idea engine, and may well find ways of incorporating the used-idea into some already coded other idea. Inexperienced programmers are more likely to be influenced by the coding structure of the place they found the idea in, and this will more likely reflect in the resulting code.
5. Simex is not interested in the actual coding of ideas; it detects (quite sensitively imo) usage of comparable ideas and the tying of those ideas together in comparable ways.
6. Conversely, it also detects the opposite process, the usage or addition of original ideas which unsurprisingly reduce similarity.
7. Finally, Simex also contains information about what engine series an engine evaluation is NOT linked/connected to. This is revealed in the big picture analysis which we are able to show in Force directed network graphs.

I've taken a look in the Engine Origins subforum and there are many threads with similarity tests, dendrograms and so on. So, such threads have always been moved there.
Wow. That is about the most ridiculous statement I have EVER read. How is it subjective? Programmers do this all the time trying to find a bug in a new version. How is it "experimenter-biased"? Comparing source code is very similar to comparing two books. Programming style has nothing to do with the c comparison. IF the person doing the comparison knows anything about coding. If he is an English major that can only compare character by character, maybe not.

Source code comparison is the ONLY method that will stand up in a court of law for copyright infringement. Shortcuts like similarity testing can be a good first approximation, maybe. Although I am still not convinced, due to the Fruit / Crafty results, as I actually compared the source code and the first two things I looked at were massively different. Maybe it shows similarity in evals, maybe it shows similarity in Elo. Who knows, since it has not been statistically analyzed to see if there are several possible similarities that give a positive, not just evaluation. But to say this is BETTER than source code comparison is simply ridiculous. And I am pretty sure you know that.
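
For readers following the method being debated, the same-move percentage matrix described in the quoted report can be sketched roughly as below. This is only an illustration under assumptions: the engine names, the move lists and the `similarity_matrix` helper are hypothetical and are not taken from the actual Simex tool, whose code is not shown in this thread.

```python
from itertools import combinations


def similarity_matrix(moves_by_engine):
    """Return {(a, b): percent} of test positions where engines a and b chose the same move.

    moves_by_engine maps an engine name to its list of chosen moves,
    one move per test position, in the same position order for every engine.
    """
    if not moves_by_engine:
        raise ValueError("no engines given")
    engines = sorted(moves_by_engine)
    n_positions = len(moves_by_engine[engines[0]])
    if n_positions == 0:
        raise ValueError("no test positions given")
    if any(len(moves_by_engine[e]) != n_positions for e in engines):
        raise ValueError("every engine needs one move per test position")

    result = {}
    for a, b in combinations(engines, 2):
        # Count the positions where both engines picked the identical move.
        same = sum(x == y for x, y in zip(moves_by_engine[a], moves_by_engine[b]))
        result[(a, b)] = 100.0 * same / n_positions
    return result


# Hypothetical data: three made-up engines on five made-up test positions.
moves = {
    "EngineA": ["e2e4", "g1f3", "d2d4", "c2c4", "e1g1"],
    "EngineB": ["e2e4", "g1f3", "d2d4", "b1c3", "a2a3"],
    "EngineC": ["d2d4", "g1f3", "c2c4", "b1c3", "h2h3"],
}
sim = similarity_matrix(moves)
for pair, percent in sorted(sim.items()):
    print(pair, percent)
```

A figure such as 60% in this sketch only says that two engines picked the same move in 60% of the test positions. It does not by itself say why they agree, which is the point being argued over in this thread.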

Ovyron
Posts: 2815
Joined: Tue Jul 03, 2007 2:30 am

Re: Similarity Report 2019

Post by Ovyron » Mon Sep 30, 2019 7:33 pm

So, this thread is talking about engine origins now, is it going to be moved to EO and a third thread be created here explaining why this one was moved?

See the problem now?
Great spirits have always encountered violent opposition from mediocre minds.

hgm
Posts: 23772
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Similarity Report 2019

Post by hgm » Mon Sep 30, 2019 7:52 pm

As far as I am concerned it is not yet going beyond discussing the reliability of various methods for determining origins. Which, as I already wrote in the lead posting, should be perfectly OK here. Specifics of the Crafty-Fruit comparison are all in EO; it is OK to refer to that here.

I do not expect this thread to turn into a Crafty vs Fruit comparison. If it seems it is going to, I still have several options to cure it:
* I can lock this thread
* I can move individual postings in it to the existing similarity thread in EO.

Rebel
Posts: 4787
Joined: Thu Aug 18, 2011 10:04 am

Re: Similarity Report 2019

Post by Rebel » Mon Sep 30, 2019 8:30 pm

As for a good laugh, in the 80's when I was new to the scene I was asked to hand over my 6502 assembler source code to the ICGA for inspection, and so I printed the whole thing and gave it to 2 ICGA officials. 10-15 minutes later I was called back and they said it was okay.
90% of coding is debugging, the other 10% is writing bugs.

Ovyron
Posts: 2815
Joined: Tue Jul 03, 2007 2:30 am

Re: Similarity Report 2019

Post by Ovyron » Tue Oct 01, 2019 1:49 am

hgm wrote:
Mon Sep 30, 2019 7:52 pm
As far as I am concerned it is not yet going beyond discussing the reliability of various methods for determining origins. Which, as I already wrote in the lead posting, should be perfectly OK here. Specifics of the Crafty-Fruit comparison are all in EO; it is OK to refer to that here.

I do not expect this thread to turn into a Crafty vs Fruit comparison. If it seems it is going to, I still have several options to cure it:
* I can lock this thread
* I can move individual postings in it to the existing similarity thread in EO.
What about the option of moving the Similarity Report 2019 thread back into this section? Are the opinions of the 2 people that reported the thread more important than everyone else?
Great spirits have always encountered violent opposition from mediocre minds.

Graham Banks
Posts: 33224
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: Similarity Report 2019

Post by Graham Banks » Tue Oct 01, 2019 2:41 am

Ovyron wrote:
Tue Oct 01, 2019 1:49 am
hgm wrote:
Mon Sep 30, 2019 7:52 pm
As far as I am concerned it is not yet going beyond discussing the reliability of various methods for determining origins. Which, as I already wrote in the lead posting, should be perfectly OK here. Specifics of the Crafty-Fruit comparison are all in EO; it is OK to refer to that here.

I do not expect this thread to turn into a Crafty vs Fruit comparison. If it seems it is going to, I still have several options to cure it:
* I can lock this thread
* I can move individual postings in it to the existing similarity thread in EO.
What about the option of moving the Similarity Report 2019 thread back into this section? Are the opinions of the 2 people that reported the thread more important than everyone else?
The moderators do a job, and as members, we're expected to abide by their decisions.
My email addresses:
gbanksnz at gmail.com
gbanksnz at yahoo.co.nz

MikeB
Posts: 3540
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: Similarity Report 2019

Post by MikeB » Tue Oct 01, 2019 2:48 am

Graham Banks wrote:
Tue Oct 01, 2019 2:41 am
Ovyron wrote:
Tue Oct 01, 2019 1:49 am
hgm wrote:
Mon Sep 30, 2019 7:52 pm
As far as I am concerned it is not yet going beyond discussing the reliability of various methods for determining origins. Which, as I already wrote in the lead posting, should be perfectly OK here. Specifics of the Crafty-Fruit comparison are all in EO; it is OK to refer to that here.

I do not expect this thread to turn into a Crafty vs Fruit comparison. If it seems it is going to, I still have several options to cure it:
* I can lock this thread
* I can move individual postings in it to the existing similarity thread in EO.
What about the option of moving the Similarity Report 2019 thread back into this section? Are the opinions of the 2 people that reported the thread more important than everyone else?
The moderators do a job, and as members, we're expected to abide by their decisions.
Amen!

Ovyron
Posts: 2815
Joined: Tue Jul 03, 2007 2:30 am

Re: Similarity Report 2019

Post by Ovyron » Tue Oct 01, 2019 4:50 am

Graham Banks wrote:
Tue Oct 01, 2019 2:41 am
The moderators do a job, and as members, we're expected to abide by their decisions.
Abide by the decisions of the people in charge, even if they're bad decisions. I guess the people to blame are the ones that elected those people to do the job, that's why democracy doesn't really work...

See, when confronted with the reality that this thread could have EO discussion on it, HGM thought about the possibility of curating it. Why didn't he think about curating the Similarity Report 2019 thread of posts talking about EO? If the main post on it wasn't about EO, then only the offending posts would need to be moved, and not the entire thread. Move it back and curate it. The first thing I did as a mod was branching off topics to The Flip Side.

If moderators are bad at their job, members should point out their mistakes so they improve, not just abide and bend to the mods' incompetency. As a moderator of Rybka Forum with 11 years of experience I can only say that the guys that moderate Talkchess look like newbies that don't know how to play with the toys they were given.
Great spirits have always encountered violent opposition from mediocre minds.

Locked