Similarity Report 2019

Discussion of anything and everything relating to chess playing software and machines.

Moderators: bob, hgm, Harvey Williamson

Harvey Williamson
Posts: 1820
Joined: Sun May 25, 2008 9:12 pm
Location: Media City, UK

Re: Similarity Report 2019

Post by Harvey Williamson » Sun Sep 29, 2019 8:15 pm

Ovyron wrote:
Sun Sep 29, 2019 7:27 pm
Then I think it should be moved back to General, EO threads should be only about source code discussions (as the description of the subforum says.)
And this thread should possibly be moved to forum help and suggestions.

I questioned the setting up of the Engine Origins forum all those years ago... Posts are either allowed by the charter or not. However, we have the EO forum now and that forum is, probably, where the thread belongs. Should the forum be closed? Maybe, but that is not a decision for the moderators. You could start a poll in the help and suggestions forum and maybe the forum admins would act on it!?

hgm
Posts: 23793
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Similarity Report 2019

Post by hgm » Sun Sep 29, 2019 8:27 pm

Ovyron wrote:
Sun Sep 29, 2019 7:27 pm
EO threads should be only about source code discussions (as the description of the subforum says.)
A ridiculous suggestion.

There cannot be much discussion about engines for which the source code is available. The whole purpose of such discussions is to expose code thieves who release executables only, pretending the code is their own.

dannyb
Posts: 60
Joined: Mon Jul 09, 2018 4:08 pm
Full name: Daniel Bennett

Re: Similarity Report 2019

Post by dannyb » Sun Sep 29, 2019 8:48 pm

chrisw wrote:
Sun Sep 29, 2019 9:55 am
The data speaks for itself, both positively and negatively.
Exactly, and since the data tell about the origins of the engines, this thread belongs in the Engine Origins subforum, especially since this is not a certified tool of any kind and no one knows exactly how to interpret the results. Someone said the line should be drawn at 60%, and yet Crafty and Fruit have a 60% sim result. Anyone that can read the source code can see how totally unrelated they are.

I've taken a look in the Engine Origins subforum and there are many threads with similarity tests, dendrograms and so on. So, such threads have always been moved there.

MikeB
Posts: 3561
Joined: Thu Mar 09, 2006 5:34 am
Location: Pen Argyl, Pennsylvania

Re: Similarity Report 2019

Post by MikeB » Sun Sep 29, 2019 8:55 pm

chrisw wrote:
Sun Sep 29, 2019 9:55 am
MikeB wrote:
Sat Sep 28, 2019 4:32 am
Ovyron wrote:
Fri Sep 27, 2019 7:51 pm
But the description of EO is:

"Discussion of the origination and/or derivation of computer chess program source code."

Was such a thing discussed in the Similarity Report 2019? Or just the similarity of the moves made, which isn't related to source code discussions?

It seems some people have made up their minds about what belongs where, even when this subforum still reads "Discussion of anything and everything relating to chess playing software and machines."

"Anything and everything" seems like a false advertisement for the last 8 years.
It's not like that - there is history here that largely goes unstated and you have to pay attention. The "Similarity Report 2019" is a euphemism for who's borrowing/stealing code from others; hence it was moved to Engine Origins. That's what this report is about.
The Similarity Report used an entirely unbiased engine selection process and an unbiased, established EPD test suite, and produced, via a transparent and verifiable engine d=1 move selection process, a correlation matrix showing the percentages of same move selection across the 135 engines tested.
The Similarity Report reports and shows the results, as data. As such it is not a euphemism for anything. The data speaks for itself, both positively and negatively.
You can call it what you want, but where it was placed speaks volumes.

Rebel
Posts: 4790
Joined: Thu Aug 18, 2011 10:04 am

Re: Similarity Report 2019

Post by Rebel » Sun Sep 29, 2019 9:34 pm

Harvey Williamson wrote:
Sun Sep 29, 2019 8:15 pm
Ovyron wrote:
Sun Sep 29, 2019 7:27 pm
Then I think it should be moved back to General, EO threads should be only about source code discussions (as the description of the subforum says.)
And this thread should possibly be moved to forum help and suggestions.

I questioned the setting up of the Engine Origins forum all those years ago... Posts are either allowed by the charter or not. However, we have the EO forum now and that forum is, probably, where the thread belongs. Should the forum be closed? Maybe, but that is not a decision for the moderators. You could start a poll in the help and suggestions forum and maybe the forum admins would act on it!?
This place has been called the Computer Chess Club (CCC) since 1997; it's about computer chess in the broadest sense of the word. What you suggest is a Computer Chess Censored forum.
90% of coding is debugging, the other 10% is writing bugs.

Graham Banks
Posts: 33254
Joined: Sun Feb 26, 2006 9:52 am
Location: Auckland, NZ

Re: Similarity Report 2019

Post by Graham Banks » Sun Sep 29, 2019 10:35 pm

Rebel wrote:
Sun Sep 29, 2019 9:34 pm
Harvey Williamson wrote:
Sun Sep 29, 2019 8:15 pm
Ovyron wrote:
Sun Sep 29, 2019 7:27 pm
Then I think it should be moved back to General, EO threads should be only about source code discussions (as the description of the subforum says.)
And this thread should possibly be moved to forum help and suggestions.

I questioned the setting up of the Engine Origins forum all those years ago... Posts are either allowed by the charter or not. However, we have the EO forum now and that forum is, probably, where the thread belongs. Should the forum be closed? Maybe, but that is not a decision for the moderators. You could start a poll in the help and suggestions forum and maybe the forum admins would act on it!?
This place has been called the Computer Chess Club (CCC) since 1997; it's about computer chess in the broadest sense of the word. What you suggest is a Computer Chess Censored forum.
It is good to have a sub-forum where engine origins or similarities can be discussed. No reason to keep it hidden from non-members though.
My email addresses:
gbanksnz at gmail.com
gbanksnz at yahoo.co.nz

Ovyron
Posts: 2837
Joined: Tue Jul 03, 2007 2:30 am

Re: Similarity Report 2019

Post by Ovyron » Sun Sep 29, 2019 11:20 pm

I guess I'm just being nitpicky about the description of the subforum there, if you remove "source code", it reads "Discussion of the origination and/or derivation of computer chess programs", then a similarity report would fit there.

I guess discussing whether EO is needed at all, or whether some threads are getting censored (for people like me who mostly only visit General and miss them - and especially, it IS censorship for guests and Google results) would be a different topic.
Great spirits have always encountered violent opposition from mediocre minds.

chrisw
Posts: 2209
Joined: Tue Apr 03, 2012 2:28 pm

Re: Similarity Report 2019

Post by chrisw » Mon Sep 30, 2019 8:24 am

dannyb wrote:
Sun Sep 29, 2019 8:48 pm
chrisw wrote:
Sun Sep 29, 2019 9:55 am
The data speaks for itself, both positively and negatively.
Exactly, and since the data tell about the origins of the engines, this thread belongs in the Engine Origins subforum, especially since this is not a certified tool of any kind
I'd be intrigued to know what a "certified tool" is and where the "certifying authority" is located.

I can certify, however, that we applied the scientific method:

The Similarity Report used an entirely unbiased engine selection process and an unbiased, established EPD test suite, and produced, via a transparent and verifiable engine d=1 move selection process, a correlation matrix showing the percentages of same move selection across the 135 engines tested.

The testing data is known, the engines are known, the procedure is known, the process is repeatable and verifiable.
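
As a rough illustration of the kind of calculation described above, here is a minimal sketch of a "same move" percentage matrix. It is not the Simex tool or its actual code; the function name, the engine names and the move lists are all hypothetical, and it simply assumes each engine has already reported one chosen move per test position, in the same position order.

```python
from itertools import combinations


def similarity_matrix(moves_by_engine):
    """Return {(a, b): percentage of positions where a and b chose the same move}.

    moves_by_engine is a hypothetical input format: engine name -> list of
    moves, one per test position, with all lists in the same position order.
    """
    engines = sorted(moves_by_engine)
    if not engines:
        return {}
    n_positions = len(moves_by_engine[engines[0]])
    if n_positions == 0:
        raise ValueError("need at least one test position")
    for name in engines:
        if len(moves_by_engine[name]) != n_positions:
            raise ValueError(f"{name} has a different number of positions")

    # Every engine agrees with itself on all positions.
    matrix = {(name, name): 100.0 for name in engines}
    for a, b in combinations(engines, 2):
        same = sum(x == y for x, y in zip(moves_by_engine[a], moves_by_engine[b]))
        pct = 100.0 * same / n_positions
        matrix[(a, b)] = pct
        matrix[(b, a)] = pct  # the matrix is symmetric
    return matrix


# Made-up data for three imaginary engines on four positions.
sample = {
    "EngineA": ["e2e4", "g1f3", "d2d4", "c2c4"],
    "EngineB": ["e2e4", "g1f3", "d2d4", "b1c3"],
    "EngineC": ["d2d4", "g1f3", "e2e4", "c2c4"],
}

result = similarity_matrix(sample)
for (a, b), pct in sorted(result.items()):
    print(f"{a} vs {b}: {pct:.1f}%")
```

A real run would feed in the moves each engine picked on the test suite positions; the sketch only shows how the percentages in such a matrix could be derived.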
and no one knows exactly how to interpret the results. Someone said the line should be drawn at 60% and yet Crafty and Fruit have a 60% sim result.
The graphic presentation of Force Directed network graphs as a 'time' sequence based on similarity, plus colour coding by Elo, allows the observer to get his/her own feel for the results data, and gets away from the arbitrary drawing of lines in the sand that have been used in the past to classify as derivative or clone or whatever.

Anyone that can read the source code can see how totally unrelated they are.
The problems with source code comparison, compared to similarity comparison by results, are severalfold:
1. It is an inherently experimenter-biased process.
2. It can't compare everything, so there is little big-picture comparison (which we achieve via Simex), and bias in the choice of comparator engines.
3. It is highly subjective.
4. It is heavily biased to protect experienced programmers against the less experienced. Experienced programmers using ideas from other programs will, by merit of their experience, be coding in their own style which will likely look nothing like the style of the used-idea engine, and may well find ways of incorporating the used-idea into some already coded other idea. Inexperienced programmers are more likely to be influenced by the coding structure of the place they found the idea in, and this will more likely reflect in the resulting code.
5. Simex is not interested in the actual coding of ideas; it detects (quite sensitively imo) usage of comparable ideas and the tying of those ideas together in comparable ways.
6. Conversely, it also detects the opposite process, the usage or addition of original ideas which unsurprisingly reduce similarity.
7. Finally, Simex also contains information about what engine series an engine evaluation is NOT linked/connected to. This is revealed in the big picture analysis which we are able to show in Force directed network graphs.

I've taken a look in the Engine Origins subforum and there are many threads with similarity tests, dendrograms and so on. So, such threads have always been moved there.

Rebel
Posts: 4790
Joined: Thu Aug 18, 2011 10:04 am

Re: Similarity Report 2019

Post by Rebel » Mon Sep 30, 2019 8:46 am

dannyb wrote:
Sun Sep 29, 2019 8:48 pm
...
especially since this is not a certified tool of any kind and no one knows exactly how to interpret the results.
With your 7 posts here so far you might have missed a lot of history. The SIM-test has been around and in use for almost 10 years now, and its possibilities and impossibilities have been discussed in great detail.
90% of coding is debugging, the other 10% is writing bugs.

hgm
Posts: 23793
Joined: Fri Mar 10, 2006 9:06 am
Location: Amsterdam
Full name: H G Muller

Re: Similarity Report 2019

Post by hgm » Mon Sep 30, 2019 2:11 pm

Ovyron wrote:
Sun Sep 29, 2019 11:20 pm
I guess I'm just being nitpicky about the description of the subforum there, if you remove "source code", it reads "Discussion of the origination and/or derivation of computer chess programs", then a similarity report would fit there.
I admit that the addition "source code" in the description is strange, because it seems redundant. Programs always consist of source code and object code; no one programs anymore by poking hex machine code into memory. And the object code is never derived from anything other than the corresponding source code. So if you discuss where the source code is coming from, you will also be discussing where the object code (beyond the trivial compilation step) is coming from.

Locked