Similarity Detector Available

Graham Banks
Posts: 41473
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Similarity Detector Available

Post by Graham Banks »

Laskos wrote: Since Miguel didn't produce the clustering diagrams for my results, I am posting the dendrograms made in SPSS.

1. Several engines we know to be unrelated, and some we suspect are related. Two identical Houdinis are included to check for noise. The distance on the x-axis is what matters: empirically, a distance of more than 18-20 means unrelated, and less than 18-20 is suspicious (possibly related). Noise is about 1.0-1.2. All engines run at 100 ms, regardless of strength.

Image

2. Top engines, adjusted for strength. Houdini is at 100 ms; the rest are at longer times, adjusted for strength. The same pattern appears on the x-axis: a distance of more than 18-20 means unrelated.

Image

Kai
Hi Kai,

For the mathematically challenged, could you please explain in clear, simple terms what your charts show?

Thanks,
Graham.
gbanksnz at gmail.com
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Similarity Detector Available

Post by michiguel »

Laskos wrote: Since Miguel didn't produce the clustering diagrams for my results, I am posting the dendrograms made in SPSS.

1. Several engines we know to be unrelated, and some we suspect are related. Two identical Houdinis are included to check for noise. The distance on the x-axis is what matters: empirically, a distance of more than 18-20 means unrelated, and less than 18-20 is suspicious (possibly related). Noise is about 1.0-1.2. All engines run at 100 ms, regardless of strength.

Image

2. Top engines, adjusted for strength. Houdini is at 100 ms; the rest are at longer times, adjusted for strength. The same pattern appears on the x-axis: a distance of more than 18-20 means unrelated.

Image

Kai
I missed the post. It must have been pushed to the second page... In fact, this thread is already quite far down the list. I will get back to it (in fact, I owe Adam some analysis too).

Miguel
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Similarity Detector Available

Post by michiguel »

Graham Banks wrote:
Laskos wrote: Since Miguel didn't produce the clustering diagrams for my results, I am posting the dendrograms made in SPSS.

1. Several engines we know to be unrelated, and some we suspect are related. Two identical Houdinis are included to check for noise. The distance on the x-axis is what matters: empirically, a distance of more than 18-20 means unrelated, and less than 18-20 is suspicious (possibly related). Noise is about 1.0-1.2. All engines run at 100 ms, regardless of strength.

Image

2. Top engines, adjusted for strength. Houdini is at 100 ms; the rest are at longer times, adjusted for strength. The same pattern appears on the x-axis: a distance of more than 18-20 means unrelated.

Image

Kai
Hi Kai,

For the mathematically challenged, could you please explain in clear, simple terms what your charts show?

Thanks,
Graham.
To get a quick idea (not perfect, but useful):
Imagine it's a genealogical tree, with brothers, cousins, parents, grandparents, etc. (but only from the male point of view, i.e. showing how the last name was passed down, ignoring females and marriages). On the left you have great-grandpa, and to the right you have the descendants. Siblings are more similar than cousins, and so on.

Miguel
Graham Banks
Posts: 41473
Joined: Sun Feb 26, 2006 10:52 am
Location: Auckland, NZ

Re: Similarity Detector Available

Post by Graham Banks »

michiguel wrote: To get a quick idea (not perfect, but useful):
Imagine it's a genealogical tree, with brothers, cousins, parents, grandparents, etc. (but only from the male point of view, i.e. showing how the last name was passed down, ignoring females and marriages). On the left you have great-grandpa, and to the right you have the descendants. Siblings are more similar than cousins, and so on.

Miguel
Thanks Miguel. :)
gbanksnz at gmail.com
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Similarity Detector Available

Post by Laskos »

Graham Banks wrote:
michiguel wrote: To get a quick idea (not perfect, but useful):
Imagine it's a genealogical tree, with brothers, cousins, parents, grandparents, etc. (but only from the male point of view, i.e. showing how the last name was passed down, ignoring females and marriages). On the left you have great-grandpa, and to the right you have the descendants. Siblings are more similar than cousins, and so on.

Miguel
Thanks Miguel. :)
Graham, also, two siblings A and B

A  ----------------------
            |
            |
B  ---------
are more related than C and D

C  -------------------------------------------
                              |
                              |
D  ---------------------------
The same goes for cousins (going all the way to the first common ancestor)
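
To make the reading rule concrete, here is a minimal sketch in Python. It assumes a dendrogram stored as nested (join position, left, right) tuples; the names A to D and the numbers 9, 27 and 30 are made up for illustration and are not taken from the charts. The relatedness of two engines is read off as the x position of the first join that contains both of them.

```python
# A dendrogram as nested tuples: (join_position, left, right).
# Leaves are plain strings. All names and numbers are hypothetical.

def leaves(node):
    """Return the set of leaf names under a node."""
    if isinstance(node, str):
        return {node}
    _, left, right = node
    return leaves(left) | leaves(right)

def join_distance(node, a, b):
    """x position of the first (lowest) join containing both a and b."""
    if isinstance(node, str):
        return None
    position, left, right = node
    # If one child already contains both names, the join is further down.
    for child in (left, right):
        if not isinstance(child, str):
            members = leaves(child)
            if a in members and b in members:
                return join_distance(child, a, b)
    members = leaves(node)
    return position if a in members and b in members else None

# A and B join early (closely related); C and D join late (less related).
tree = (30.0, (9.0, "A", "B"), (27.0, "C", "D"))

print(join_distance(tree, "A", "B") < join_distance(tree, "C", "D"))  # True
```

For cousins the same function applies: A and C only meet at the outermost join, so their distance is the largest one in the tree.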

Kai
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Similarity Detector Available

Post by Laskos »

I am posting two new dendrograms, similar to the previous ones but with more engines in them.

1. Top 11 engines, adjusted for strength (100 ms base for Houdini, longer times for the others).

Image

Empirically, one can say that a distance larger than 18-19 on the x-axis (horizontal) denotes unrelated engines. Drawing a vertical line at ~18, one can speculate that out of the 11 top engines, probably only 6 are unrelated.

2. Many different engines of different, unadjusted strength (100 ms each).

Image

Here the border seems to be at ~17 on the horizontal axis. Larger than that - probably unrelated, smaller - probably related. Much smaller - probably heavily related :).

Noise is about 1.0-1.2 on the horizontal axis (for Houdini; it is smaller than that for many other engines).
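
Counting unrelated groups by drawing a vertical line can be sketched in a few lines of Python. This is only an illustration under assumptions: the engine names E1 to E4 and their distances are invented, and the grouping rule used here (two engines share a group if a chain of pairwise distances below the cut connects them, i.e. single linkage) may not be the linkage method SPSS used for the charts.

```python
def groups_below_cut(names, dist, cut):
    """Group engines connected by pairwise distances below `cut`.

    `dist` maps (name_a, name_b) pairs to a distance. This is equivalent
    to cutting a single-linkage dendrogram with a vertical line at `cut`.
    """
    parent = {n: n for n in names}

    def find(n):
        # Follow parent links to the group representative (with path halving).
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), d in dist.items():
        if d < cut:
            parent[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())

# Hypothetical distances, not measured values.
names = ["E1", "E2", "E3", "E4"]
dist = {
    ("E1", "E2"): 1.1,   # about the noise level: practically identical
    ("E1", "E3"): 12.0,
    ("E2", "E3"): 12.5,
    ("E1", "E4"): 24.0,
    ("E2", "E4"): 23.5,
    ("E3", "E4"): 26.0,
}

groups = groups_below_cut(names, dist, cut=18.0)
print(len(groups), groups)
```

With a cut at 18, the number of groups is the number of "probably unrelated" families; moving the cut to 17 or 19 shows how sensitive that count is to the choice of border.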

Does anybody know what's the matter with Naum42? It's closer to Rybka1 (Strelka sources) than Rybka4 is to Rybka3.

Kai
Don
Posts: 5106
Joined: Tue Apr 29, 2008 4:27 pm

Re: Similarity Detector Available

Post by Don »

Laskos wrote: I am posting two new dendrograms, similar to the previous ones but with more engines in them.

1. Top 11 engines, adjusted for strength (100 ms base for Houdini, longer times for the others).

Image

Empirically, one can say that a distance larger than 18-19 on the x-axis (horizontal) denotes unrelated engines. Drawing a vertical line at ~18, one can speculate that out of the 11 top engines, probably only 6 are unrelated.

2. Many different engines of different, unadjusted strength (100 ms each).

Image

Here the border seems to be at ~17 on the horizontal axis. Larger than that - probably unrelated, smaller - probably related. Much smaller - probably heavily related :).

Noise is about 1.0-1.2 on the horizontal axis (for Houdini; it is smaller than that for many other engines).

Does anybody know what's the matter with Naum42? It's closer to Rybka1 (Strelka sources) than Rybka4 is to Rybka3.

Kai
It's my understanding that Naum's author tuned the evaluation to produce the same moves as Rybka 2.2. I got this same result and was quite shocked until I heard the explanation.
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: Similarity Detector Available

Post by michiguel »

Laskos wrote: I am posting two new dendrograms, similar to the previous ones but with more engines in them.

1. Top 11 engines, adjusted for strength (100 ms base for Houdini, longer times for the others).

Image

Empirically, one can say that a distance larger than 18-19 on the x-axis (horizontal) denotes unrelated engines. Drawing a vertical line at ~18, one can speculate that out of the 11 top engines, probably only 6 are unrelated.

2. Many different engines of different, unadjusted strength (100 ms each).

Image

Here the border seems to be at ~17 on the horizontal axis. Larger than that - probably unrelated, smaller - probably related. Much smaller - probably heavily related :).

Noise is about 1.0-1.2 on the horizontal axis (for Houdini; it is smaller than that for many other engines).

Does anybody know what's the matter with Naum42? It's closer to Rybka1 (Strelka sources) than Rybka4 is to Rybka3.

Kai
Hi Kai,

Could you send the files that Don's utility produces to my gmail account (mballicora)? Those are the ones that look like this:

{engine name 1} move1 move2 move3 etc.
{engine name 2} move1 move2 move3 etc.

I have everything set up in Linux scripts and a C program to process the data, run bootstraps and plot it. I can analyze the whole thing in a minute.
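
For readers who want to try something similar, here is a rough sketch in Python (not the C program or scripts mentioned above). It assumes each line is exactly `{engine name}` followed by whitespace-separated moves, one move per test position; it also assumes similarity is measured as the percentage of positions where two engines pick the same move, and that the bootstrap resamples positions with replacement. The engine names and moves are invented.

```python
import random
import re

def parse_moves(text):
    """Map engine name -> list of moves, from '{name} m1 m2 ...' lines."""
    engines = {}
    for line in text.splitlines():
        m = re.match(r"\s*\{(.+?)\}\s*(.*)", line)
        if m:
            engines[m.group(1)] = m.group(2).split()
    return engines

def percent_match(a, b, idx=None):
    """Percentage of positions (optionally a resample) with the same move."""
    idx = list(range(min(len(a), len(b))) if idx is None else idx)
    if not idx:
        return 0.0
    same = sum(a[i] == b[i] for i in idx)
    return 100.0 * same / len(idx)

def bootstrap_match(a, b, rounds=1000, seed=0):
    """Rough 95% interval for percent_match by resampling positions."""
    rng = random.Random(seed)
    n = min(len(a), len(b))
    samples = sorted(
        percent_match(a, b, [rng.randrange(n) for _ in range(n)])
        for _ in range(rounds)
    )
    return samples[int(0.025 * rounds)], samples[int(0.975 * rounds)]

# Made-up data in the assumed format.
sample = """\
{engine one} e2e4 g1f3 d2d4 c2c4
{engine two} e2e4 g1f3 c2c4 c2c4
"""

engines = parse_moves(sample)
a, b = engines["engine one"], engines["engine two"]
print(percent_match(a, b))    # 75.0
print(bootstrap_match(a, b))  # interval is very wide with only 4 positions
```

With thousands of positions instead of four, the bootstrap interval gives a feel for how much of a difference between two pairs of engines is just noise.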

Miguel
PS: Naum's author, as Don says, told Michael Hart in a private email that he fit Naum's evaluation to Rybka 2. At this point, this similarity is unquestionable. It has been observed by 4 different observers (Don, M. Hart, Adam Hair, and you) with different sets of positions and conditions.
Laskos
Posts: 10948
Joined: Wed Jul 26, 2006 10:21 pm
Full name: Kai Laskos

Re: Similarity Detector Available

Post by Laskos »

michiguel wrote:
Hi Kai,

Could you send the files that Don's utility produces to my gmail account (mballicora)? Those are the ones that look like this:

{engine name 1} move1 move2 move3 etc.
{engine name 2} move1 move2 move3 etc.

I have everything set up in Linux scripts and a C program to process the data, run bootstraps and plot it. I can analyze the whole thing in a minute.

Miguel
PS: Naum's author, as Don says, told Michael Hart in a private email that he fit Naum's evaluation to Rybka 2. At this point, this similarity is unquestionable. It has been observed by 4 different observers (Don, M. Hart, Adam Hair, and you) with different sets of positions and conditions.
Hi Miguel, I have sent you the two *.data files from Don's utility. I am still slowly digesting the explanation about Naum; my problem is not the similarity with Rybka 2. The similarity even with Rybka 1.0beta is very high, and the Strelka sources appeared a year or two before Naum 4.2. The second thing is that if one can adjust an engine to be similar to another, then that engine must potentially be much stronger than the other one (it can be adjusted even further).

Kai
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Similarity Detector Available

Post by Dann Corbit »

I am unable to download it