What is the pair that is ~550 ELO different and has >60% matches?
Miguel
Adam Hair wrote: Yes. Though, many of the other engines I have tested are actually earlier versions of engines listed above.
My main focus at this point is to see if every group of engines that tend to choose similar moves at a higher rate includes an open source engine that preceded the other engines.

Jan Brouwer wrote: It may be interesting to include Rebel in your comparison because it is an engine with a detailed description of the ideas used on Ed's web pages, but with no access to the source code. If there happens to be an engine with a high similarity to Rebel, it may indicate a lower bound to the similarity achievable with copying code.

Adam Hair wrote: ProDeo is one of the engines that I am trying to include. Many winboard engines are proving difficult to test. But I think I know a way around the problems I am having.
If only two or more closed source engines showed a high level of similarity, it would be an interesting result that would highlight an additional limitation of this sort of comparison. Unfortunately, every such case that I have found so far also involves an older open source engine.

Question: What is the highest similarity between two engines (OS or not) for which this number is higher than the similarity between either of the two engines and any OS engine?
Jan Brouwer wrote: PS: there are methods which allow "cloning" of software where it can be proven to a certain degree that no copyright violation occurred, such as clean room design. It is possible that a (commercial) engine is developed by carefully observing the behaviour of a specific competitor without looking at its source code or without the source code even being available. This would almost by definition optimize the similarity of move selection without any copyright violation.
Could this have happened in the case of the Rybka clones?
Jan
Dann Corbit wrote: I would also like to see a correlation based upon strength. For instance, suppose that engine A is 50 Elo stronger than engine B... Then run an experiment where A has one thread and B has two threads (or some similar way to try to match impedance). Perhaps correlations are *largely* a function of strength, and perhaps not.

CRoberson wrote: It is not strength that matters in how deterministic a program is.
You sent me email sometime around 2000, give or take a year, about how deterministic NoonianChess is. While it is very deterministic, it is not very strong compared to Rybka and the like.

Dann Corbit wrote: The reason I posed that response is that I suspect super GMs will make similar moves to each other and GMs will make similar moves to each other and IMs will make similar moves to each other.
Suppose there is some very difficult and esoteric move in a position that arises in a game. Paul Morphy finds it, Capablanca finds it, Kasparov finds it. But maybe other {slightly lesser} players won't.
Consider fairly difficult test sets... The best engines get similar (good) scores. So I do not think we can rule out similarity of moves chosen as a basis of strength (at least to some degree) until it is tested.

Laskos wrote: This is a worn out discussion several months old which I had with Miguel, strange you missed it. Yes, there is a correlation with the strength, there is also a correlation of self-similarity and even plain similarity with the time control. Adam is using time control adjustments to compensate for strength differences, using the formula
Time = 10ms * 2^(Elo diff/100)
I have only two small objections:
1. 10ms for Houdini 1.5 seems to me too little. On Windows the standard C clock() function only has a resolution of 16ms.
2. 100 Elo points per doubling seems a little too much, maybe 70-80.
These things are pretty much irrelevant, the main thing is interpreting the results.
Kai

Dann Corbit wrote: I agree, and I am also not sure what will happen when you test various engines against themselves. But I think that the experiment will be very valuable as a control. If we do not run the experiment, then we have no idea how an identical engine would perform. Without that information, I don't think we have any real idea what "these engines make similar moves" even means.

Adam Hair wrote: I can't really agree with your last statement, but I have the advantage of seeing much more data from sim testing than you have. If the majority of the engines chose the same moves at 42% to 52% in pairs but some pairs chose the same moves at 60% to 70% or more and there are enough pairs to make some normality assumptions, then we can start forming an idea about what "similarity in move selections" means. It in no way proves anything about the code for each engine, but it does give information regarding the characteristics of playing style.

I have seen and analyzed data from four different people, from different sets of positions, and I have no doubt whatsoever in my mind that these types of tests give results that are in no way the product of coincidence. How to interpret the results may be up for debate, but an extremely high number of matches is not a random thing. Strength does not justify it.
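For anyone who wants to play with the numbers, here is a rough sketch of the time-scaling formula Kai quotes (Time = 10ms * 2^(Elo diff/100)). It assumes that "Elo diff" means how far an engine sits below the strongest engine in the set, so the strongest one gets the 10 ms base; that reading, the function name, and the parameter names are my own and not taken from Adam's setup. The 16 ms figure is simply Kai's remark about clock() resolution on Windows.

```python
# Sketch of the quoted formula: Time = 10ms * 2^(Elo diff / 100).
# Assumption: elo_diff is the Elo gap below the strongest engine tested.

CLOCK_RESOLUTION_MS = 16.0  # resolution of clock() on Windows, per Kai's post


def scaled_time_ms(elo_diff, base_ms=10.0, elo_per_doubling=100.0):
    """Return the time per position (ms) for an engine elo_diff Elo weaker."""
    return base_ms * 2.0 ** (elo_diff / elo_per_doubling)


def below_clock_resolution(time_ms, resolution_ms=CLOCK_RESOLUTION_MS):
    """True if the allotted time is shorter than one clock tick."""
    return time_ms < resolution_ms


if __name__ == "__main__":
    # Compare the quoted 100 Elo per doubling with the 75 Kai suggests.
    for diff in (0, 100, 300, 550):
        t100 = scaled_time_ms(diff)
        t75 = scaled_time_ms(diff, elo_per_doubling=75.0)
        flag = " (below clock resolution)" if below_clock_resolution(t100) else ""
        print(f"Elo diff {diff:4d}: {t100:8.1f} ms at 100/doubling{flag}, "
              f"{t75:8.1f} ms at 75/doubling")
```

With the defaults, a 100 Elo gap doubles the time to 20 ms and a 300 Elo gap gives 80 ms; lowering elo_per_doubling to 70-80 makes the time grow faster for the same gap.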
michiguel wrote: What is the pair that is ~550 ELO different and has >60% matches?
Miguel

That would be Philou 3.5.1 and Stockfish.
Dann Corbit wrote: Have you done a Toga versus Toga similarity test?
The reason I ask is that Fruit versus Fruit gives extremely high similarities (95% IIRC) and I wonder if a definite Fruit clone like Toga has a highly similar profile.

I will do that this weekend. I do know that at 2 secs per position, Toga 1.41 SE had a self similarity of 98.03%.
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 1
sim version 3
------ Grapefruit 1.0 (time: 100 ms scale: 1.0) ------
98.01 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
60.88 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
60.85 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
60.69 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
60.66 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
60.60 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
60.50 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 2
sim version 3
------ Grapefruit 1.0_ (time: 100 ms scale: 1.0) ------
98.01 Grapefruit 1.0 (time: 100 ms scale: 1.0)
60.69 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
60.68 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
60.65 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
60.63 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
60.37 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
60.32 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 3
sim version 3
------ Toga II 1.3 Beta1 (time: 100 ms scale: 1.0) ------
98.57 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
79.12 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
78.62 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
64.86 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
64.65 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
60.66 Grapefruit 1.0 (time: 100 ms scale: 1.0)
60.65 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 4
sim version 3
------ Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0) ------
98.57 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
79.07 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
78.65 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
65.04 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
64.83 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
60.69 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
60.69 Grapefruit 1.0 (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 5
sim version 3
------ Toga II 1.4.1SE (time: 100 ms scale: 1.0) ------
98.15 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
65.38 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
65.10 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
64.83 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
64.65 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
60.60 Grapefruit 1.0 (time: 100 ms scale: 1.0)
60.37 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 6
sim version 3
------ Toga II 1.4.1SE_ (time: 100 ms scale: 1.0) ------
98.15 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
65.61 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
65.26 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
65.04 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
64.86 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
60.50 Grapefruit 1.0 (time: 100 ms scale: 1.0)
60.32 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 7
sim version 3
------ Toga II 3.1.2SE JA (time: 100 ms scale: 1.0) ------
98.09 Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0)
78.65 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
78.62 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
65.61 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
65.38 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
60.88 Grapefruit 1.0 (time: 100 ms scale: 1.0)
60.68 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
C:\2sec_1\Similarity_Tests(01_13_2011)>sim03w64.exe -r 8
sim version 3
------ Toga II 3.1.2SE JA_ (time: 100 ms scale: 1.0) ------
98.09 Toga II 3.1.2SE JA (time: 100 ms scale: 1.0)
79.12 Toga II 1.3 Beta1 (time: 100 ms scale: 1.0)
79.07 Toga II 1.3 Beta1_ (time: 100 ms scale: 1.0)
65.26 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
65.10 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
60.85 Grapefruit 1.0 (time: 100 ms scale: 1.0)
60.63 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
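If you want to work with these listings rather than read them by eye, a small parser is enough. The sketch below is not part of the sim tool; it only reads text in the shape pasted above (a dashed header naming the reference engine, then one "percent engine-name (time: ...)" row per comparison). The function names and regular expressions are my own assumptions about that layout. As a sanity check it also looks for pairs whose value differs between the two directions.

```python
import re

# Assumed layout, taken from the pasted output:
#   ------ <engine> (time: ...) ------      header for the reference engine
#   <percent> <engine> (time: ...)          one row per compared engine
HEADER = re.compile(r"^-+\s*(.+?)\s*\(time:.*\)\s*-+$")
ROW = re.compile(r"^(\d+(?:\.\d+)?)\s+(.+?)\s*\(time:.*\)$")


def parse_sim_output(text):
    """Return {reference_engine: {other_engine: percent}} from pasted output."""
    table = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            table[current] = {}
            continue
        row = ROW.match(line)
        if row and current is not None:
            table[current][row.group(2)] = float(row.group(1))
    return table


def asymmetric_pairs(table, tol=1e-9):
    """List (a, b) pairs where both directions are present but disagree."""
    bad = []
    for a, row in table.items():
        for b, value in row.items():
            back = table.get(b, {}).get(a)
            if back is not None and a < b and abs(back - value) > tol:
                bad.append((a, b))
    return bad


# A few lines copied from the listing above.
sample = """\
------ Grapefruit 1.0 (time: 100 ms scale: 1.0) ------
98.01 Grapefruit 1.0_ (time: 100 ms scale: 1.0)
60.60 Toga II 1.4.1SE (time: 100 ms scale: 1.0)
------ Toga II 1.4.1SE (time: 100 ms scale: 1.0) ------
98.15 Toga II 1.4.1SE_ (time: 100 ms scale: 1.0)
60.60 Grapefruit 1.0 (time: 100 ms scale: 1.0)
"""
table = parse_sim_output(sample)
print(table["Grapefruit 1.0"]["Toga II 1.4.1SE"])  # 60.6
print(asymmetric_pairs(table))                     # []
```

Prompt lines and the "sim version 3" banner match neither pattern and are skipped, so the whole listing can be pasted in unchanged.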