Similarity tests

Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Similarity tests

Post by Adam Hair »

Similarity percentages that low are as unexpected as high percentages. In the case of Delfi, there is an explanation.

1412994602.609 GUI->Adapter: ucinewgame
1412994602.609 Adapter->Engine: ucinewgame
1412994602.609 GUI->Adapter: isready
1412994602.609 Adapter->Engine: isready
1412994602.640 Engine->Adapter: readyok
1412994602.640 Adapter->GUI: readyok
1412994602.640 GUI->Adapter: position startpos moves d2d4 g8f6 g1f3 g7g6 g2g3 f8g7 f1g2 e8g8 e1g1 d7d6 f1e1 b8c6 e2e4 e7e5 c2c3 c8d7
1412994602.640 Adapter->Engine: position startpos moves d2d4 g8f6 g1f3 g7g6 g2g3 f8g7 f1g2 e8g8 e1g1 d7d6 f1e1 b8c6 e2e4 e7e5 c2c3 c8d7
1412994602.640 GUI->Adapter: go depth 50
1412994602.640 Adapter->Engine: go depth 50
1412994602.671 Engine->Adapter: bestmove b1a3
1412994602.671 Adapter->GUI: bestmove b1a3
1412994603.546 GUI->Adapter: stop
1412994603.546 Adapter->Engine: stop
When commanded to search to a finite depth, Delfi 5.4 sends its best move almost immediately (after a 30 to 50 millisecond delay), without reporting any search output.

However, Delfi performs correctly (except for that delay) when commanded to "go infinite".

1412994834.140 GUI->Adapter: ucinewgame
1412994834.140 Adapter->Engine: ucinewgame
1412994834.140 GUI->Adapter: isready
1412994834.140 Adapter->Engine: isready
1412994834.171 Engine->Adapter: readyok
1412994834.171 Adapter->GUI: readyok
1412994834.171 GUI->Adapter: position startpos moves g1f3 g8f6 c2c4 e7e6 b1c3 d7d5 d2d4 f8e7 c1g5 e8g8 e2e3 h7h6 g5h4
1412994834.171 Adapter->Engine: position startpos moves g1f3 g8f6 c2c4 e7e6 b1c3 d7d5 d2d4 f8e7 c1g5 e8g8 e2e3 h7h6 g5h4
1412994834.171 GUI->Adapter: go infinite
1412994834.171 Adapter->Engine: go infinite
1412994834.203 Engine->Adapter: info depth 2 score cp 27 time 0 nodes 946 pv b8c6 f1d3 d5c4 d3c4
1412994834.203 Adapter->GUI: info depth 2 score cp 27 time 0 nodes 946 pv b8c6 f1d3 d5c4 d3c4
1412994834.234 Engine->Adapter: info depth 3 score cp 26 time 0 nodes 1913 pv b8c6 d1b3 d5c4 f1c4
1412994834.234 Adapter->GUI: info depth 3 score cp 26 time 0 nodes 1913 pv b8c6 d1b3 d5c4 f1c4
1412994834.234 Engine->Adapter: info depth 4 score cp 26 time 0 nodes 3873 pv b8c6 d1b3 d5c4 f1c4
1412994834.234 Adapter->GUI: info depth 4 score cp 26 time 0 nodes 3873 pv b8c6 d1b3 d5c4 f1c4
1412994834.234 Engine->Adapter: info depth 5 score cp 24 time 0 nodes 14520 pv b8c6 f1d3 d5c4 d3c4 c8d7
1412994834.234 Adapter->GUI: info depth 5 score cp 24 time 0 nodes 14520 pv b8c6 f1d3 d5c4 d3c4 c8d7
1412994834.234 Engine->Adapter: info depth 6 score cp 4 time 16 nodes 31955 pv b8c6 f1d3 d5c4 d3c4 c6a5 c4d3
1412994834.234 Adapter->GUI: info depth 6 score cp 4 time 16 nodes 31955 pv b8c6 f1d3 d5c4 d3c4 c6a5 c4d3
1412994834.296 Engine->Adapter: info depth 7 score cp 5 time 94 nodes 126099 pv b8c6 f1e2 c6a5 c4d5 f6d5 h4e7 d8e7
1412994834.296 Adapter->GUI: info depth 7 score cp 5 time 94 nodes 126099 pv b8c6 f1e2 c6a5 c4d5 f6d5 h4e7 d8e7
1412994834.421 Engine->Adapter: info depth 8 score cp -13 time 219 nodes 304459 pv b8c6 f1e2 c6a5 c4d5 e6d5 e1g1 a5c4 h4f6 e7f6 e2c4 d5c4
1412994834.421 Adapter->GUI: info depth 8 score cp -13 time 219 nodes 304459 pv b8c6 f1e2 c6a5 c4d5 e6d5 e1g1 a5c4 h4f6 e7f6 e2c4 d5c4
1412994834.718 Engine->Adapter: info depth 9 score cp -14 time 516 nodes 706027 pv b8c6 f1d3 d5c4 d3c4 c6a5 c4d3 f6d5 h4e7 d8e7
1412994834.718 Adapter->GUI: info depth 9 score cp -14 time 516 nodes 706027 pv b8c6 f1d3 d5c4 d3c4 c6a5 c4d3 f6d5 h4e7 d8e7
1412994835.109 GUI->Adapter: stop
1412994835.109 Adapter->Engine: stop
1412994835.109 Engine->Adapter: bestmove b8c6 ponder f1d3
1412994835.109 Adapter->GUI: bestmove b8c6 ponder f1d3
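
The difference between the two logs above can also be checked mechanically. The following is only a rough sketch in Python, assuming the adapter log keeps the "timestamp Source->Destination: message" layout shown above; the function name summarize_search is made up for illustration and is not part of Polyglot or the sim tool. It reports which "go" command was sent, how many "info" lines the engine produced before "bestmove", and how long the engine took to answer.

```python
import re

# One adapter log line looks like:
#   1412994602.640 Adapter->Engine: go depth 50
LINE_RE = re.compile(r"^(\d+\.\d+) (\w+)->(\w+): (.*)$")


def summarize_search(log_text):
    """Summarize the first search found in an adapter log.

    Returns (go_command, info_count, seconds_to_bestmove).
    seconds_to_bestmove is None if no bestmove follows the go command.
    """
    go_cmd = None
    go_time = None
    info_count = 0
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip anything that is not a log line
        stamp = float(match.group(1))
        src, dst, msg = match.group(2), match.group(3), match.group(4)
        if src == "Adapter" and dst == "Engine" and msg.startswith("go"):
            # The search starts when the adapter forwards "go ..." to the engine.
            go_cmd, go_time, info_count = msg, stamp, 0
        elif src == "Engine" and go_cmd is not None:
            if msg.startswith("info "):
                info_count += 1
            elif msg.startswith("bestmove"):
                return go_cmd, info_count, round(stamp - go_time, 3)
    return go_cmd, info_count, None
```

A search that returns "bestmove" after a few hundredths of a second with zero "info" lines, as in the "go depth 50" log, is a sign that the engine did not really search the position.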
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Similarity tests

Post by Sedat Canbaz »

Thank you for the useful test, dear Adam!

Interesting to note:
A few months ago I could not run the sim test with Delfi 5.4, but this time I managed...
I can't remember whether I included all of its files or not... maybe that was the reason?!

Best,
Sedat
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Similarity tests

Post by Adam Hair »

Sedat Canbaz wrote: Thank you for the useful test, dear Adam!

Interesting to note:
A few months ago I could not run the sim test with Delfi 5.4, but this time I managed...
I can't remember whether I included all of its files or not... maybe that was the reason?!

Best,
Sedat
Delfi will not run without its .dat and .ini files. But the cause of the low similarity percentages is that Delfi does not respond correctly to the command "go depth 50". For Delfi, the sim tool has to be modified to send the command "go infinite" instead.

Here is a comparison of Delfi with the other engines I have tested so far:

sim version 3
------ Delfi 5.4 (time: 933 ms scale: 1.0) ------
58.22 Fruit 2.1 (time: 937 ms scale: 1.0)
51.25 Naum 2.0 (time: 854 ms scale: 1.0)
51.17 Delphil 3.2 (time: 3880 ms scale: 1.0)
51.15 cheng4 0.36c (time: 376 ms scale: 1.0)
51.07 Ktulu 8 (time: 651 ms scale: 1.0)
50.95 Rybka 3 1-cpu (time: 77 ms scale: 1.0)
50.22 Ares 1.005-64W (time: 3726 ms scale: 1.0)
49.85 Booot 5.2.0(64) (time: 205 ms scale: 1.0)
49.67 Atlas 3.70em (time: 640 ms scale: 1.0)
49.13 Andscacs 0.64n (time: 541 ms scale: 1.0)
47.23 Gaviota v1.0 (time: 251 ms scale: 1.0)
46.64 Chess Tiger 2007.1 (time: 766 ms scale: 1.0)
46.61 Chiron 2 64bit (time: 84 ms scale: 1.0)
43.37 Arminius 2014-01-18 (time: 2178 ms scale: 1.0)


The search time used by each engine is given by 20 ms * 2^((3156 - R)/120), where R is the engine's CEGT 40/4 rating and 3156 is Stockfish 5's 1-CPU CEGT rating (computed with Ordo). In other words, the search time doubles for every 120 Elo that an engine is rated below Stockfish 5. This adjustment roughly accounts for the differences in search speed and gives a better measurement of the evaluation similarities.
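
As a quick illustration of that formula, here is a small Python sketch. The function and parameter names are my own choices for this example and are not taken from the sim tool; only the constants 20 ms, 3156, and 120 come from the description above.

```python
def search_time_ms(cegt_rating, base_ms=20.0, reference_rating=3156.0,
                   elo_per_doubling=120.0):
    """Search time in milliseconds for an engine with the given CEGT 40/4 rating.

    Implements base_ms * 2^((reference_rating - cegt_rating) / elo_per_doubling).
    """
    exponent = (reference_rating - cegt_rating) / elo_per_doubling
    return base_ms * 2.0 ** exponent


if __name__ == "__main__":
    # An engine at the reference rating gets the base time; each 120 Elo
    # below the reference doubles it.
    for rating in (3156, 3036, 2916):
        print(rating, search_time_ms(rating))
```

So an engine rated exactly 3156 gets 20 ms, one rated 3036 gets 40 ms, and one rated 2916 gets 80 ms.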

I have uploaded a folder to Mediafire that contains two modified versions of the sim tool, sim03w64.exe and sim03w64_infinite.exe. Both send the command "ucinewgame" for each move. The first sends the command "go depth 50"; the second sends "go infinite". Unfortunately, some engines will not obey the "stop" command when sent "go infinite", so both versions of the tool are necessary. Also, I recommend using Polyglot (or wb2uci for WB engines) as an adapter so that logs can be saved (if necessary) and studied to see if an engine performs the test correctly. By doing this I found that Chess Tiger, Ktulu, Delfi, Booot, Ares, Andscacs, and Deuterium all perform the sim test correctly (or close to correctly) when sent "go infinite" instead of "go depth 50".

Here is the folder: http://www.mediafire.com/download/jz75x ... m03w64.rar
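
For anyone who wants to see the difference between the two versions at the protocol level, here is a minimal Python sketch of the commands sent for one test position, in the order that appears in the adapter logs earlier in this thread. This is not the sim tool's source code; the function name commands_for_position and its parameters are hypothetical. In both cases the tool waits for "readyok" after "isready" and sends "stop" once the allotted search time has passed.

```python
def commands_for_position(moves, use_infinite):
    """Return the UCI commands sent to the engine for one test position.

    moves        -- list of moves in coordinate notation, e.g. ["d2d4", "g8f6"]
    use_infinite -- True for the "go infinite" variant,
                    False for the "go depth 50" variant
    """
    position = "position startpos"
    if moves:
        position += " moves " + " ".join(moves)
    go = "go infinite" if use_infinite else "go depth 50"
    # "ucinewgame" is sent again for every position, as in the modified tools.
    return ["ucinewgame", "isready", position, go]


if __name__ == "__main__":
    for line in commands_for_position(["g1f3", "g8f6", "c2c4"], use_infinite=True):
        print(line)
```

The only difference between the two variants is the last command, which is why an engine that mishandles "go depth 50" can still be tested with the "go infinite" version.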
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Similarity tests

Post by Sedat Canbaz »

Yes, I noticed that... Delfi needs its own files...

Btw, I also noticed that your sim results are not the same as mine...

Probably because our hardware is not the same...

Another reason could be that
Delfi doesn't run properly via the sim tool

Best,
Sedat
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Similarity tests

Post by Adam Hair »

Sedat Canbaz wrote: Yes, I noticed that... Delfi needs its own files...

Btw, I also noticed that your sim results are not the same as mine...

Probably because our hardware is not the same...
The numbers differ for two reasons:

1) Our hardware is different
2) Instead of using 100 ms for each engine, I am varying the search time in relation to its Elo difference from Stockfish 5.
Sedat Canbaz wrote: Another reason could be that
Delfi doesn't run properly via the sim tool

Best,
Sedat
Delfi does run the sim test close to correctly when the command that makes it start searching is changed from "go depth 50" to "go infinite".
Ralf Müller
Posts: 127
Joined: Sat Dec 29, 2012 12:07 am

Re: Similarity tests

Post by Ralf Müller »

Could anyone provide a "similarity.data" file with results from recent engines? The last file I have is from January 2014, by Adam Hair (http://talkchess.com/forum/viewtopic.ph ... =&start=10)
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Similarity tests

Post by Adam Hair »

Ralf Müller wrote: Could anyone provide a "similarity.data" file with results from recent engines? The last file I have is from January 2014, by Adam Hair (http://talkchess.com/forum/viewtopic.ph ... =&start=10)
I am in the process of collecting new data, and I will share my results in the near future.
Ralf Müller
Posts: 127
Joined: Sat Dec 29, 2012 12:07 am

Re: Similarity tests

Post by Ralf Müller »

Great! Many thanks in advance!
Sedat Canbaz
Posts: 3018
Joined: Thu Mar 09, 2006 11:58 am
Location: Antalya/Turkey

Re: Similarity tests

Post by Sedat Canbaz »

Dear Adam,

Please let me add a few more notes regarding the sim tool.

I see the sim tool as being like a chess GUI!

It's true that some chess engines suffer under the sim tool,
but overall, for UCI engines, the sim tool is almost perfect!

Take as examples
ChessBase, Shredder Classic, Arena, or any of the other GUIs.
Each chess GUI has its own advantages and disadvantages.
I mean, some engines also do not play at full performance... I have noticed that, what about you?
On top of that, I can name one of Fritz's main advantages over the other GUIs:
*The Fritz book editing options are just great!
-I can't see any better GUI than Fritz... this should be known too!

And I am not surprised by this either:
-The world's strongest book makers are usually ctg book creators!

A simple example:
let's organize a book tournament for non-ctg users.
Then I expect the number of SCCT participants to fall dramatically.
Probably we would see 70-80% fewer book participants...

In other words,
Don's great tool is not perfect, but it is the best utility when it comes to recognizing... much better than many human experts!


Best,
Sedat
Modern Times
Posts: 3744
Joined: Thu Jun 07, 2012 11:02 pm

Re: Similarity tests

Post by Modern Times »

Sedat Canbaz wrote: -I can't see any better GUI than Fritz... this should be known too!
I never use the ChessBase GUI by choice, only if I have to. It is my least favourite GUI.