STS - List the Order of Importance

swami
Posts: 6640
Joined: Thu Mar 09, 2006 4:21 am

Re: STS - List the Order of Importance

Post by swami »

mcostalba wrote:
Edmund wrote:

       sts60 |  -2.137339   2.555934    -0.84   0.409    -7.321013    3.046336
I think the coefficients should be constrained to be positive; otherwise it means that the higher an engine scores in STS 6, the weaker it is :shock:
If we assume that value to be positive, then STS 6.0 would rank as the 6th most important test suite, with STS 3.0 being the least important, as I had nearly guessed.
noctiferus
Posts: 364
Joined: Sun Oct 04, 2009 1:27 pm
Location: Italy

Re: STS - List the Order of Importance

Post by noctiferus »

The problem is that this is a point estimate. If you look at the confidence interval, you see no reason to believe that it is not 0, or some small positive value.
let's see on this site 15-16 hours from now
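
To make the confidence-interval point concrete, here is a minimal sketch that takes the rows of Edmund's regression output as quoted in this thread and checks which 95% intervals contain zero. The function names are made up for illustration; only the numbers come from the posted table.

```python
# Rows copied from Edmund's regression output as quoted in this thread:
# (name, coefficient, lower bound of 95% CI, upper bound of 95% CI).
ROWS = [
    ("sts10", 2.272232, -3.736218, 8.280683),
    ("sts20", 3.20341, -2.913606, 9.320425),
    ("sts30", 0.4575349, -4.271063, 5.186132),
    ("sts40", 2.358127, -2.00735, 6.723604),
    ("sts50", 7.583491, 1.16947, 13.99751),
    ("sts60", -2.137339, -7.321013, 3.046336),
    ("sts70", 0.8115352, -4.568232, 6.191302),
    ("sts80", 4.939367, 0.516216, 9.362519),
]


def interval_contains_zero(ci_low, ci_high):
    """True when zero lies inside the interval, i.e. the sign is not settled."""
    return ci_low <= 0.0 <= ci_high


def distinguishable_from_zero(rows):
    """Names of the coefficients whose 95% interval excludes zero."""
    return [name for name, _coef, lo, hi in rows
            if not interval_contains_zero(lo, hi)]


if __name__ == "__main__":
    for name, coef, lo, hi in ROWS:
        verdict = "contains 0" if interval_contains_zero(lo, hi) else "excludes 0"
        print(f"{name}: coef={coef:+.3f}  95% CI=[{lo:.3f}, {hi:.3f}]  {verdict}")
```

On the posted numbers only sts50 and sts80 have intervals that exclude zero, so the negative point estimate for sts60 is compatible with a zero or small positive true value.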
Edmund
Posts: 670
Joined: Mon Dec 03, 2007 3:01 pm
Location: Barcelona, Spain

Re: STS - List the Order of Importance

Post by Edmund »

Another approach: this time some quick work with Excel.

I uploaded the diagrams to: http://yfrog.com/2m20297269gx

For each test suite I plotted a graph with a point for each score/Elo pair.
Then I added a linear trendline.

Fortunately, all the trendline slopes were positive (my last results were definitely more misleading in that sense).

The slope of each trendline should give an approximate rating of how well that test suite reflects an engine's strength.

Here are the formulas:
STS 5: y = 0.0410x - 33.542
STS 8: y = 0.0407x - 58.458
STS 6: y = 0.0365x - 29.026
STS 4: y = 0.0357x - 28.025
STS 7: y = 0.0302x - 17.95
STS 1: y = 0.0239x + 1.7925
STS 3: y = 0.0236x + 2.6652
STS 2: y = 0.0155x + 28.274
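
For anyone who wants to reproduce this outside Excel, here is a rough sketch of the same idea: an ordinary least-squares line per suite, then a ranking by slope. It assumes x is the Elo rating and y is the suite score, which is how the formulas above read. `fit_trendline` and `rank_by_slope` are names invented for this example, and the slopes in the dictionary are simply the ones posted above.

```python
def fit_trendline(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Trendline slopes exactly as posted above (score points per Elo point).
POSTED_SLOPES = {
    "STS 5": 0.0410, "STS 8": 0.0407, "STS 6": 0.0365, "STS 4": 0.0357,
    "STS 7": 0.0302, "STS 1": 0.0239, "STS 3": 0.0236, "STS 2": 0.0155,
}


def rank_by_slope(slopes):
    """Suite names ordered from steepest to flattest trendline."""
    return sorted(slopes, key=slopes.get, reverse=True)


if __name__ == "__main__":
    print(rank_by_slope(POSTED_SLOPES))
```

One caveat: a slope says how many score points go with one Elo point, not how tightly the points cluster around the line. A correlation coefficient measures the latter, so the two rankings need not agree.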
Uri Blass
Posts: 10282
Joined: Thu Mar 09, 2006 12:37 am
Location: Tel-Aviv Israel

Re: STS - List the Order of Importance

Post by Uri Blass »

swami wrote: This is the Excel file with results of nearly 50+ engines, in case anyone else is interested:

http://sites.google.com/site/strategict ... STS1-8.xls

PS: To Carlos: can you sort it by total score, make a GIF image of it, and post it here? Thanks!
Looking at the results, I see that Pharaon scored better than Bison in every one of the tests, although Bison clearly has the higher rating (almost 100 Elo difference). So your task should be to develop a test, not based on games, in which Bison scores better than Pharaon, because I do not believe that Bison is better than Pharaon only because of factors like better time management.

Uri
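
A quick way to look for more cases like Pharaon/Bison is to scan the spreadsheet for pairs where one engine outscores another in every suite but has the lower rating. The sketch below assumes the results have been loaded into two plain dictionaries. The function name and the numbers are hypothetical placeholders, not data from the STS file.

```python
from itertools import permutations


def dominated_inversions(scores, ratings):
    """Find (higher_scorer, higher_rated) pairs.

    scores maps engine name -> list of per-suite scores (same suite order for all).
    ratings maps engine name -> Elo rating.
    A pair is reported when the first engine beats the second in every suite
    yet has the lower rating.
    """
    found = []
    for a, b in permutations(scores, 2):
        wins_everywhere = all(sa > sb for sa, sb in zip(scores[a], scores[b]))
        if wins_everywhere and ratings[a] < ratings[b]:
            found.append((a, b))
    return found


# Hypothetical placeholder numbers, for illustration only.
example_scores = {
    "EngineA": [70, 65, 72],
    "EngineB": [60, 61, 66],
    "EngineC": [75, 70, 80],
}
example_ratings = {"EngineA": 2500, "EngineB": 2600, "EngineC": 2700}

if __name__ == "__main__":
    print(dominated_inversions(example_scores, example_ratings))
```

Each pair it reports is a candidate for the kind of position set suggested above, one where the higher-rated engine should come out ahead.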
swami
Posts: 6640
Joined: Thu Mar 09, 2006 4:21 am

Re: STS - List the Order of Importance

Post by swami »

Uri Blass wrote:
swami wrote: This is the Excel file with results of nearly 50+ engines, in case anyone else is interested:

http://sites.google.com/site/strategict ... STS1-8.xls

PS: To Carlos: can you sort it by total score, make a GIF image of it, and post it here? Thanks!
Looking at the results, I see that Pharaon scored better than Bison in every one of the tests, although Bison clearly has the higher rating (almost 100 Elo difference). So your task should be to develop a test, not based on games, in which Bison scores better than Pharaon, because I do not believe that Bison is better than Pharaon only because of factors like better time management.

Uri
This is one of the rare exceptions. I don't think Bison is very well suited to this time control. It may be that it constantly changes its move while searching within the 10-second span for a given position, or that it sticks with the same move for too long without looking for an alternative and evaluating whether it scores better.

As for the other engines, the STS score closely tracks the rating level. Of course, it's difficult to differentiate between a set of 10 engines when they're so close in strength. I hope that with the help of more test suites it will become easier.
swami
Posts: 6640
Joined: Thu Mar 09, 2006 4:21 am

Re: STS - List the Order of Importance

Post by swami »

Uri Blass wrote:
swami wrote: This is the Excel file with results of nearly 50+ engines, in case anyone else is interested:

http://sites.google.com/site/strategict ... STS1-8.xls

PS: To Carlos: can you sort it by total score, make a GIF image of it, and post it here? Thanks!
Looking at the results, I see that Pharaon scored better than Bison in every one of the tests, although Bison clearly has the higher rating (almost 100 Elo difference). So your task should be to develop a test, not based on games, in which Bison scores better than Pharaon, because I do not believe that Bison is better than Pharaon only because of factors like better time management.

Uri
I forgot to update Bison to 9.8.

I used Bison 9.6a for this test.

I will now test Bison 9.8 and see what the results are.
Carlos777
Posts: 1731
Joined: Sun Dec 13, 2009 6:09 pm

Re: STS - List the Order of Importance

Post by Carlos777 »

swami wrote: I forgot to update Bison to 9.8.

I used Bison 9.6a for this test.

I will now test Bison 9.8 and see what the results are.
Hi Swami,

Why not Bison 9.11?

Best,
Carlos
jwes
Posts: 778
Joined: Sat Jul 01, 2006 7:11 am

Re: STS - List the Order of Importance

Post by jwes »

swami wrote:
Edmund wrote: That's what I am getting after a linear regression:
r² = 0.6577
sqrt(MSE) = 68.912

sts10-80 are the coefficients
_cons is the constant
the result is the Elo on the CEGT scale

------------------------------------------------------------------------------
         elo |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
       sts10 |   2.272232    2.96261     0.77   0.448    -3.736218    8.280683
       sts20 |    3.20341    3.01614     1.06   0.295    -2.913606    9.320425
       sts30 |   .4575349   2.331547     0.20   0.846    -4.271063    5.186132
       sts40 |   2.358127   2.152502     1.10   0.281     -2.00735    6.723604
       sts50 |   7.583491   3.162585     2.40   0.022      1.16947    13.99751
       sts60 |  -2.137339   2.555934    -0.84   0.409    -7.321013    3.046336
       sts70 |   .8115352   2.652622     0.31   0.761    -4.568232    6.191302
       sts80 |   4.939367    2.18094     2.26   0.030      .516216    9.362519
       _cons |   1342.862   209.7266     6.40   0.000     917.5169    1768.207
------------------------------------------------------------------------------
So, according to this information, one could conclude that STS 6.0 is the least important of all the test suites, that it even has a negative coefficient, and that it is better not to do well in it? That looks a little confusing and is probably not true.

Also, STS 5.0 has the highest coefficient, followed by STS 8.0 (?!), which indicates that they are the two most important according to this regression.

So this data gives the ranking by order of importance, something like:

STS 5
STS 8
STS 2
STS 4
STS 1
STS 7
STS 3
STS 6

Not a bad try at all, since I expect all the middle ranks to be in the right place except STS 6 and 8.
I ran another simple test. I took the correlations between the STS tests and the various computer rating lists (I could not find a good way to combine the ratings, though I did not try very hard) and got these correlation coefficients:

STS    1.0   2.0   3.0   4.0   5.0   6.0   7.0   8.0   Total
WBEC   0.51  0.31  0.33  0.65  0.73  0.56  0.63  0.71  0.75
CEGT   0.53  0.40  0.51  0.45  0.60  0.48  0.47  0.59  0.72
CCRL   0.42  0.26  0.32  0.58  0.72  0.51  0.54  0.59  0.72
which would rank the tests in this order:
STS 5.0
STS 8.0
STS 4.0
STS 7.0
STS 6.0
STS 1.0
STS 3.0
STS 2.0
with STS 3.0 and STS 2.0 significantly less correlated than the others.
Would it be possible for you to upload a file that has the results for each position in each test for each engine? Then I could run some statistics to see how well each position is correlated.
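
For reference, here is a small sketch of the calculation behind a table like this: a plain Pearson correlation, plus one possible way to turn the three rows into a single ranking. Averaging the three rating lists is an assumption of this sketch; the post does not say how the order was derived. The names `pearson` and `rank_by_mean_correlation` are invented for the example, and the numbers are the ones posted above.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a constant sequence has no defined correlation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


SUITES = ["STS 1.0", "STS 2.0", "STS 3.0", "STS 4.0",
          "STS 5.0", "STS 6.0", "STS 7.0", "STS 8.0"]

# Per-suite correlation coefficients as posted above (Total column left out).
POSTED = {
    "WBEC": [0.51, 0.31, 0.33, 0.65, 0.73, 0.56, 0.63, 0.71],
    "CEGT": [0.53, 0.40, 0.51, 0.45, 0.60, 0.48, 0.47, 0.59],
    "CCRL": [0.42, 0.26, 0.32, 0.58, 0.72, 0.51, 0.54, 0.59],
}


def rank_by_mean_correlation(table, suites):
    """Order suites by their correlation averaged over all rating lists."""
    means = {
        suite: sum(row[i] for row in table.values()) / len(table)
        for i, suite in enumerate(suites)
    }
    return sorted(suites, key=means.get, reverse=True)


if __name__ == "__main__":
    print(rank_by_mean_correlation(POSTED, SUITES))
```

With the posted numbers, the simple average gives the same order as the list above. With per-position results, the same `pearson` function could be applied to each position's scores against the ratings.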
Carlos777
Posts: 1731
Joined: Sun Dec 13, 2009 6:09 pm

Re: STS - List the Order of Importance

Post by Carlos777 »

swami wrote: This is the Excel file with results of nearly 50+ engines, in case anyone else is interested:

http://sites.google.com/site/strategict ... STS1-8.xls

PS: To Carlos: can you sort it by total score, make a GIF image of it, and post it here? Thanks!
I just noticed this. I hope you can see it.

Image

Carlos.
Jouni
Posts: 3286
Joined: Wed Mar 08, 2006 8:15 pm

Re: STS - List the Order of Importance

Post by Jouni »

I have only tested top engines. One problem with STS: Naum4 scores clearly better (20-30 more) than Stockfish 1.6! So I guess the reason is that the positions were checked only(?) with R3 and N4, so you cannot use the suite to test the two very best engines, which is a pity...

Jouni