Need STS data

CRoberson
Posts: 2094
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Need STS data

Post by CRoberson »

Swami,

I'd like to do some analysis on the STS results you have from various programs and their ability to predict playing strength.
I don't believe it will be highly successful, because you don't test many things that are important for computer chess, such as pondering
and timing algorithms. Timing algorithms especially can make a big difference.

So, I'd like the data from as many programs as you have on the same hardware under the same conditions. Preferably, all in one file.
Format should be similar to this:
Name & version score on testset1 testset2 ........ testset7

Only raw scores are needed, not percentages.
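
A minimal sketch of how one line in that format could be read, assuming whitespace-separated fields and that the last seven tokens are the raw scores (so the name and version may contain spaces). The function name `parse_sts_line` and the `n_suites` parameter are hypothetical, chosen only for illustration.

```python
def parse_sts_line(line, n_suites=7):
    """Split 'Name & version s1 ... sN' into (name, [raw scores]).

    Assumes the final n_suites tokens are integer raw scores and
    everything before them is the engine name and version.
    """
    tokens = line.split()
    if len(tokens) < n_suites + 1:
        raise ValueError(f"expected a name plus {n_suites} scores: {line!r}")
    name = " ".join(tokens[:-n_suites])
    scores = [int(t) for t in tokens[-n_suites:]]
    return name, scores
```

Taking the scores from the end of the line avoids having to guess where the engine name stops and the version string begins.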

My planned analysis will probe for correlations between playing strength and the overall score, as well as the individual test scores and
combination scores. I hope to find which tests, or which combination of tests, are the most predictive.
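
One possible shape for that analysis, as a rough sketch only: it assumes a Pearson correlation against some external rating list (for example Elo figures), which is my choice of measure and not something fixed above. The names `pearson` and `rank_suite_combinations` and the `max_size` parameter are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no predictive information
    return cov / sqrt(var_x * var_y)


def rank_suite_combinations(scores, ratings, max_size=2):
    """Rank single suites and suite combinations by correlation with ratings.

    scores  -- one list of raw suite scores per engine
    ratings -- one rating per engine, in the same order (assumed input)
    Returns (correlation, suite_indices) pairs, best first.
    """
    n_suites = len(scores[0])
    results = []
    for size in range(1, max_size + 1):
        for combo in combinations(range(n_suites), size):
            # Combination score = sum of the raw scores of the chosen suites.
            totals = [sum(row[i] for i in combo) for row in scores]
            results.append((pearson(totals, ratings), combo))
    results.sort(reverse=True)
    return results
```

Setting `max_size` to the number of suites also covers the overall score, at the cost of trying every subset. With only a handful of engines the top-ranked combination can easily be a fluke, so more engines make the ranking more trustworthy.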
swami
Posts: 6663
Joined: Thu Mar 09, 2006 4:21 am

Re: Need STS data

Post by swami »

Hi Charles,

I will certainly run the tests, collect the results, and post them on a website. But I can only start testing on the 21st of December. Right now, I have lots of other things keeping me busy. I have about 2 weeks of holiday, from December 23rd until the first week of January. I plan to test as many engines as possible (>250 or so).

The goal is to determine an engine's knowledge of specific strategic themes. Surprisingly, the results so far seem to correlate closely with the engines' actual playing strength.

Not sure what you meant by timing algorithms; are they implemented within an engine? I'm also interested in the statistics and distribution, and in your perspective on the data. Dann and I are currently working on the 8th version, called "Advancement of King Side Pawn Cover". We will probably release it in a month's time.

I will certainly send you the data in the first week of January, if that's not too late.

I will make the complete data publicly available, and will inform and remind you via email.

Regards,
Swami
swami
Posts: 6663
Joined: Thu Mar 09, 2006 4:21 am

Re: Need STS data

Post by swami »

BTW, in case you want data on a very limited number of engines, I have done the tests on Division 2 and Division 3 for STS v1 to STS v6.

You can get the Excel file with the results for the Division 2 engines here:

http://sites.google.com/site/strategict ... ects=0&d=1

Also, the total results for the Division 3 engines can be found here:

http://www.talkchess.com/forum/viewtopi ... 5&start=10

The results for the latest updates of various engines can be found by searching this forum for the keyword "sts".
Jouni
Posts: 3722
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: Need STS data

Post by Jouni »

I was lazy and tested only about 10 top engines. The best correlation with CEGT/CCRL ratings came from STS1-3. But note that, as always with test suites, the number of solutions isn't the best measure; total time used is!
One example in STS3:

              Fritz 10-128MB    Stockfish 1.5.1 JA-128MB
Total time    0:26:45           0:25:30
Solutions     83                78


Which one is better ;) ? When you use time, you also get more precision in the calculations, with 4-5 digits.

Jouni
swami
Posts: 6663
Joined: Thu Mar 09, 2006 4:21 am

Re: Need STS data

Post by swami »

Not many test suites can predict even the rough strength of chess engines.

WAC, WSAC, PET, LAPUCE and the like are popular tactical test sets, but as you can see, many engines score nearly 290/300 in these suites,
so they only establish an engine's working knowledge of basic tactics.
They can't be used for comparison purposes.
They can't be used to determine the Elo strength of an engine.


Factors:

= STS tests only an engine's strategic strength, not its tactical strength. Therefore you can't compare the results with standard Elo.
= 7 STS suites are not enough. We need 25-30 STS suites for more accuracy.

Within the next 2 to 3 years, we will have about 25-30 STS test suites. An engine's score out of 3000 positions will then give you a near-accurate measure of its strength _in strategy_, I believe.
CRoberson
Posts: 2094
Joined: Mon Mar 13, 2006 2:31 am
Location: North Carolina, USA

Re: Need STS data

Post by CRoberson »

Here is what I meant when talking about timing algorithms and such.

To get a perfect test that correlates with playing strength, one must test every feature of a chess program that is used during
a match. Test suites do test the search and eval, but they don't test the quality of things like the timing algorithms. Without testing
everything, you can't guarantee an accurate correlation. Thus, because a suite ignores timing, it could be used to catch a few clones
that copy everything and then purposely botch the timing algorithm to hide the cloning: such a clone would still score like the original.