STS test suite and engine analysis interface


Canoike
Posts: 125
Joined: Tue Jan 17, 2012 8:08 pm

Re: STS test suite and engine analysis interface

Post by Canoike »

Thank you for your work Ferdinand. Very good tool.
I have a question about your EPD file. The themes are not in order. For instance, STS 09: Advancement of a/b/c Pawns is at the end of the file. Is this intended?

Anyway, here is the command line I use:
  • STS_Rating_v12 -f "STS1-STS15_LAN_v3.epd" -e "stockfish20150917pgo.exe" -t 1 -h 128 --proto uci --getrating
The rating is now higher. The number of physical cores is still not detected correctly under Windows XP, but this is not so important.

Intel(R) Core(TM) i5-3450S CPU @ 2.80GHz
Physical Cores: 1, Logical Cores: 4
Engine: Stockfish 190915 64 POPCNT
Hash: 128, Threads: 1, time/pos: 0.168s
Number of positions in STS1-STS15_LAN_v3.epd: 1500
Max score = 1500 x 10 = 15000
Test duration: 00h:04m:30s
Expected time to finish: 00h:04m:57s
STS rating: 3367

  STS ID   STS1   STS2   STS3   STS4   STS5   STS6   STS7   STS8   STS9  STS10  STS11  STS12  STS13  STS14  STS15    ALL
  NumPos    100    100    100    100    100    100    100    100    100    100    100    100    100    100    100   1500
 BestCnt     84     73     72     78     76     76     69     69     69     78     73     64     73     68     47   1069
   Score    866    822    846    845    824    889    787    808    780    841    817    747    820    795    675  12162
Score(%)   86.6   82.2   84.6   84.5   82.4   88.9   78.7   80.8   78.0   84.1   81.7   74.7   82.0   79.5   67.5   81.1
  Rating   3613   3417   3524   3519   3426   3715   3261   3355   3230   3502   3395   3083   3408   3297   2762   3367

:: STS ID and Titles ::
STS 01: Undermining
STS 02: Open Files and Diagonals
STS 03: Knight Outposts
STS 04: Square Vacancy
STS 05: Bishop vs Knight
STS 06: Re-Capturing
STS 07: Offer of Simplification
STS 08: Advancement of f/g/h Pawns
STS 09: Advancement of a/b/c Pawns
STS 10: Simplification
STS 11: Activity of the King
STS 12: Center Control
STS 13: Pawn Play in the Center
STS 14: Queens and Rooks to the 7th rank
STS 15: Avoid Pointless Exchange

:: Top 5 STS with high result ::
1. STS 06, 88.9%, "Re-Capturing"
2. STS 01, 86.6%, "Undermining"
3. STS 03, 84.6%, "Knight Outposts"
4. STS 04, 84.5%, "Square Vacancy"
5. STS 10, 84.1%, "Simplification"

:: Top 5 STS with low result ::
1. STS 15, 67.5%, "Avoid Pointless Exchange"
2. STS 12, 74.7%, "Center Control"
3. STS 09, 78.0%, "Advancement of a/b/c Pawns"
4. STS 07, 78.7%, "Offer of Simplification"
5. STS 14, 79.5%, "Queens and Rooks to the 7th rank"
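
For reference, the Score(%) row in the table above is consistent with points scored divided by the maximum available, at 10 points per position as stated in the "Max score = 1500 x 10 = 15000" line. A minimal sketch of that arithmetic follows; the helper name `score_percent` is made up for illustration and is not taken from the tool. The Rating row is computed by the tool itself and is not reproduced here.

```python
def score_percent(score, num_pos, max_points_per_pos=10):
    """Return the score as a percentage of the maximum possible points.

    max_points_per_pos defaults to 10, matching the 'Max score' line
    in the report above.
    """
    return 100.0 * score / (num_pos * max_points_per_pos)


# Values copied from the report above.
sts1 = round(score_percent(866, 100), 1)        # STS1 column
overall = round(score_percent(12162, 1500), 1)  # ALL column
print(sts1, overall)
```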
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS test suite and engine analysis interface

Post by Ferdy »

There was a bug in v9: it did not properly parse the correct move in the epd v2. That is why the score was lower.
The order of the epd file is fine. It can be reordered from 1 to 15, or put in any other order. The parser checks the id of every epd line and saves the score based on that id.
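
As a rough illustration of why the order does not matter, here is a minimal sketch of id-based scoring. It is not the tool's actual code: the record layout (`id "<theme>.<nnn>"` plus a `c0 "<move>=<points>, ..."` field) and the function names are assumptions made for this example.

```python
import re
from collections import defaultdict

# Assumed layout: ... id "<theme>.<nnn>"; c0 "<move>=<points>, ...";
ID_RE = re.compile(r'id\s+"([^"]+)\.(\d+)"')
C0_RE = re.compile(r'c0\s+"([^"]*)"')


def theme_of(epd_line):
    """Return the theme part of the id, e.g. 'Undermine' for 'Undermine.001'."""
    m = ID_RE.search(epd_line)
    if m is None:
        raise ValueError("no id found in: %r" % epd_line)
    return m.group(1)


def points_for(epd_line, engine_move):
    """Look up the points the c0 field awards to engine_move (0 if not listed)."""
    m = C0_RE.search(epd_line)
    if m is None:
        return 0
    for item in m.group(1).split(","):
        # rpartition keeps promotions such as 'e8=Q=10' intact.
        move, _, pts = item.strip().rpartition("=")
        if move == engine_move:
            return int(pts)
    return 0


def tally(epd_lines, engine_moves):
    """Accumulate points per theme, keyed by id, regardless of file order."""
    totals = defaultdict(int)
    for line, move in zip(epd_lines, engine_moves):
        totals[theme_of(line)] += points_for(line, move)
    return dict(totals)
```

Because each score is stored under the theme read from the line's own id, shuffling the lines (together with the matching engine moves) gives the same totals.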
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS test suite and engine analysis interface

Post by Ferdy »

tttony wrote:Big thanks!!! Excellent tool!

Do you have a site or blog? I recommend creating one to post updates there.
I have a site, but I don't keep a blog. I don't host the exe file there as it is too big.
Also, the STS itself was made by Swaminathan and Dann. I am just creating a tool that reads this kind of epd format (with some format changes) and lets engines analyze it.
pedrox
Posts: 1056
Joined: Fri Mar 10, 2006 6:07 am
Location: Basque Country (Spain)

Re: STS test suite and engine analysis interface

Post by pedrox »

Thanks for the tool. With my engine, the Elo varies greatly between the previous version and this one. Or maybe I had something broken. Now the Elo for my engine looks more like its CCRL rating.

AMD FX(tm)-6100 Six-Core Processor             
Physical Cores: 3, Logical Cores: 6
Engine: DanaSah 5.60
Hash: 128, Threads: 1, time/pos: 0.299s
Test duration: 00:08:43
Expected time to finish: 00:08:13
STS rating: 2092

  STS ID   STS1   STS2   STS3   STS4   STS5   STS6   STS7   STS8   STS9  STS10  STS11  STS12  STS13  STS14  STS15    ALL
  NumPos    100    100    100    100    100    100    100    100    100    100    100    100    100    100    100   1500
 BestCnt     51     35     53     46     61     54     30     39     31     57     42     37     56     56     18    666
   Score    581    444    608    521    650    648    381    469    394    631    515    435    624    651    313   7865
Score(%)   58.1   44.4   60.8   52.1   65.0   64.8   38.1   46.9   39.4   63.1   51.5   43.5   62.4   65.1   31.3   52.4
  Rating   2344   1734   2464   2077   2651   2642   1453   1845   1511   2567   2050   1694   2535   2656   1151   2092

AMD FX(tm)-6100 Six-Core Processor             
Physical Cores: 3, Logical Cores: 6
Engine: DanaSah 5.60
Hash: 128, Threads: 1, time/pos: 0.319s
Number of positions in STS1-STS15_LAN_v3.epd: 1500
Max score = 1500 x 10 = 15000
Test duration: 00h:09m:11s
Expected time to finish: 00h:08m:43s
STS rating: 2458

  STS ID   STS1   STS2   STS3   STS4   STS5   STS6   STS7   STS8   STS9  STS10  STS11  STS12  STS13  STS14  STS15    ALL
  NumPos    100    100    100    100    100    100    100    100    100    100    100    100    100    100    100   1500
 BestCnt     51     41     56     44     66     54     43     40     34     60     43     45     61     61     24    723
   Score    614    536    661    546    713    779    558    523    455    681    587    575    711    739    420   9098
Score(%)   61.4   53.6   66.1   54.6   71.3   77.9   55.8   52.3   45.5   68.1   58.7   57.5   71.1   73.9   42.0   60.7
  Rating   2491   2144   2700   2188   2932   3225   2242   2086   1783   2789   2371   2317   2923   3047   1627   2458

:: STS ID and Titles ::
STS 01: Undermining
STS 02: Open Files and Diagonals
STS 03: Knight Outposts
STS 04: Square Vacancy
STS 05: Bishop vs Knight
STS 06: Re-Capturing
STS 07: Offer of Simplification
STS 08: Advancement of f/g/h Pawns
STS 09: Advancement of a/b/c Pawns
STS 10: Simplification
STS 11: Activity of the King
STS 12: Center Control
STS 13: Pawn Play in the Center
STS 14: Queens and Rooks to the 7th rank
STS 15: Avoid Pointless Exchange

:: Top 5 STS with high result ::
1. STS 06, 77.9%, "Re-Capturing"
2. STS 14, 73.9%, "Queens and Rooks to the 7th rank"
3. STS 05, 71.3%, "Bishop vs Knight"
4. STS 13, 71.1%, "Pawn Play in the Center"
5. STS 10, 68.1%, "Simplification"

:: Top 5 STS with low result ::
1. STS 15, 42.0%, "Avoid Pointless Exchange"
2. STS 09, 45.5%, "Advancement of a/b/c Pawns"
3. STS 08, 52.3%, "Advancement of f/g/h Pawns"
4. STS 02, 53.6%, "Open Files and Diagonals"
5. STS 04, 54.6%, "Square Vacancy"
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS test suite and engine analysis interface

Post by Ferdy »

The sts_v9 has bugs in parsing epd_v2, so ignore its results. This is now fixed in sts_v12, which uses epd_v3. The epd_v3 has revisions on sts12: positions 1 to 23 now have more alternative moves.
The officially released sts_v3 has no problem parsing the epd included in that package.
flok

Re: STS test suite and engine analysis interface

Post by flok »

Any chance of a source release so that it can be run on a non-exe system?
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS test suite and engine analysis interface

Post by Ferdy »

flok wrote:Any chance of a source release so that it can be run on a non-exe system?
Try this v13.1, python src.

STS Rating

Tested on Python 2.7.6, 2.7.11

v13.1
1. Remove dependency of cpu_info, cpu brand is no longer displayed.

v13
1. Also display app version in summary file
2. Modify reporting of number of cores, now no more physical
and no more logical cores, just number of cores
3. Added contempt for uci engines that supports such option
https://drive.google.com/file/d/0BwAOsu ... sp=sharing

Merry Christmas :)
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: STS test suite and engine analysis interface

Post by MikeB »

I get this error message:

Mac-Pro:sts_rating_v13.1 michaelbyrne$ python2.7 /Applications/sts_rating_v13.1/sts_rating_v13.1.py -f STS.epd -e Aristides-v1.0 --proto uci -h 128 --getrating 
STS Rating v13.1

Number of cores: 24

Engine: Aristides-v1.0
Hash: 128, Threads: 1, MoveTime: 1.0s
Number of positions in STS.epd: 1500

Your bench : 3.157117s
My bench   : 2.553400s
Analysis Time to get CCRL 40/4 rating estimate : 247ms
Starting engine Aristides-v1.0 ...
id name: Aristides v1.0 64 POPCNT


Traceback (most recent call last):
  File "/Applications/sts_rating_v13.1/sts_rating_v13.1.py", line 965, in <module>
    main(sys.argv[1:])
  File "/Applications/sts_rating_v13.1/sts_rating_v13.1.py", line 959, in main
    stc, nmps, nSt, bSan, contempt)
  File "/Applications/sts_rating_v13.1/sts_rating_v13.1.py", line 434, in analyze_pos
    i = a.index("c8")
ValueError: 'c8' is not in list
Ferdy
Posts: 4840
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS test suite and engine analysis interface

Post by Ferdy »

Use STS1-STS15_LAN_v3.epd, which is included in the uploaded file app_sts_rating_v12 from here:
http://www.talkchess.com/forum/viewtopi ... 50&t=56653
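
The traceback suggests the script looks up a `c8` opcode that the plain STS.epd does not carry. A quick way to check a file before running it is sketched below. This is only an illustration: the function names are hypothetical, and the assumption that `c8` is the only extra opcode required is a guess based on the traceback, not on the script's source.

```python
def epd_opcodes(epd_line):
    """Return the set of opcodes on one EPD line.

    An EPD record is four position fields followed by operations
    separated by ';'. Semicolons inside quoted strings are not handled.
    """
    parts = epd_line.strip().split(None, 4)
    if len(parts) < 5:
        return set()
    ops = set()
    for field in parts[4].split(";"):
        field = field.strip()
        if field:
            ops.add(field.split(None, 1)[0])
    return ops


def lines_missing(epd_lines, required=("c8",)):
    """Return 1-based numbers of non-empty lines lacking a required opcode."""
    needed = set(required)
    return [n for n, line in enumerate(epd_lines, 1)
            if line.strip() and not needed <= epd_opcodes(line)]
```

Running `lines_missing(open("STS.epd").read().splitlines())` on a file without the extra opcode would list every line, which points to the wrong EPD file rather than a bug in the engine.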
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: STS test suite and engine analysis interface

Post by MikeB »

It gets through all 1500 positions, then this:

Mac-Pro:sts_rating_v13.1 michaelbyrne$ python2.7 /Applications/sts_rating_v13.1/sts_rating_v13.1.py -f STS1-STS15_LAN_v3.epd  -e Aristides-v1.0 --proto uci -h 128 --getrating
STS Rating v13.1

Number of cores: 24

Engine: Aristides-v1.0
Hash: 128, Threads: 1, MoveTime: 1.0s
Number of positions in STS1-STS15_LAN_v3.epd: 1500

Your bench : 2.536852s
My bench   : 2.553400s
Analysis Time to get CCRL 40/4 rating estimate : 199ms
Starting engine Aristides-v1.0 ...
id name: Aristides v1.0 64 POPCNT


Traceback (most recent call last):
  File "/Applications/sts_rating_v13.1/sts_rating_v13.1.py", line 965, in <module>
    main(sys.argv[1:])
  File "/Applications/sts_rating_v13.1/sts_rating_v13.1.py", line 959, in main
    stc, nmps, nSt, bSan, contempt)
  File "/Applications/sts_rating_v13.1/sts_rating_v13.1.py", line 690, in analyze_pos
    p.communicate()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 479, in communicate
    return self._communicate(input)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 1093, in _communicate
    self.stdin.flush()
ValueError: I/O operation on closed file
Mac-Pro:sts_rating_v13.1 michaelbyrne$
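
The last frames show `communicate()` trying to flush a stdin pipe that is already closed. One possible cause, and this is only a guess since the script's line 690 is not shown here, is that the engine's stdin was closed before `communicate()` was called. A shutdown routine that avoids `communicate()` could look like the sketch below. The name `stop_engine` is hypothetical, and a tiny Python child process stands in for the engine so the example runs on its own.

```python
import subprocess
import sys

# Stand-in for a UCI engine: reads commands until it sees "quit".
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if line.strip() == 'quit':\n"
    "        break\n"
)


def stop_engine(p):
    """Ask the child to quit, close its stdin once, and wait for it to exit.

    Uses wait() rather than communicate(), so nothing flushes a pipe
    that has already been closed. Safe to call more than once.
    """
    stdin = p.stdin
    if stdin is not None and not stdin.closed:
        try:
            stdin.write("quit\n")
            stdin.flush()
            stdin.close()
        except (IOError, ValueError):
            pass  # the pipe is already gone; just wait for the process
    return p.wait()


p = subprocess.Popen([sys.executable, "-c", CHILD],
                     stdin=subprocess.PIPE, universal_newlines=True)
exit_code = stop_engine(p)
print("engine exit code:", exit_code)
```

Since all 1500 positions were already analyzed when the error appeared, the failure seems to be in the cleanup step only, not in the analysis itself.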