STS re-re-re-re-re-visited

peter
Posts: 3364
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: STS re-re-re-re-re-visited

Post by peter »

peter wrote: Thu Nov 03, 2022 12:19 pm
peter wrote: Wed Nov 02, 2022 10:07 am ...
1rb1qrk1/2b2pp1/p3pBn1/3pP1Pp/1ppP4/2P1QN2/PP3P1P/R2BR1K1 w - - bm Nh4; c0 "Nh4=100"; id "LC0-SF, HTC108-15";
...
Deleted this last one again: it doesn't match the 309 others in difficulty for all A-B engines, so it decidedly favours LC0 too much.
For short to very short TC there must not be so much difference in the selectivity of single positions. I stopped adding positions from HTC and of comparable difficulty, and will rather add some not-too-easy ones from STS, regards
I even took the hardest ones out of Eret and Arasan again too, 13 that couldn't be solved by any engine at STC. There are no 100-point positions left; 75 is the max now. Eret averages 25 points for best moves, Arasan 15. There are 594 STS positions that are more or less usable as single-best-move positions too, with at least all the too easy ones left out; the average for the remaining 594 STS positions is 10 points for best moves. All in all 888 positions, downloadable here:

https://www.dropbox.com/s/1m3cnrnqtq01q ... 8.epd?dl=0

These are of a common level of difficulty for TCs between 100 and 500 msec. Here is a first list over a broader range of hardware and TCs for some of the strongest engines, to see the "scaling" of the test (GD means the GoldDigger option of ShashChess; nice to see that it helps with more hardware time and costs with much less), 16x3.5GHz CPU, 3070ti GPU:

Image

SF15 was run single-threaded twice at 100 msec to check the reproducibility of the result, regards
Peter.
peter
Posts: 3364
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: STS re-re-re-re-re-visited

Post by peter »

peter wrote: Sat Nov 05, 2022 10:07 am These are of a common level of difficulty as for TCs between 100 and 500 msec.
Yet it can be used without the MEA tool as a single-best-move suite too, with TCs not much longer than 1"/pos.
For example, with 30 threads of the 16x3.5GHz CPU and EloStatTS:


    Program                                    Elo   +/-  Matches  Score   Av.Op.   S.Pos.   MST1    MST2   RIndex

  1 Crystal5KWK                              : 3500    2    852    50.0 %   3500   844/888    1.0s    1.0s   0.99
  2 ShashChess25.2-GoldDigger                : 3500    2    852    50.0 %   3500   847/888    1.0s    1.0s   0.99



MST1  : Mean solution time (solved positions only)
MST2  : Mean solution time (solved and unsolved positions)
RIndex: Score according to solution time ranking for each position
Less discrimination, but this is only two runs of two engines that are (as far as this test goes) very near to each other. The rating by MEA points probably enlarges the Elo gaps, but of course it does so with an error bar too.
With MEA the difference is much bigger (same positions, but 200 msec and only 8 threads, which of course expands the numeric results somewhat too):

Image

And then the main point of these positions as such has to be seen: move ordering is done mainly from "static eval" with very little to almost no search depth. "Tactical" abilities on their own aren't tested like that; for that, more difficult positions with longer TC are more selective and sensitive.
E.g. these 256 (still a mix of rather difficult and not so difficult ones)

https://www.dropbox.com/s/lpg29zoyvh03dza/256.epd?dl=0

with the same 30 threads and 5"/pos., MV4 meaning MultiPV=4:


    Program                                    Elo   +/-  Matches  Score   Av.Op.   S.Pos.   MST1    MST2   RIndex

  1 HypnoSFmpv210922-Set1-ImbInv             : 3547    3   5348    57.3 %   3496   206/256    1.8s    2.4s   0.74
  2 Crystal5KWK-MV4                          : 3540    3   5239    56.3 %   3496   196/256    1.7s    2.5s   0.71
  3 BlueMarlin15.3-MV4                       : 3539    3   5137    56.0 %   3497   192/256    1.7s    2.5s   0.73
  4 BlueMarlin15.4-avx2-MV4                  : 3537    4   5066    55.8 %   3497   189/256    1.6s    2.5s   0.74
  5 ShashChess25-GD-MV4                      : 3537    3   5150    55.8 %   3496   194/256    1.8s    2.6s   0.72
  6 ShashChess25.2-GoldDigger-MV4            : 3534    3   5092    55.4 %   3497   191/256    1.8s    2.6s   0.72
  7 ShashChess24-MV4                         : 3533    4   5024    55.2 %   3497   189/256    1.7s    2.6s   0.73
  8 ShashChess25.1-GD-MCTS-MV4               : 3533    3   5135    55.1 %   3497   193/256    1.9s    2.7s   0.69
  9 ShashChess25-GD-HT-HP-MV4                : 3532    3   5037    55.1 %   3497   188/256    1.7s    2.6s   0.73
 10 CorChess3300522-MV4                      : 3532    4   5075    55.0 %   3497   189/256    1.8s    2.6s   0.68
 11 ShashChess25.3-GoldDigger                : 3528    4   5002    54.5 %   3497   188/256    1.9s    2.7s   0.70
 12 EMAN8.40-Tact.7-Expl.12-MV4              : 3527    4   4987    54.3 %   3497   183/256    1.7s    2.7s   0.69
 13 Stockfish110922-MV4                      : 3524    4   4943    53.8 %   3498   178/256    1.7s    2.7s   0.68
 14 Stockfish231022-MV4                      : 3523    4   4949    53.6 %   3498   180/256    1.8s    2.8s   0.68
 15 EMAN8.30-MV4                             : 3521    4   4874    53.3 %   3498   178/256    1.8s    2.8s   0.66
 16 Stockfish110922                          : 3507    4   4804    51.0 %   3500   160/256    1.7s    2.9s   0.63
 17 Dragon3.1byKomodoChess-MV4               : 3495    4   4752    49.3 %   3501   152/256    1.8s    3.1s   0.56
 18 Berserk10-MV4                            : 3464    4   4480    44.3 %   3504   124/256    1.8s    3.4s   0.46
 19 Koivisto8.16                             : 3452    5   4456    42.4 %   3505   113/256    1.8s    3.6s   0.42
 20 Ceres0.97RC3-784990                      : 3452    5   4567    42.1 %   3507   112/256    1.9s    3.6s   0.34
 21 TheHuntsman1bmi2-MV4                     : 3451    5   4720    42.4 %   3504   109/256    1.6s    3.6s   0.34
 22 Lc0v0.29.0-rc0-805874                    : 3449    5   4527    41.8 %   3507   109/256    1.7s    3.6s   0.37
 23 Lc0v0.29.0-rc0-784968                    : 3432    5   4474    39.2 %   3508    96/256    1.9s    3.8s   0.31
 24 Lc0v0.30.0-dag+git.c91bf77-784968        : 3417    5   4345    37.0 %   3509    86/256    1.9s    3.9s   0.28
 25 PowerFritz18-MV4                         : 3414    5   4294    36.8 %   3507    97/256    2.5s    4.1s   0.25
 26 Halogen11-MV4                            : 3406    5   4197    35.7 %   3509    93/256    2.5s    4.1s   0.24



MST1  : Mean solution time (solved positions only)
MST2  : Mean solution time (solved and unsolved positions)
RIndex: Score according to solution time ranking for each position
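
As a rough cross-check of how the columns above relate (a sketch, not EloStatTS's actual code): the Elo column looks consistent with the standard logistic Elo formula applied to Score and Av.Op., and MST2 looks consistent with counting every unsolved position at the full time limit of 5"/pos. Both readings are assumptions inferred from the numbers in the table, and the function names are hypothetical.

```python
import math


def performance_elo(avg_opponent: float, score: float) -> float:
    """Rating implied by a score fraction (0 < score < 1) against avg_opponent.

    Uses the standard logistic Elo relation; that EloStatTS computes its
    Elo column exactly this way is an assumption.
    """
    return avg_opponent + 400.0 * math.log10(score / (1.0 - score))


def mst2_from_summary(solved: int, total: int, mst1: float, time_limit: float) -> float:
    """Mean time over all positions, given the mean over solved ones (MST1).

    Assumption: each unsolved position is counted with the full time limit.
    """
    unsolved = total - solved
    return (solved * mst1 + unsolved * time_limit) / total


if __name__ == "__main__":
    # Row 1 of the table: Score 57.3 %, Av.Op. 3496, 206/256 solved, MST1 1.8s
    print(round(performance_elo(3496, 0.573)))
    print(round(mst2_from_summary(206, 256, 1.8, 5.0), 1))
```

Because Score and MST1 are printed rounded in the table, recomputed values for other rows can differ by about a point of Elo or a tenth of a second.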
Peter.
Ferdy
Posts: 4845
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS re-re-re-re-re-visited

Post by Ferdy »

Created an STS web app. You can view the top 10 PVs according to Stockfish. The analysis is done; checking is not yet done.

Will publish the revised STS once the check is completed.

The analysis files are in a Google folder.

The source code of this app is in my GitHub repo. It is hosted for free on Streamlit Cloud.

Each column has a menu; you can search the EPD, etc.

Image
Pvt. Ryan
Posts: 52
Joined: Mon Sep 12, 2022 3:50 am
Location: Christchurch, NZ
Full name: Ray Bongalon

Re: STS re-re-re-re-re-visited

Post by Pvt. Ryan »

Hi. Can I use this tool with any EPD file instead of STS?

Cheers.
Ferdy
Posts: 4845
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: STS re-re-re-re-re-visited

Post by Ferdy »

Pvt. Ryan wrote: Sun Dec 04, 2022 3:00 am Hi. Can I use this tool with any EPD file instead of STS?

Cheers.
You can. Your epd would look something like this:

4r2k/p1r1b1pp/1p1pqn2/4p3/1PP4B/P4B1P/4QPP1/2RR2K1 w - - bm Bxf6; c0 "Bxf6=10, Re1=5, Rc2=4, Qd2=1";
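
To make the format concrete, here is a minimal sketch of reading such a line: `bm` holds the best move(s) and `c0` holds the points per move. The helper names (`parse_epd`, `points_score`, `is_solved`) are hypothetical and are not taken from the tool itself; the parser also assumes no `;` appears inside a quoted operand.

```python
import re


def parse_epd(line: str):
    """Split an EPD line into (position, best_moves, points_by_move)."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])          # board, side to move, castling, en passant
    ops_text = fields[4] if len(fields) > 4 else ""

    # Each operation looks like: opcode operand;  (operand may be quoted)
    ops = {}
    for match in re.finditer(r'(\w+)\s+("[^"]*"|[^;]*);', ops_text):
        ops[match.group(1)] = match.group(2).strip().strip('"')

    best_moves = ops.get("bm", "").split()

    # c0 "Bxf6=10, Re1=5" -> {"Bxf6": 10, "Re1": 5}
    points = {}
    for item in ops.get("c0", "").split(","):
        move, sep, value = item.strip().partition("=")
        if sep:
            points[move.strip()] = int(value)
    return position, best_moves, points


def points_score(points: dict, engine_move: str) -> int:
    """Points-style scoring: moves not listed in c0 earn nothing."""
    return points.get(engine_move, 0)


def is_solved(best_moves: list, engine_move: str) -> bool:
    """Single-best-move scoring: only a bm move counts."""
    return engine_move in best_moves


if __name__ == "__main__":
    epd = ('4r2k/p1r1b1pp/1p1pqn2/4p3/1PP4B/P4B1P/4QPP1/2RR2K1 w - - '
           'bm Bxf6; c0 "Bxf6=10, Re1=5, Rc2=4, Qd2=1";')
    position, best_moves, points = parse_epd(epd)
    print(best_moves, points)
    print(points_score(points, "Re1"), is_solved(best_moves, "Re1"))
```

With this position, an engine choosing Re1 would earn 5 points under points-style scoring but would count as unsolved when the suite is used as a single-best-move test.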
Pvt. Ryan
Posts: 52
Joined: Mon Sep 12, 2022 3:50 am
Location: Christchurch, NZ
Full name: Ray Bongalon

Re: STS re-re-re-re-re-visited

Post by Pvt. Ryan »

Ferdy wrote: Sun Dec 04, 2022 3:20 am
Pvt. Ryan wrote: Sun Dec 04, 2022 3:00 am Hi. Can I use this tool with any EPD file instead of STS?

Cheers.
You can. Your epd would look something like this:

4r2k/p1r1b1pp/1p1pqn2/4p3/1PP4B/P4B1P/4QPP1/2RR2K1 w - - bm Bxf6; c0 "Bxf6=10, Re1=5, Rc2=4, Qd2=1";
Thanks, Ferdy. I'm actually planning to use this tool instead of the 'Automatic Analysis' feature of Arena.
Carbec
Posts: 160
Joined: Thu Jan 20, 2022 9:42 am
Location: France
Full name: Philippe Chevalier

Re: STS re-re-re-re-re-visited

Post by Carbec »

Hello,

I am very interested in using it to develop my own little engine.
Is it possible to use it on Linux (Ubuntu)? Sorry if this question is naive,
I'm just beginning with Linux.

Thanks
Philippe
smatovic
Posts: 3193
Joined: Wed Mar 10, 2010 10:18 pm
Location: Hamburg, Germany
Full name: Srdja Matovic

Re: STS re-re-re-re-re-visited

Post by smatovic »

Carbec wrote: Fri Mar 31, 2023 3:37 pm Hello,

I am very interested in using it to develop my own little engine.
Is it possible to use it on Linux (Ubuntu)? Sorry if this question is naive,
I'm just beginning with Linux.

Thanks
Philippe
Take a look at Ferdy's repository; there is a Python script plus several STS LAN .epd files:

https://github.com/fsmosca/STS-Rating

https://github.com/fsmosca/STS-Rating/tree/master/epd

--
Srdja