Any Testsuites in EPD format you can recommend?

lithander
Posts: 880
Joined: Sun Dec 27, 2020 2:40 am
Location: Bremen, Germany
Full name: Thomas Jahn

Any Testsuites in EPD format you can recommend?

Post by lithander »

I have used ECM.epd to test different versions of my engine at a fixed depth. I measure how many nodes were visited, how long the test took, and of course how many positions my engine solved.

Today I tested a promising new version that was considerably faster but solved 5 fewer positions. Disappointed, I looked more closely at the positions in question and had them all analyzed by chess.com. To my dismay, my new version often played a move that was equal to or even better than the best move listed in the EPD. That means it "failed" tests even though it solved the positions adequately.

Because I never really looked into these details, I may have discarded many versions for solving fewer positions when they didn't do anything wrong and just found valid alternatives instead. If two moves guarantee a draw (according to chess.com), why would the EPD list only one of them as the best move?

Can anyone point me to a better testsuite for my engine where the given best move is actually best - or where there are multiple best moves given if there is not one clearly correct answer?
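
For reference, the EPD format itself allows more than one move in the bm (best move) operand, and it also has an am (avoid move) opcode, so a test runner can accept any listed alternative. Below is a minimal Python sketch of such a check. The function names and the example record are made up for illustration, and the parser assumes that no ';' occurs inside a quoted operand.

```python
def parse_epd(line):
    """Split an EPD record into its 4-field position and a dict of operations.

    Simplification: assumes no ';' occurs inside a quoted operand.
    """
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    operations = {}
    for chunk in (fields[4] if len(fields) > 4 else "").split(";"):
        opcode, _, operand = chunk.strip().partition(" ")
        if opcode:
            operations[opcode] = operand.strip()
    return position, operations


def normalize(san):
    """Drop check, mate and annotation suffixes so 'Be5+' matches 'Be5'."""
    return san.rstrip("+#!?")


def is_solved(engine_move_san, operations):
    """Accept any listed best move; reject any listed avoid move."""
    move = normalize(engine_move_san)
    best = {normalize(m) for m in operations.get("bm", "").split()}
    avoid = {normalize(m) for m in operations.get("am", "").split()}
    if move in avoid:
        return False
    # With no bm operand, anything that is not an avoid move counts as solved.
    return not best or move in best


# Illustrative record only (not taken from any real test suite).
EXAMPLE = ('rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - '
           'bm e4 d4; id "illustrative only";')

position, operations = parse_epd(EXAMPLE)
print(position)
print(is_solved("d4", operations), is_solved("a3", operations))
```

Comparing SAN strings only works if the engine's move is converted to SAN the same way the suite writes it; comparing in UCI notation avoids that problem when the suite provides it.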

Here I found a repository with lots of tests, but I don't know which ones are "good" to use.
Minimal Chess (simple, open source, C#) - Youtube & Github
Leorik (competitive, in active development, C#) - Github & Lichess
Vinvin
Posts: 5228
Joined: Thu Mar 09, 2006 9:40 am
Full name: Vincent Lejeune

Re: Any Testsuites in EPD format you can recommend?

Post by Vinvin »

Hi Lithander,
From the page https://www.chessprogramming.org/Test-P ... est_Suites

I see 2 interesting test suites:

1) The 2018 revision of Win at Chess (aka "WAC"), with 200 positions, from which a lot of bad positions were deleted: http://www.talkchess.com/forum3/viewtop ... 80#p762480

2) The Strategic Test Suite (aka "STS"): https://www.chessprogramming.org/Strategic_Test_Suite
It has a lot of positions, but you can use a short time limit (5 to 10 seconds per position).
RubiChess
Posts: 584
Joined: Fri Mar 30, 2018 7:20 am
Full name: Andreas Matthies

Re: Any Testsuites in EPD format you can recommend?

Post by RubiChess »

I'm using the Arasan test suite for benchmarking progress, spending 60 seconds on every position.
But don't put too much weight on how many positions are solved. That number can decrease even though the engine got better.
What helped me in several cases was debugging why the best move in certain positions was not found (or was found very late). This may lead to code or parameter changes that avoid bad pruning or extend important lines, and if these changes pass a usual SPRT test... Elo!
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Any Testsuites in EPD format you can recommend?

Post by Ferdy »

lithander wrote: Thu Jul 22, 2021 4:45 pm [...] Can anyone point me to a better testsuite for my engine where the given best move is actually best - or where there are multiple best moves given if there is not one clearly correct answer?
You might want to try this revised STS. The sts_rating.py in that repo can approximate an engine's rating from its score on the revised STS, calibrated against engines of different strengths on the CCRL rating list. The sample output has summaries of the test suite numbers where the engine scored best and worst.

If you are just interested in the EPD, it has fields with the solutions in UCI move format, which saves you from parsing SAN; see opcode c9. Example:

1kr5/3n4/q3p2p/p2n2p1/PppB1P2/5BP1/1P2Q2P/3R2K1 w - - bm f5; id "STS(v1.0) Undermine.001"; c0 "f5=10, Be5+=2, Bf2=3, Bg4=2"; c7 "f5 Be5+ Bf2 Bg4"; c8 "10 2 3 2"; c9 "f4f5 d4e5 d4f2 f3g4";
See STS1-STS15_LAN_v3.epd in the repo. The corresponding points are in c8.
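
The pairing of c9 (UCI moves) with c8 (points) can be sketched in Python as follows. This is only an illustration based on the example record above: the function names are hypothetical, and giving 0 points to a move that is not listed is an assumption, not necessarily what sts_rating.py does.

```python
import re

# The example record from this post.
STS_LINE = ('1kr5/3n4/q3p2p/p2n2p1/PppB1P2/5BP1/1P2Q2P/3R2K1 w - - bm f5; '
            'id "STS(v1.0) Undermine.001"; '
            'c0 "f5=10, Be5+=2, Bf2=3, Bg4=2"; c7 "f5 Be5+ Bf2 Bg4"; '
            'c8 "10 2 3 2"; c9 "f4f5 d4e5 d4f2 f3g4";')


def sts_points(epd_line):
    """Map each UCI move in c9 to its points in c8, paired by position."""
    # Collect every operation of the form: opcode "quoted operand"
    quoted = dict(re.findall(r'\b(c\d+|id)\s+"([^"]*)"', epd_line))
    moves = quoted["c9"].split()
    points = [int(p) for p in quoted["c8"].split()]
    if len(moves) != len(points):
        raise ValueError("c8 and c9 list a different number of entries")
    return dict(zip(moves, points))


def score_move(epd_line, engine_move_uci):
    """Points for the engine's move; unlisted moves score 0 (assumption)."""
    return sts_points(epd_line).get(engine_move_uci, 0)


print(sts_points(STS_LINE))
print(score_move(STS_LINE, "d4f2"))
```

Because the engine already reports its best move in UCI notation, the lookup needs no move generator or SAN parser.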

Perhaps some solutions are no longer optimal if we reanalyze them with SF14, for example. But overall, the scoring method can still differentiate stronger engines from weaker ones.

The regression slope and intercept are at https://github.com/fsmosca/STS-Rating/b ... ng.py#L694
in the sts_rating.py source. The regression was done on older chess engines.

Would be interesting to see the results comparing the old and new versions of your engine.

I have no time at the moment, but I plan to rescore this test suite with Stockfish.

Re: Any Testsuites in EPD format you can recommend?

Post by lithander »

Thanks for the input everyone!

Especially the detailed report of the "revised STS" sounds very promising. I'm away from my computer for a week now but I'll give it a try afterwards! :)

Re: Any Testsuites in EPD format you can recommend?

Post by lithander »

Vinvin wrote: Thu Jul 22, 2021 5:17 pm 1) The Win at Chess (aka "WAC") revision from 2018 (200 positions) where a lot of bad positions were deleted : viewtopic.php?p=762480#p762480
Searched 200 positions to depth 11. 709299K nodes visited. Took 984.981 seconds!
Best move found in 194 / 200 positions!

It seems like the tests are a bit on the easy side, but maybe that's good, because I can lower the depth (or introduce a mode that uses a time budget instead) and get results much quicker. In any case, it's good to have another (and hopefully more correct) test suite available! Thanks ;)
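
As a rough worked example, the numbers above translate into a solve rate and a search speed like this. The sketch assumes that "K" in the node count means 1000 and that speed stays roughly constant across positions; the helper name is hypothetical.

```python
# Figures reported in the run above.
nodes = 709_299 * 1000      # "709299K nodes", assuming K = 1000
seconds = 984.981
solved, total = 194, 200

solve_rate = solved / total  # fraction of positions solved
nps = nodes / seconds        # average nodes per second over the whole run


def nodes_for_budget(seconds_per_position, nodes_per_second):
    """Node limit that roughly corresponds to a per-position time budget."""
    return int(seconds_per_position * nodes_per_second)


print(f"solve rate: {solve_rate:.1%}")
print(f"speed: {nps / 1000:.0f} knps")
print(f"1 second is about {nodes_for_budget(1.0, nps)} nodes")
```

That works out to roughly 720 thousand nodes per second. A node limit derived this way gives runs that are reproducible across machines, which a time budget does not.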

Re: Any Testsuites in EPD format you can recommend?

Post by Vinvin »

lithander wrote: Fri Jul 23, 2021 4:36 pm [...] It seems like the tests are a bit on the easy end but maybe that's good because I can lower the depth (or introduce a mode that uses a time budget instead) and get results much quicker.
I didn't know the strength of your engine.
But yes, you can set the level to 1 second or 1 million nodes to get a fast result.
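
For an engine that speaks UCI, both limits map onto the standard go command: "go movetime" takes milliseconds and "go nodes" takes a node count. Here is a minimal Python sketch of building the commands for one EPD record. The function names are hypothetical, and it assumes the record carries no hmvc/fmvn opcodes, so the two missing FEN fields are filled with "0 1".

```python
def epd_to_fen(epd_line):
    """EPD holds the first four FEN fields; append halfmove clock and move number."""
    fields = epd_line.split()
    return " ".join(fields[:4]) + " 0 1"


def uci_commands(epd_line, movetime_ms=None, nodes=None):
    """Build the UCI commands to search one position under a single limit."""
    if (movetime_ms is None) == (nodes is None):
        raise ValueError("give exactly one of movetime_ms or nodes")
    if movetime_ms is not None:
        limit = f"movetime {movetime_ms}"
    else:
        limit = f"nodes {nodes}"
    return [f"position fen {epd_to_fen(epd_line)}", f"go {limit}"]


# Position taken from the STS example earlier in the thread.
LINE = '1kr5/3n4/q3p2p/p2n2p1/PppB1P2/5BP1/1P2Q2P/3R2K1 w - - bm f5;'

for command in uci_commands(LINE, nodes=1_000_000):
    print(command)
for command in uci_commands(LINE, movetime_ms=1000):
    print(command)
```

The test runner would send these lines to the engine's standard input and read the move from the "bestmove" reply.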