New test set 'pure'

Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

New test set 'pure'

Post by Dann Corbit »

This test set is composed of 6 files:
1) culls.epd Culls are positions that occur in another problem's PV, farther toward its end (so they are already covered by the set).
2) culls-ana.epd Culls-ana is the culls problem set with analysis.

While the culls files are redundant, they are perhaps still interesting, and some are fairly difficult.

3) duds.epd Duds are problems that were too easy to solve (less than two seconds with 30 cores using a recent version of SF)
4) duds-ana.epd Duds-ana is the duds problem set with analysis.

The duds problems are too easy for modern computers unless you want to run at one second per problem or something like that. But it might be a nice set for cell phones.

5) pure.epd Pure contains the problems left after removing the culls and duds.
6) pure-ana.epd Pure-ana contains the pure problem set with analysis

The pure set varies from fairly easy to solve to extremely difficult to solve.

sharing link:
https://drive.google.com/file/d/1e42RYe ... share_link
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: New test set 'pure'

Post by peter »

Dann Corbit wrote: Sat Oct 21, 2023 4:09 am sharing link:
https://drive.google.com/file/d/1e42RYe ... share_link
Thanks for that, Dann!
So far I had many of the positions stored only in HHdb.
A run with SF dev. at 30 threads on a 16x3.5GHz CPU, MultiPV=4, 15"/position:

Stockfish dev-20231008-7a4de9610
Right solutions: 75 of 138 ; 20:46m

         1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
 -------------------------------------------------------------------------------------
   0 |   -   -   -   -   -   -   1   -   4   2   8   2   6   -   -   -   0   -   -   -
  20 |   1   -   -   0   -   -   -   -   0   0   1   1   0  13   -   -   -   -   -   0
  40 |   -   -   -   4   -  15   0   0   0   3   -   3   0   -   -   2  15   0   0  15
  60 |   0   -   5  10   0   -   -   0  15   0   -   -   0   4   -   -   7   0   -   0
  80 |   -   -   4   -   -   -   -   0   1   3  15   4   -   -   -   0   6   -   6   5
 100 |  11   9   0   -   0   -   -   1   8   3   6   -   -   4   0   0   3   0   -   -
 120 |   0   8   0   -   0   -   2   -   0   -   -   1   -   2   -  12   6   0

  TotTime: 24:16m    SolTime: 20:46m
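
The solved count can be cross-checked straight from the grid. A minimal Python sketch, assuming each cell is either a solve time in seconds or "-" for an unsolved position (that is a reading of the output, not something the GUI states here); the helper name tally is made up for illustration:

```python
# Data rows copied from the grid above (header and ruler lines left out).
GRID = """\
   0 |   -   -   -   -   -   -   1   -   4   2   8   2   6   -   -   -   0   -   -   -
  20 |   1   -   -   0   -   -   -   -   0   0   1   1   0  13   -   -   -   -   -   0
  40 |   -   -   -   4   -  15   0   0   0   3   -   3   0   -   -   2  15   0   0  15
  60 |   0   -   5  10   0   -   -   0  15   0   -   -   0   4   -   -   7   0   -   0
  80 |   -   -   4   -   -   -   -   0   1   3  15   4   -   -   -   0   6   -   6   5
 100 |  11   9   0   -   0   -   -   1   8   3   6   -   -   4   0   0   3   0   -   -
 120 |   0   8   0   -   0   -   2   -   0   -   -   1   -   2   -  12   6   0
"""


def tally(grid: str) -> tuple[int, int]:
    """Return (solved, total); assumes "-" marks an unsolved position."""
    solved = total = 0
    for line in grid.splitlines():
        if "|" not in line:
            continue  # skip anything that is not a data row
        cells = line.split("|", 1)[1].split()
        total += len(cells)
        solved += sum(1 for cell in cells if cell != "-")
    return solved, total


solved, total = tally(GRID)
print(f"{solved} of {total}")  # prints "75 of 138"
```

This agrees with the "Right solutions: 75 of 138" line reported by the run.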
Peter.
Jouni
Posts: 3661
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: New test set 'pure'

Post by Jouni »

In my test suites the SF dev build of 8.10. was very good, but the builds of 14.10. and 21.10. are clearly weaker, at SF16 level.
Jouni
Dann Corbit
Posts: 12792
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: New test set 'pure'

Post by Dann Corbit »

Thanks for giving it a try.
peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: New test set 'pure'

Post by peter »

Dann Corbit wrote: Sun Oct 22, 2023 2:06 am Thanks for giving it a try.
You're welcome.
I don't have much time at the moment, but I let some more engines run pure.epd without having checked every single position myself yet. As for the ones I saw or already knew, I still like the idea of using composed studies and, for some of them, cutting the first move(s) to make the really hard ones more solvable for engines.

15"/position would be the shortest usable TC, and that for hardware stronger than mine (16x3.5 GHz), I'd say; for SMP like this, 30" seems better. Lc0 scored astonishingly well on a 3070 Ti GPU with only 15" in a single run (82/138 with net 2790M).

Of course, composed studies are usually not as close to positions from practical play, but for high tactical difficulty for engines and single-best-move solutions they are still some of the best resources to me. Certain branches and settings of engines do score better in such suites than in engine-engine matches, but that makes the results of suites like this one all the more useful as supplements to me, regards
Peter.
Jouni
Posts: 3661
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: New test set 'pure'

Post by Jouni »

Lousy results :D . Crystal 6 scored 106/138 with 6 cores and 15 sec.
Jouni
peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: New test set 'pure'

Post by peter »

Jouni wrote: Mon Oct 23, 2023 11:11 pm Lousy results :D . Crystal 6 scored 106/138 with 6 cores and 15 sec.
HypnoS (Marco Zerbinati's private engine) 116/138 with 30 threads (15 of 16 cores) and 15"
:)
Crystal 6 PMT with the same hardware and TC: 111/138 in single primary variation. MultiPV doesn't raise the scores much (SF dev. was even a little lower, as shown above; single primary was 78), which means 15" at this SMP hardware TC is rather short for this mix of positions.
It is such results from specialised branches and settings that make me think 30" SMP would be better for this suite, especially for a lower error bar across a broader engine mix. EloStatTS can adjudicate matches between two engines that both solved the same positions by time indices, so that this kind of "draw" turns into a full point in the engine-engine match; positions that neither solved stay drawn, regards
Peter.
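
The time-index idea can be illustrated roughly as follows. This is only one reading of the description above, not EloStatTS code: the function name adjudicate, the rule that equal times stay drawn, and the sample times are all assumptions made up for illustration.

```python
from typing import Optional


def adjudicate(t_a: Optional[float], t_b: Optional[float]) -> float:
    """Score for engine A on one position: 1.0 win, 0.5 draw, 0.0 loss.

    None means "not solved". Assumed rule: if both engines solve the
    position, the faster one gets the full point instead of a draw.
    """
    if t_a is None and t_b is None:
        return 0.5  # neither solved: stays drawn
    if t_b is None:
        return 1.0  # only A solved
    if t_a is None:
        return 0.0  # only B solved
    if t_a < t_b:
        return 1.0  # both solved, A faster
    if t_a > t_b:
        return 0.0  # both solved, B faster
    return 0.5      # both solved in the same time


# Hypothetical solve times in seconds for four positions.
engine_a = [3.0, None, 7.0, None]
engine_b = [5.0, None, None, 2.0]

score = sum(adjudicate(a, b) for a, b in zip(engine_a, engine_b))
print(score)  # prints 2.5
```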
Jouni
Posts: 3661
Joined: Wed Mar 08, 2006 8:15 pm
Full name: Jouni Uski

Re: New test set 'pure'

Post by Jouni »

Huntsman also scores 106, even though it only searches for mates. So there are a lot of mate positions. And there are some alternative solution moves.
Jouni
peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: New test set 'pure'

Post by peter »

Position no. 63 doesn't have a solution with good enough discrimination between the best move and the second-best one for a test position in an automatically evaluated suite, I'd say.
It's from a study by Miljanic M.,

https://www.yacpdb.org/#search/OHBwcDFQ ... LzEvMA==/1

, from which the first 8 moves are cut off.

8/1pp3p1/p7/6Q1/3p4/8/1ppp2K1/br1k3n w - - acd 163; bm Qg4+; ce 32606; c3 "Qg4+"; dm 81; pm Qg4+; pv Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c6 Qg5 Kd1 Qg4+ Kc1 Qf4 a5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 g6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a3 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a2 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1; id "pure.063";
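
For anyone who wants to pull fields such as bm, dm and id out of records like the one above, here is a minimal Python sketch. It assumes the usual EPD layout (four position fields, then "opcode operand;" pairs); parse_epd and split_ops are made-up helper names, and the pv in the sample record is shortened for brevity. Semicolons inside quoted operands are kept intact, which matters for comment fields that contain ";".

```python
def split_ops(text: str) -> list[str]:
    """Split on ';' but not inside double-quoted strings."""
    parts: list[str] = []
    buf: list[str] = []
    quoted = False
    for ch in text:
        if ch == '"':
            quoted = not quoted
        if ch == ";" and not quoted:
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
    tail = "".join(buf).strip()
    if tail:
        parts.append(tail)
    return [p for p in parts if p]


def parse_epd(record: str) -> tuple[str, dict[str, str]]:
    """Return (position fields, {opcode: operand}) for one EPD record."""
    fields = record.strip().split(None, 4)
    position = " ".join(fields[:4])
    rest = fields[4] if len(fields) > 4 else ""
    ops: dict[str, str] = {}
    for chunk in split_ops(rest):
        opcode, _, operand = chunk.partition(" ")
        ops[opcode] = operand.strip().strip('"')
    return position, ops


# pure.063 with the pv shortened for this example.
RECORD = ('8/1pp3p1/p7/6Q1/3p4/8/1ppp2K1/br1k3n w - - acd 163; bm Qg4+; '
          'ce 32606; c3 "Qg4+"; dm 81; pm Qg4+; pv Qg4+ Ke1 Qe4+ Kd1; '
          'id "pure.063";')

position, ops = parse_epd(RECORD)
print(ops["id"], ops["bm"], ops["dm"])  # prints: pure.063 Qg4+ 81
```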

Mate in 81 is given with best move 1. (9.) Qg4+, but e.g. 1.Qh5+ (?! at most) instead lengthens DTM by only a few moves. Here is the Huntsman output after a quick Forward-Backward:

1.Qh5 ?!
8/1pp3p1/p7/7Q/3p4/8/1ppp2K1/br1k3n b - - 0 1

Analysis by The Huntsman 1:

1...Ke1 2.Qh4+ Kd1 3.Qg4+ Ke1 4.Qe4+ Kd1 5.Qf3+ Kc1 6.Qf4 a5 7.Qg5 Kd1 8.Qg4+ Ke1 9.Qe4+ Kd1 10.Qf3+ Kc1 11.Qf4 c6 12.Qg5 Kd1 13.Qg4+ Ke1 14.Qe4+ Kd1 15.Qf3+ Kc1 16.Qf4 c5 17.Qg5 Kd1 18.Qg4+ Ke1 19.Qe4+ Kd1 20.Qf3+ Kc1 21.Qf4 g6 22.Qg5 Kd1 23.Qg4+ Ke1 24.Qe4+ Kd1 25.Qf3+ Kc1 26.Qf4 a4 27.Qg5 Kd1 28.Qg4+ Ke1 29.Qe4+ Kd1 30.Qf3+ Kc1 31.Qf4 c4 32.Qg5 Kd1 33.Qg4+ Ke1 34.Qe4+ Kd1 35.Qf3+ Kc1 36.Qf4 c3 37.Qg5 Kd1 38.Qg4+ Ke1 39.Qe4+ Kd1 40.Qf3+ Kc1 41.Qf4 a3 42.Qg5 Kd1 43.Qg4+ Ke1 44.Qe4+ Kd1 45.Qf3+ Kc1 46.Qf4 b6 47.Qg5 Kd1 48.Qg4+ Ke1 49.Qe4+ Kd1 50.Qf3+ Kc1
+- (#86) Depth: 78/100 00:02:52 18949MN

Of course it's against the theme of the study to give the cyclic zugzwang a break, but it doesn't give away the win and it doesn't cost more than a few moves of DTM, so I wouldn't use it as a single-best-move test position in an automatically running suite. (If one compares the engine's "reasons" for evaluating one solution or the other by looking at the output lines, not only at the eval at first ply, then it's fine for testing engines, why not?) This holds especially if the same study is also used in the same suite as no. 30, started a few moves earlier; here it is as given in pure-ana.epd:

4Q3/ppp3p1/8/8/3p4/8/1ppp2K1/br1k3n w - - acd 177; bm Qh5+; ce 32592; c3 "Qh5+"; dm 88; pm Qh5+; pv Qh5+ Ke1 Qh4+ Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 g6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c3 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b4 Qg5 Kd1 Qg4+ Ke1; id "pure.030";

That's the same study with only the first move cut off. I haven't looked at all the positions in between so far, but the GUI tells me the study is used a third time, at position 109:

8/ppp3p1/8/8/3p2Q1/8/1ppp2K1/br2k2n w - - acd 171; bm Qe4+; ce 32598; c3 "Qe4+"; dm 85; pm Qe4+; pv Qe4+ Kd1 Qf3+ Kc1 Qf4 a6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 c5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b5 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 a3 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b4 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 b3 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4 g6 Qg5 Kd1 Qg4+ Ke1 Qe4+ Kd1 Qf3+ Kc1 Qf4; id "pure.109";

So this time it starts from move no. 5 of the original study. Btw., here we'd have the same problem as with no. 63: White can delay the best move Qe4+ with e.g. Qe6+?! instead, again giving away only some moves of DTM but not the still clear win, regards
Peter.
peter
Posts: 3412
Joined: Sat Feb 16, 2008 7:38 am
Full name: Peter Martan

Re: New test set 'pure'

Post by peter »

The next problem like the one shown above is no. 67, derived from a Troitzky study that was posted here lately:

viewtopic.php?p=953438#p953438

The position given in the .epd again cuts off the original starting moves, but here

8/2K2p1p/6pr/3p2p1/4n1pb/3N1bk1/4prn1/4B1RR w - - acd 145; bm Kd8; ce 32624; c3 "Kd8"; c4 "Troitzky=A; pm=Kd8; 1-0;"; c5 "Troitzky=A; 1-0"; c9 "HHdbVI.1599025.021a.86091"; dm 72; pm Kd8; pv Kd8 Rh5 Ke7 Rh6 Kf8 Rh5 Kg7 d4 Kf8 Rh6 Ke7 Rh5 Kd8 Rh6 Kc7 Rh5 Kb6 Rh6 Ka5 Rh5 Kb4 Rh6 Ka3 Rh5 Kb2 Rh6 Kc2 Rh5 Kc1 Rh6 Kb2 Rh5 Ka3 Rh6 Kb4 Rh5 Ka5 Rh6 Kb6 Rh5 Kc7 Rh6 Kd8 Rh5 Ke7 Rh6 Kf8 Rh5 Kg7 f6 Kg8 Rh6 Kf8 Rh5 Kg7 f5 Kf8 Rh6 Ke7 Rh5 Kd8 Rh6 Kc7 Rh5 Kb6 Rh6 Ka5 Rh5 Kb4 Rh6 Ka3 Rh5 Kb2 Rh6 Kc1 Rh5 Kc2 Rh6 Kb2 Rh5 Ka3 Rh6 Kb4; id "pure.067";

Second-best move 1.Kb8 (?!) instead of 1.Kd8 is only 2 moves longer to mate:

8/2K2p1p/6pr/3p2p1/4n1pb/3N1bk1/4prn1/4B1RR w - - 0 1

Analysis by The Huntsman 1:

1. +- (#72): 1.Kd8 Th5 2.Ke7 Th6 3.Kf8 Th5 4.Kg7 d4 5.Kf8 Th6 6.Ke7 Th5 7.Kd8 Th6 8.Kc7 Th5 9.Kb6 Th6 10.Ka5 Th5 11.Kb4 Th6 12.Ka3 Th5 13.Kb2 Th6 14.Kc2 Th5 15.Kc1 Th6 16.Kb2 Th5 17.Ka3 Th6 18.Kb4 Th5 19.Ka5 Th6 20.Kb6 Th5 21.Kc7 Th6 22.Kd8 Th5 23.Ke7 Th6 24.Kf8 Th5 25.Kg7 f6 26.Kf8 Th6 27.Kg8 Th5 28.Kg7 f5 29.Kf8 Th6 30.Ke7 Th5 31.Kd8 Th6 32.Kc7 Th5 33.Kb6 Th6 34.Ka5 Th5 35.Kb4 Th6 36.Ka3 Th5 37.Kb2 Th6 38.Kc1 Th5 39.Kc2 Th6 40.Kb2 Th5 41.Ka3 Th6 42.Kb4 Th5 43.Ka5 Th6 44.Kb6 Th5 45.Kc7 Th6 46.Kd8 Th5 47.Ke7 Th6 48.Kf8 Th5 49.Kg7 f4 50.Kf8 Th6 51.Ke7 Th5 52.Kd8 Th6 53.Kc7 Th5 54.Kb6 Th6

2. +- (#74): 1.Kb8 Th5 2.Kc7 Th6 3.Kd8 Th5 4.Ke7 Th6 5.Kf8 Th5 6.Kg7 d4 7.Kf8 Th6 8.Ke7 Th5 9.Kd8 Th6 10.Kc7 Th5 11.Kb6 Th6 12.Ka5 Th5 13.Kb4 Th6 14.Ka3 Th5 15.Kb2 Th6 16.Kc2 Th5 17.Kc1 Th6 18.Kb2 Th5 19.Ka3 Th6 20.Kb4 Th5 21.Ka5 Th6 22.Kb6 Th5 23.Kc7 Th6 24.Kd8 Th5 25.Ke7 Th6 26.Kf8 Th5 27.Kg7 f6 28.Kf8 Th6 29.Kg8 Th5 30.Kg7 f5 31.Kf8 Th6 32.Ke7 Th5 33.Kd8 Th6 34.Kc7 Th5 35.Kb6 Th6 36.Ka5 Th5 37.Kb4 Th6 38.Ka3 Th5 39.Kb2 Th6 40.Kc1 Th5 41.Kc2 Th6 42.Kb2 Th5 43.Ka3 Th6 44.Kb4 Th5 45.Ka5 Th6 46.Kb6 Th5 47.Kc7 Th6 48.Kd8 Th5 49.Ke7 Th6 50.Kf8 Th5 51.Kg7 f4 52.Kf8 Th6 53.Ke7 Th5 54.Kd8
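
The objection to nos. 63 and 67 comes down to the gap between the mate distance of the listed best move and that of the runner-up. A rough Python sketch of such a filter, taking the figures quoted in this thread at face value (81 vs. 86 for pure.063, 72 vs. 74 for pure.067); the function name discriminates and the threshold min_gap=10 are arbitrary choices for illustration, not an established criterion:

```python
def discriminates(best_dm: int, second_dm: int, min_gap: int = 10) -> bool:
    """True if the runner-up mates at least min_gap moves later than the best move."""
    return second_dm - best_dm >= min_gap


# (best mate distance, runner-up mate distance) as reported above.
reported = {
    "pure.063": (81, 86),
    "pure.067": (72, 74),
}

for pid, (best, second) in reported.items():
    print(pid, second - best, discriminates(best, second))
# prints:
# pure.063 5 False
# pure.067 2 False
```

With any threshold above 5 moves, both positions would be flagged as unsuitable for single-best-move scoring.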
Peter.