I think 02, 03, 13 are ok, but 05, 09, 10 are not optimal:
In Quick-05, some engines will choose Rxf7+ without having found that it draws, only assigning an eval which is slightly less bad than for the queen move they considered before. Other engines will not sacrifice the rook before they find a 0.00 eval for it (as intended), which may take more time. IOW the "correct move for wrong reason" problem of a test position.
In Quick-09 (am Nxe4?), a few engines choose bad alternatives after they found that Nxe4 is a blunder. I think I have seen at least one engine switch to Nxc6? which "solves" the am position technically... (I do not know if EPD can restrict the acceptable alternatives.)
In Quick-10 with 20.g5!, a strong player pointed me to the later move 23.Nef4, which is crucial. Some engines that choose 20.g5 may fail to play 23.Nef4, or at least do not yet find it from the test position. This is another possible case of "correct move for wrong, or incomplete, reason" if our only test condition is to choose the bm (which always was my approach), disregarding anything else.
These problems occur rarely. Nevertheless, I must admit I failed, because my plan was to have a tactical test set which is small but 100% reliable in terms of correctness AND with sufficient, unambiguous bm/am solutions. But I think 20 or 21 of the 24 positions achieve that goal.
Mike S. wrote: I think 02, 03, 13 are ok, but 05, 09, 10 are not optimal:
In Quick-05, some engines will choose Rxf7+ without having found that it draws, only assigning an eval which is slightly less bad than for the queen move they considered before. Other engines will not sacrifice the rook before they find a 0.00 eval for it (as intended), which may take more time. IOW the "correct move for wrong reason" problem of a test position.
In Quick-09 (am Nxe4?), a few engines choose bad alternatives after they found that Nxe4 is a blunder. I think I have seen at least one engine switch to Nxc6? which "solves" the am position technically... (I do not know if EPD can restrict the acceptable alternatives.)
You can have a list of am values just as you can have a list of bm values.
[D]2kr3r/ppp3pp/2pbbn2/4N3/3Pp3/2P3Pq/PP1NQP1P/R1B2RK1 w - - am Nxe4 Nxc6; id Quick-09;
Since the spec does not specifically state otherwise, I have added both am and bm lists to positions (but found that this absolutely horrifies some chess programmers).
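For anyone scoring such records with their own script, here is a minimal sketch of reading the am and bm lists and checking a move against them. The function names `parse_epd` and `passes` are made up for illustration, the parser naively splits on semicolons (so it assumes no quoted operand contains one), and the rule for positions that carry both lists (reject any am move, then require a bm move if a bm list exists) is my assumption, not something the spec states.

```python
def parse_epd(line):
    """Split an EPD record into the 4-field position and a dict of opcodes."""
    fields = line.strip().split(None, 4)
    position = " ".join(fields[:4])
    rest = fields[4] if len(fields) > 4 else ""
    ops = {}
    for chunk in rest.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue
        opcode, _, operand = chunk.partition(" ")
        if opcode in ("bm", "am"):
            # bm and am take a whitespace-separated list of SAN moves
            ops[opcode] = operand.split()
        else:
            ops[opcode] = operand.strip().strip('"')
    return position, ops


def strip_marks(san):
    """Drop trailing check, mate and annotation marks before comparing."""
    return san.rstrip("+#!?")


def passes(ops, move):
    """Assumed rule: fail on any am move; if a bm list exists, require one of it."""
    move = strip_marks(move)
    avoid = {strip_marks(m) for m in ops.get("am", [])}
    best = {strip_marks(m) for m in ops.get("bm", [])}
    if move in avoid:
        return False
    if best:
        return move in best
    return True


record = ("2kr3r/ppp3pp/2pbbn2/4N3/3Pp3/2P3Pq/PP1NQP1P/R1B2RK1 w - - "
          "am Nxe4 Nxc6; id Quick-09;")
position, ops = parse_epd(record)
```

With the two-move am list, an engine that switches from Nxe4 to Nxc6 no longer counts as having solved the position: `passes(ops, "Nxc6")` returns `False`.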
In Quick-10 with 20.g5!, a strong player pointed me to the later move 23.Nef4, which is crucial. Some engines that choose 20.g5 may fail to play 23.Nef4, or at least do not yet find it from the test position. This is another possible case of "correct move for wrong, or incomplete, reason" if our only test condition is to choose the bm (which always was my approach), disregarding anything else.
These problems occur rarely. Nevertheless, I must admit I failed, because my plan was to have a tactical test set which is small but 100% reliable in terms of correctness AND with sufficient, unambiguous bm/am solutions. But I think 20 or 21 of the 24 positions achieve that goal.
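If one wanted a stricter pass condition than "the engine chose the bm", the two concerns above (Quick-05: the right move with a non-drawing eval; Quick-10: the right first move without the later key move) could be checked roughly as below. This is only a sketch: `strict_pass`, the score window in centipawns, and the idea of looking for a later move in the reported PV are all my assumptions, and a test tool would have to supply the engine's move, score and PV itself.

```python
def strict_pass(best_move, score_cp, pv, bm,
                score_window=None, pv_must_contain=None):
    """Pass only if the bm is chosen AND the optional extra conditions hold.

    best_move        move the engine chose (SAN string)
    score_cp         engine score in centipawns, or None if unknown
    pv               principal variation as a list of SAN strings
    bm               list of accepted best moves
    score_window     optional (low, high) bounds the score must fall within
    pv_must_contain  optional later move that must appear in the PV
    """
    if best_move not in bm:
        return False
    if score_window is not None:
        low, high = score_window
        if score_cp is None or not (low <= score_cp <= high):
            return False
    if pv_must_contain is not None and pv_must_contain not in pv:
        return False
    return True


# Quick-05 style: accept Rxf7+ only with a roughly drawn eval (window is arbitrary).
draw_found = strict_pass("Rxf7+", 0, ["Rxf7+"], ["Rxf7+"], score_window=(-10, 10))
lucky_pick = strict_pass("Rxf7+", -85, ["Rxf7+"], ["Rxf7+"], score_window=(-10, 10))

# Quick-10 style: the PV entries between g5 and Nef4 are placeholders,
# not the real game line.
pv = ["g5", "reply1", "move21", "reply2", "move22", "reply3", "Nef4"]
full_idea = strict_pass("g5", 60, pv, ["g5"], pv_must_contain="Nef4")
half_idea = strict_pass("g5", 60, pv[:3], ["g5"], pv_must_contain="Nef4")
```

The drawback is that such conditions go beyond what a plain bm/am record expresses, so they would live in the test harness rather than in the EPD file.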