The following files are OK for me ...
_in_criteria_engine-analysis.epd (positions = 25308)
_not_in_criteria_engine-analysis.epd (positions = 838)
I need the following files ...
The *.pgn of the 25308
The *.pgn of the 838
And the *.pgn that remains after rejecting the 838 found positions from beta-v03.pgn (move transpositions included) ... I think this is:
_in_criteria_not_unique_beta-v03.pgn (games = 80626)
I don't need ...
_in_criteria_unique_beta-v03.pgn (games = 25472)
Note: The *.pgn with 25472 games isn't the same as the *.pgn with 25308 that I need. This one is a bit confusing.
_not_in_criteria_unique_beta-v03.pgn (games = 12461)
Note: I can't do anything with it.
_not_in_criteria_not_unique_beta-v03.pgn (games = 56491)
Note: I can't do anything with it.
Best
Frank
I think you can get what you want with the following criteria.txt.
I checked all the output files your program generated and can't find any mistake.
I think I made a mistake in reasoning with the example at the start of my posting, where I assumed something must be wrong.
Maybe this one:
If Houdini found 838 / 26146 ...
It is absolutely possible that ...
Example:
One line can end at move 8, while another line ends at move 14 under a different ECO code. Both lines are identical up to move 8 but quite different by the end of the line. Both lines are included in the 26146 database, and no tool can recognize this as a move transposition.
This could be the reason for the differences I found in my first example.
Again, the output files your program generated must be fully OK, because I checked all of them and found no mistakes.
---
But it would be better to have the 838 games Houdini found in *.pgn format too.
Also, two output files should be deleted, as I wrote before.
With criteria.txt it should be possible to set the settings for different engines.
The topic is very complicated. Tomorrow is also a good day to think about it. I have had enough for today.
It seems that my download file after the Shredder analysis is fully OK.
For Houdini (1) the *.pgn is beta_v03.pgn with 81963 games (with move transpositions).
For Shredder (2) the *.pgn is v01.pgn with 80626 games (with move transpositions).
For Fire (3) the *.pgn is v02.pgn with 80013 games (with move transpositions).
The *.epd file will always be the same: "beta-v03_endpos-n.epd".
---
All 10 engines analyze the same *.epd, which is good for comparing the results later with Excel or other programs, like LibreOffice. But after each engine analysis I have a corrected *.pgn file.
---
In other words ...
Houdini found 838 positions without move transpositions or 1337 positions with move transpositions.
Shredder found 438 positions without move transpositions or 613 positions with move transpositions, in the same 26146 main database.
And for Fire I have to set the final result that is left after rejecting the Houdini and Shredder positions (80013) as v03.pgn in criteria.txt.
I think this should be OK!
At the end of the project, after 10 engines have analyzed the 26146 positions, I will have maybe 77000 positions with move transpositions or 24000 without move transpositions. Then the perfect database for creating opening books is ready and checked by 10 very strong engines.
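The counts quoted in this thread can be cross-checked with a few subtractions. This is only a sketch of the arithmetic; the variable names are made up for illustration, and it assumes each engine's rejected games are simply removed from the previous engine's *.pgn.

```python
# Counts with move transpositions, as quoted in the thread.
houdini_input = 81963      # games in the *.pgn used for Houdini
houdini_rejected = 1337    # found by Houdini, with move transpositions
shredder_rejected = 613    # found by Shredder, with move transpositions

# Assumption: each engine's rejected games are removed from the previous file.
shredder_input = houdini_input - houdini_rejected
fire_input = shredder_input - shredder_rejected

# Counts without move transpositions (unique end positions).
main_database = 26146
houdini_rejected_unique = 838
in_criteria_unique = main_database - houdini_rejected_unique

print(shredder_input)      # 80626
print(fire_input)          # 80013
print(in_criteria_unique)  # 25308
```

The results match the 80626 and 80013 games given for the Shredder and Fire input files, and the 25308 positions in the in-criteria *.epd.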
I am using pgn-extract to extract the pgn, and I agree with your observation: if pos1 is reached after 8 moves as the end position of one line, and another line runs to 14 moves but passes through the same pos1 after 8 moves, then pgn-extract will output both lines, because pos1 is found in both of them.
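The effect can be shown with a toy model. This is not how pgn-extract works internally; it only illustrates the difference between matching a position anywhere in a line and matching it as the end position. The position labels are made up, real keys would be EPD strings.

```python
# Toy model: each line is a list of position labels, one per move.
line_a = ["p1", "p2", "p3", "p4", "p5", "p6", "p7", "pos1"]    # ends at move 8
line_b = line_a + ["q9", "q10", "q11", "q12", "q13", "q14"]    # same first 8 moves, ends at move 14

games = {"line_a": line_a, "line_b": line_b}
target = "pos1"

# Matching the position anywhere in the line: both lines are reported.
match_anywhere = [name for name, positions in games.items() if target in positions]

# Matching only the final position: only the 8-move line is reported.
match_end_only = [name for name, positions in games.items() if positions[-1] == target]

print(match_anywhere)   # ['line_a', 'line_b']
print(match_end_only)   # ['line_a']
```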
The solution is to keep from the refpgn only the games whose end position is in the epd.
I attempted to create a script that extracts only the pgn games whose end position matches the epd, but it takes a lot of time if the refpgn is big and the filtered epd is big too.
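One way such a filter might be kept fast is to put the epd positions into a set, so each game costs one lookup instead of a scan over the whole epd file. The sketch below is an assumption, not the script mentioned above: `epd_key` and `filter_by_end_position` are hypothetical names, and the end position of each game is assumed to be computed beforehand by a chess library (not shown).

```python
def epd_key(epd_line):
    """Return the four position fields of an EPD line
    (piece placement, side to move, castling, en passant),
    dropping any opcodes that follow."""
    return " ".join(epd_line.split()[:4])


def filter_by_end_position(games, epd_lines):
    """Keep the games whose end position appears in the EPD list.

    games: iterable of (pgn_text, end_position_epd) pairs. The end
    position is assumed to be precomputed by a chess library.
    """
    # Build the lookup set once; membership tests are then O(1) on average.
    wanted = {epd_key(line) for line in epd_lines if line.strip()}
    return [pgn for pgn, end_epd in games if epd_key(end_epd) in wanted]


# Usage with two made-up games and a one-line EPD list.
epd_lines = ["rnbqkbnr/pppppppp/8/8/4P3/8/PPPPPPPP/RNBQKBNR b KQkq - ce 20;"]
games = [
    ("1. e4 *", "rnbqkbnr/pppppppp/8/8/4P3/8/PPPPPPPP/RNBQKBNR b KQkq -"),
    ("1. d4 *", "rnbqkbnr/pppppppp/8/8/3P4/8/PPPPPPPP/RNBQKBNR b KQkq -"),
]
kept = filter_by_end_position(games, epd_lines)
print(kept)  # ['1. e4 *']
```

With the set built once, the run time grows with the number of games plus the number of epd lines, rather than with their product.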