I'm in the process of creating my own "perfect" book for the Arena and CB GUIs. I started with a collection of opening lines in PGN format and used Pgn-Extract to clean the file and remove duplicate lines. Pgn-Extract only removes lines with exactly the same moves, though. Many lines still reach the same final position through a different move order.
How do I find and remove the lines that reach the same final position through a different move order?
Thanks
Dave
Duplicate positions??
Re: Duplicate positions??
Hi Dave,
First you need PgnScanner 0.75 by Gabriel Guillory (http://transversale.fr/pgnscanner/pgnscanner_eng.htm). Type:
verbose on
open Perfect.pgn
dbl -ply=99 -occ=1 -out=doubles.pgn
exit
This detects partial or full doubles up to a given ply. Doubles that arise by transposition are included, so the flagged games need not be move-for-move identical: the order of the moves can differ. A string such as "there are X other doubles until ply=Y" is added to the "Annotator" PGN tag of the selected games.
Now you can use pgn-extract to build clean.pgn. Type:
pgn-extract -llogfile.txt -D -oclean.pgn perfect.pgn doubles.pgn
Ciao,
Salvo
-
- Posts: 900
- Joined: Wed Mar 08, 2006 9:06 pm
Re: Duplicate positions??
Hi Salvo,
Thank you very much. I already have PgnScanner 0.75. I'll try your suggestion shortly.
Regards
Dave
Re: Duplicate positions??
Hi Salvo
For some reason, this doesn't seem to work for me. I followed your instructions exactly, but zero doubles were found. I even manually copied a game in Perfect.pgn so that there were two exact copies of the same game, and PgnScanner still didn't find any doubles.
Pgn-Extract will find exact doubles, but not lines that merely reach the same final position, such as this simple example:
1. e4 e5 2. Nf3 Nc6
1. Nf3 Nc6 2. e4 e5
Different moves, but the same final position. I know there are many such dupes in Perfect.pgn. I suppose it wouldn't hurt to have these duplicate-position lines in my book; it would just make the book unnecessarily large. After all, a "Perfect" opening book needs to be perfect.
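To make the example concrete, here is a rough Python sketch of one way to flag such lines as candidates. It is not how PgnScanner, Pgn-Extract or SCID work internally. It is only a heuristic: it does not replay the moves on a board, it just compares the sorted moves of each side, so two lines with the same moves in a different order get the same signature. The function names (`san_moves`, `signature`, `group_candidates`) and the `max_ply` parameter are made up for illustration. Matches should be treated as candidates to check by hand, since the same SAN text can occasionally describe different moves, and transpositions that use different moves are missed.

```python
import re
from collections import defaultdict

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}

def san_moves(movetext):
    """Return the SAN moves of one line, without numbers, results or comments."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)      # drop {comments}
    moves = []
    for tok in text.split():
        tok = re.sub(r"^\d+\.+", "", tok)           # "1." -> "", "1.e4" -> "e4"
        if not tok or tok in RESULTS or tok.startswith("$"):
            continue                                # skip results and NAGs
        moves.append(tok.rstrip("+#!?"))            # ignore check marks and !/? suffixes
    return moves

def signature(movetext, max_ply=None):
    """Order-independent key: sorted White moves and sorted Black moves.

    Heuristic only (assumption): equal keys suggest, but do not prove,
    that two lines reach the same final position.
    """
    moves = san_moves(movetext)[:max_ply]
    return tuple(sorted(moves[0::2])), tuple(sorted(moves[1::2]))

def group_candidates(lines, max_ply=None):
    """Group lines that share a signature; return only groups with 2+ lines."""
    groups = defaultdict(list)
    for line in lines:
        groups[signature(line, max_ply)].append(line)
    return [group for group in groups.values() if len(group) > 1]
```

Calling `group_candidates` on a list holding the two lines above plus an unrelated line should put the two lines into one group and leave the unrelated line out. An exact check would need a real board replay to compare final positions.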
Regards
Dave
Re: Duplicate positions??
Hi Dave,
If the games are in this form:
[Event "?"]
[Site "?"]
[Date "2007.06.11"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]
1. e4 e5 2. Nf3 Nc6 1/2-1/2
[Event "?"]
[Site "?"]
[Date "2007.06.11"]
[White "?"]
[Black "?"]
[Result "1/2-1/2"]
1. Nf3 Nc6 2. e4 e5 1/2-1/2
You can use SCID:
1 - File-->New-->Perfect.si3
2 - Window-->Maintenance window-->Delete twin games. Set only these options: First 4 letters only, All games in the database, shorter game. Unflag all other options.
3 - Press the Delete games button-->Press OK-->Press Close
4 - Window-->Maintenance window-->Compact database-->compact game file-->Press OK
5 - Close SCID
6 - Open the newly created Perfect.si3
7 - Tools-->Export all filter games
You now have the clean perfect.pgn.
Ciao,
Salvo
Re: Duplicate positions??
Hi Salvo.
I don't currently have Scid, but I'll download it and try your suggestion.
Thanks
Dave