Stripping a pgn

MikeGL
Posts: 1010
Joined: Thu Sep 01, 2011 2:49 pm

Re: Stripping a pgn

Post by MikeGL »

Ozymandias wrote:
Lyudmil Tsvetkov wrote:Anyone knowing of a trick to strip the pgn to just the moves played for both sides?
Even better, if this could be done with the whole pgn file.
Use pgn-extract, this is the command line:

Code:

pgn-extract -s -C -N -V -oclean.pgn input.pgn
This will clean the pgn of variations, comments and symbols. If you want to see what it does, remove the silent option (-s).
Lyudmil Tsvetkov wrote:Btw., I am reading WikiSpaces are going to close.
What will happen with the Chess Programming Wiki, any clue?
There's a specific thread for that.
I think the command may depend on the version. I use pgn-extract version 16.7:

Code:

pgn-extract v16.7 (Jul 18 2008): a Portable Game Notation (PGN) manipulator.
Copyright (C) 1994-2007 David J. Barnes (d.j.barnes@kent.ac.uk)
http://www.cs.kent.ac.uk/people/staff/djb/pgn-extract/
And to clean a PGN file, one can redirect the output to a file using
the DOS redirection operator >> (which appends to the file; a single > overwrites it), like so:

Code:

pgn-extract -s -C -N -V ThisIsUncleanInput.pgn >> CleanedOutput.pgn
Note that the input pgn file name has no hyphen (dash, -) in front of it;
I think that's why there was an error about '-' in the OP's attempt.

Edit: sorry, you are also correct. The -o switch dictates the output file. :oops:
Also, I think I figured out how he got the error: he typed pgn-extract on its own first (which triggers INTERACTIVE mode, reading from stdin)
and then pasted another pgn-extract -s -C -N -V command while inside the program.
It should be pasted directly at the command prompt just ONCE, not twice.
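
To make the "paste it once" point concrete, here is a minimal Python sketch that hands the whole command line to pgn-extract in a single call, so nothing ever gets typed into the program's standard input. The helper names build_command and clean_pgn are made up for this illustration; the switches are the ones quoted above, and pgn-extract is assumed to be installed and on the PATH.

```python
import subprocess


def build_command(input_pgn, output_pgn):
    """Return the full pgn-extract argument list for one cleaning run.

    build_command is a hypothetical helper, not part of pgn-extract.
    """
    if not input_pgn:
        # With no input file pgn-extract reads from stdin, which is how
        # the "Single '-' not allowed" messages in this thread came about.
        raise ValueError("an input PGN file is required")
    return [
        "pgn-extract",
        "-s",               # silent
        "-C",               # no comments
        "-N",               # no NAGs (symbols)
        "-V",               # no variations
        "-o" + output_pgn,  # output file, attached directly to -o
        input_pgn,          # input file: plain name, no leading hyphen
    ]


def clean_pgn(input_pgn, output_pgn):
    """Run pgn-extract once; assumes pgn-extract is installed and on PATH."""
    subprocess.run(build_command(input_pgn, output_pgn), check=True)


if __name__ == "__main__":
    # Only prints the command; call clean_pgn(...) to actually run it.
    print(" ".join(build_command("TCEC_Season_5.pgn", "clean.pgn")))
```

Run as a script, this prints the same single line that should be pasted at the command prompt: pgn-extract -s -C -N -V -oclean.pgn TCEC_Season_5.pgn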
I told my wife that a husband is like a fine wine; he gets better with age. The next day, she locked me in the cellar.
Colin-G
Posts: 191
Joined: Mon Oct 31, 2016 6:30 pm
Location: England

Re: Stripping a pgn

Post by Colin-G »

I still use the original Scid database program by Shane Hudson, v3.5 on Linux and v3.6.1 on Windows.
Open the pgn file and then select
Tools >> Export All Filter Games >> Export Filter to PGN File...
The following dialog window appears, where you can select what items of the pgn file you want to export to a new pgn file.

Image
Eelco de Groot
Posts: 4561
Joined: Sun Mar 12, 2006 2:40 am

Re: Stripping a pgn

Post by Eelco de Groot »

Shredder can export clean PGNs too. Trial versions could do it. Only the game that is loaded in the GUI, I think, not whole tournament files. It sometimes has trouble, though, with ChessBase PGNs with very convoluted variations. But probably Lyudmil has some obscure old Arena version from the 90's :) It would have helped if you had said what GUI you are using, Lyudmil. Can't Arena do this?

(If I can't get Shredder to work, I would indeed try the excellent capabilities of SCID. I have HIARCS, a ChessBase version, on an old computer if necessary. I don't know whether the free ChessBase readers can strip PGN; probably not.)
Last edited by Eelco de Groot on Fri Feb 16, 2018 6:57 pm, edited 1 time in total.
Debugging is twice as hard as writing the code in the first
place. Therefore, if you write the code as cleverly as possible, you
are, by definition, not smart enough to debug it.
-- Brian W. Kernighan
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Re: Stripping a pgn

Post by Lyudmil Tsvetkov »

Lyudmil Tsvetkov wrote:Where should the pgn extract exe be placed, in the same folder with the files?

When I tried the above command, I got the following:

Processing stdin
pgn-extract -s -C -N -V -oclean.pgn TCEC_Season_5
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
File stdin: Line number: 1
Unknown move text pgn.
File stdin: Line number: 1
Unknown move text T.
File stdin: Line number: 1
Unknown character C (Hex: 43).
Missing result.

Any clue what is going wrong?
Ozymandias wrote:The pgn file should be in the pgn-extract folder, for simplicity (I haven't even tried to redirect to another folder). As you suspect, clean.pgn (the name given after -o) and input.pgn are the destination and source files; you can obviously change them, just don't use special characters (TCECSeason5, for example, should be fine).
Lyudmil Tsvetkov wrote:Many thanks.
It still says: Single '-' not allowed.
What could that possibly mean?
'-' is just the dash, right, not an underscore?
Ozymandias wrote:No, it's not. I thought that could be it, but then I performed a zoom and saw that wasn't the case. You're writing "TCEC_Season_5.pgn", right? Because I don't see the extension in your post.
No, I am doing everything just as the documentation says to do it, but it still does not work.
Nothing works for me around this time of year.
What could single dash not allowed mean?
Anyone tried to strip TCEC pgns, maybe there is something wrong with them?
Anywhere one can download stripped TCEC pgns?
Eelco de Groot
Posts: 4561
Joined: Sun Mar 12, 2006 2:40 am

Re: Stripping a pgn

Post by Eelco de Groot »

I think - is not a dash but a hyphen (see http://www.sussex.ac.uk/informatics/pun ... ddash/dash). Have you tried Mike Libanan's advice? I'm no hero with the command line and can never remember those commands, so I have to resort to looking it up and pasting the whole thing. Then I press "Enter".
Ozymandias
Posts: 1532
Joined: Sun Oct 25, 2009 2:30 am

Re: Stripping a pgn

Post by Ozymandias »

MikeGL wrote:I think I figured out how he got the error: he typed pgn-extract on its own first (which triggers INTERACTIVE mode)
and then pasted another pgn-extract -s -C -N -V command while inside the program.
It should be pasted directly at the command prompt just ONCE, not twice.
Confirmed, I tried that and got the following:
C:\40H\pgn-extract>pgn-extract
Processing stdin
pgn-extract -s -C -N -V -oclean.pgn TCEC_Season_5.pgn
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
Single '-' not allowed.
File stdin: Line number: 1
File stdin: Line number: 1
Unknown move text pgn.
File stdin: Line number: 1
Unknown move text T.
File stdin: Line number: 1
Unknown character C (Hex: 43).
Missing result.

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "?"]

Internal error: Zero move game with no result
Lyudmil Tsvetkov wrote:What could single dash not allowed mean?
Anyone tried to strip TCEC pgns, maybe there is something wrong with them?
Anywhere one can download stripped TCEC pgns?
There's nothing wrong with them; I cleaned that file at the first go with the command line we've been talking about. The difficult thing was to figure out how you got that error message. Just do what Mike Libanan says and you're good to go.
MikeB
Posts: 4889
Joined: Thu Mar 09, 2006 6:34 am
Location: Pen Argyl, Pennsylvania

Re: Stripping a pgn

Post by MikeB »

Lyudmil Tsvetkov wrote:
Many thanks.
It still says: Single '-' not allowed.
What could that possibly mean?
'-' is just the dash, right, not an underscore?
here you go

https://www.dropbox.com/s/iq5x8j6w1mk6d ... n.zip?dl=1

this worked for me:

Code:

pgn-extract /Users/michaelbyrne/Documents/Results/tcec1127.pgn -s -C -N -V -otcecclean.pgn
- I used the full path since the input file was in a different folder from pgn-extract.
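
As a small illustration of the full-path approach, the Python sketch below turns whatever file name was typed into an absolute path and checks that the file really exists and carries the .pgn extension before it is handed to pgn-extract. resolve_input is a hypothetical helper name, and the two checks are my own assumption about what is worth verifying, not something pgn-extract itself requires.

```python
from pathlib import Path


def resolve_input(name):
    """Return the absolute path of a PGN file, or raise if it looks wrong.

    resolve_input is a hypothetical helper for illustration only.
    """
    path = Path(name).expanduser().resolve()
    if path.suffix.lower() != ".pgn":
        # e.g. "TCEC_Season_5" typed without its extension
        raise ValueError(f"{path.name} does not end in .pgn")
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    return path


if __name__ == "__main__":
    import sys

    # Check every file name given on the command line.
    for arg in sys.argv[1:]:
        try:
            print(resolve_input(arg))
        except (ValueError, FileNotFoundError) as err:
            print(f"skipping {arg}: {err}")
```

The returned path can then be used as the input argument, wherever pgn-extract itself happens to live.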
tpoppins
Posts: 919
Joined: Tue Nov 24, 2015 9:11 pm
Location: upstate

Re: Stripping a pgn

Post by tpoppins »

Norm Pollock wrote:Simple command:

trim file.pgn

In command window.

See 40H-PGN tools on my page. Click www below.
In case you missed the post above, here it is again -- the simplest way. Norm has more than 50 command-line PGN tools like this for virtually any case in life: http://komodochess.com/pub/40H-pgn-utilities.

In Fritz GUIs you can do it, too, but only one game at a time. Load a game, right click anywhere in the notation pane and pick "Delete All Commentary" (or do Ctrl+F2).

Lastly, there's a GUI for PGN-Extract (PGN-Extract Interface by Ferdinand Mosca) if you're not handy with the command line.
Lyudmil Tsvetkov
Posts: 6052
Joined: Tue Jun 12, 2012 12:41 pm

Long Live T. Poppins and Ferdinand Mosca!

Post by Lyudmil Tsvetkov »

Thank you everybody for the suggestions.
I guess I am just in a hurry and missing something, mistyping something or the like.
It finally worked with Ferdie's tool and Tim's very precious advice.
I love those guys. :D
stevenaaus
Posts: 608
Joined: Wed Oct 13, 2010 9:44 am
Location: Australia

Re: Stripping a pgn

Post by stevenaaus »

pferd wrote:
Lyudmil Tsvetkov wrote: Anyone knowing of a trick to strip the pgn to just the moves played for both sides?
Even better, if this could be done with the whole pgn file.
Or, alternatively, does anyone know if there are stripped TCEC pgn files for download somewhere?

Thanks, hope someone helps.
Scid vs PC can strip the comments from single games via Edit->Strip->Comments
The maintenance window will strip the comments from all games, but it is very slow.
In subversion and the Windows betas, it has been sped up a bit, ~10x.
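
For anyone who would rather script it, here is a rough Python sketch of what "stripping" amounts to for the move text of a single game: drop {comments}, rest-of-line ; comments, (variations) and annotation symbols, and keep only the moves. This is not how pgn-extract or Scid vs PC do it internally; strip_movetext is a hypothetical name, and the function assumes it is given move text only (no tag pairs) with balanced braces and parentheses.

```python
import re


def strip_movetext(movetext):
    """Remove comments, variations and annotation symbols from PGN move text.

    A rough sketch only: it does not validate moves the way a real PGN
    tool does, and it must not be fed the [Tag "..."] header lines.
    """
    out = []
    depth = 0            # nesting level of ( ... ) variations
    in_comment = False   # inside a { ... } comment
    i = 0
    while i < len(movetext):
        ch = movetext[i]
        if in_comment:
            if ch == "}":
                in_comment = False
        elif ch == "{":
            in_comment = True
        elif ch == ";":
            # rest-of-line comment: jump to the end of the line
            end = movetext.find("\n", i)
            i = len(movetext) if end == -1 else end
            continue
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif depth == 0:
            out.append(ch)   # keep only text outside any variation
        i += 1
    text = "".join(out)
    text = re.sub(r"\$\d+", " ", text)           # numeric NAGs such as $1
    text = re.sub(r"[!?]+", "", text)            # suffix symbols: !, ?, !?, ??
    text = re.sub(r"\b\d+\.\.\.\s*", "", text)   # "12..." left over after a cut
    return " ".join(text.split())


if __name__ == "__main__":
    game = "1. e4 {best by test} e5 (1... c5 2. Nf3) 2. Nf3 $1 Nc6!? 1-0"
    print(strip_movetext(game))
```

For the sample game this prints 1. e4 e5 2. Nf3 Nc6 1-0. For whole files a dedicated tool is still the better choice, since it also checks that the moves are legal.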