looking for a complex PGN file for testing PGN parser


bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: looking for a complex PGN file for testing PGN parser

Post by bob »

trojanfoe wrote:
bob wrote:
casaschi wrote:Hello.

I'm looking for a complex PGN file for testing a PGN parser. Something with complex variations and a variety of test cases.

I thought something like that should be already available, but I could not find anything from google.

Anyone with a link to such a test PGN file?

Thanks in advance.
The "enormous.pgn" file on my ftp box is daunting. In one game, there are comments nested 17 levels deep. If that doesn't break your parser, it will probably work for almost everything. You will also find all the usual nonsense: castling written with the wrong characters (0-0 with zeros instead of O-O with the capital letter O), moves like 1.e4 with no space after the move number, etc...
That is actually a very useful file Bob - I found several bugs in my PGN scanner/parser by testing against it. Many thanks!

-A
as did I. Dann Corbit created the file years ago...
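For what it's worth, the two token-level glitches mentioned above (castling typed with zeros, and move numbers glued to the move) can be repaired in a pre-pass before tokenizing. This is a minimal Python sketch, not taken from any program discussed in this thread. `normalize_movetext` is a hypothetical helper, and it assumes comments have already been removed, since it would otherwise rewrite text inside them too.

```python
import re


def normalize_movetext(text: str) -> str:
    """Repair two common PGN glitches before tokenizing (illustrative only)."""
    # Castling written with digit zero: 0-0-0 / 0-0 -> O-O-O / O-O.
    # The long form is handled first so it is not half-converted.
    text = re.sub(r"\b0-0-0\b", "O-O-O", text)
    text = re.sub(r"\b0-0\b", "O-O", text)
    # Move number glued to the move: "1.e4" -> "1. e4", "1...e5" -> "1... e5".
    text = re.sub(r"(\d+\.(?:\.\.)?)(?=[A-Za-z])", r"\1 ", text)
    return text


print(normalize_movetext("1.e4 e5 2.Nf3 Nc6 3.Bb5 a6 4.Ba4 Nf6 5.0-0"))
```

Results such as 1-0 and 0-1 are left alone because they never contain the sequence 0-0.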
casaschi
Posts: 164
Joined: Wed Dec 23, 2009 1:57 pm

Re: looking for a complex PGN file for testing PGN parser

Post by casaschi »

bob wrote:
trojanfoe wrote: That is actually a very useful file Bob - I found several bugs in my PGN scanner/parser by testing against it. Many thanks!
as did I. Dann Corbit created the file years ago...
Thanks to all that replied.

I was hoping someone had already put together a manually crafted list of things likely to break a PGN parser.

So far I was surprised by empty variations, like

Code:

1. e4 e6 () 2. d4 d5
and by variations before the mainline, like

Code:

(1. d4 d5) 1. e4 e6 2. d4 d5
Both illegal, I suppose, but you want your parser to cope as much as possible.
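To make "cope as much as possible" concrete, here is a rough Python sketch of a tolerant variation reader that accepts both inputs above and reports them as warnings instead of failing. It only illustrates the idea: `parse_variations` and the warning strings are made up for this example, comments and NAGs are assumed to be stripped beforehand, and moves are kept as plain strings without any legality check.

```python
import re

TOKEN = re.compile(r"[()]|[^\s()]+")   # parentheses, or runs of other characters
MOVE_NUMBER = re.compile(r"\d+\.+")    # "1." or "1..."


def parse_variations(movetext):
    """Return (line, warnings); each variation becomes a nested list."""
    root, warnings = [], []
    stack = [root]
    for tok in TOKEN.findall(movetext):
        if tok == "(":
            # A variation should follow the move it replaces.
            if not any(isinstance(item, str) for item in stack[-1]):
                warnings.append("variation before any move")
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == ")":
            if len(stack) == 1:
                warnings.append("unmatched ')'")
                continue
            closed = stack.pop()
            if not closed:
                warnings.append("empty variation")
                stack[-1].pop()  # discard the empty list from its parent
        elif not MOVE_NUMBER.fullmatch(tok):
            stack[-1].append(tok)
    if len(stack) > 1:
        warnings.append("unclosed variation")
    return root, warnings


print(parse_variations("1. e4 e6 () 2. d4 d5"))
print(parse_variations("(1. d4 d5) 1. e4 e6 2. d4 d5"))
```

With this sketch the first input yields the plain mainline plus an "empty variation" warning, and the second keeps the leading variation as a nested list and warns that it came before any move.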

Unfortunately, brute-force testing against enormous.pgn does not work for me: my PGN parser is in JavaScript, and even with the best browser it's probably not a good idea to try a gigabyte of PGN :-)
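On the size problem: a parser does not have to hold the whole file in memory if the input is split into games lazily. Below is a rough Python sketch of that idea (the parser discussed here is JavaScript, so this only illustrates the approach). `iter_games` is a made-up name, and the split rule (a tag line that follows movetext starts a new game) is an assumption that a bracketed line inside a multi-line comment would defeat.

```python
import io


def iter_games(lines):
    """Yield one game's text at a time from an iterable of lines."""
    current, seen_moves = [], False
    for line in lines:
        stripped = line.strip()
        is_tag = stripped.startswith("[") and stripped.endswith("]")
        # A tag line after movetext marks the start of the next game.
        if is_tag and seen_moves:
            yield "".join(current).strip()
            current, seen_moves = [], False
        if stripped and not is_tag:
            seen_moves = True
        current.append(line)
    last = "".join(current).strip()
    if last:
        yield last


sample = '[Event "A"]\n\n1. e4 e5 1-0\n\n[Event "B"]\n\n1. d4 d5 0-1\n'
for game in iter_games(io.StringIO(sample)):
    print(repr(game))
```

Because the function takes any iterable of lines, an open file object can be passed in place of the `io.StringIO` used here, so only one game is buffered at a time.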
Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: looking for a complex PGN file for testing PGN parser

Post by Dave_N »

My parser relies on '[' and ']' being at the beginning and end of a line; any tags in between would be merged into one header element.
I think tags need to be on separate lines, and the game text needs an empty line after the last tag. Otherwise I can imagine junk text being detected as a valid game list because one or two lines begin with '['; even a single '[' at the start of a line could cause errors such as bad games with extra-large headers.
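The line-based rule described above can be sketched in a few lines of Python. This is only an illustration of that rule, not this parser's actual code: `split_header` and the `TAG_LINE` pattern are hypothetical, and the sketch assumes one tag pair per line with the header ending at the first line that is not a tag.

```python
import re

# One tag pair per line: [Name "value"], with nothing else on the line.
TAG_LINE = re.compile(r'^\[\s*(\w+)\s+"(.*)"\s*\]$')


def split_header(game_text):
    """Return (tags, movetext) for a single game's text."""
    tags = {}
    lines = game_text.splitlines()
    i = 0
    while i < len(lines):
        match = TAG_LINE.match(lines[i].strip())
        if not match:
            break  # blank line or movetext: the header is over
        tags[match.group(1)] = match.group(2)
        i += 1
    movetext = "\n".join(lines[i:]).strip()
    return tags, movetext


tags, movetext = split_header('[Event "Tan Chin Nam"]\n[Result "0-1"]\n\n1. d4 Nf6 0-1')
print(tags, movetext)
```

Requiring the quoted value inside the brackets is what keeps a stray line that merely begins with '[' from being swallowed into the header.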

I need move numbers, I didn't realize the list of moves would be part of the format.

A "; comment" (rest-of-line comment) would be flagged as an invalid move.


Thanks for posting this. I am actually trying to increase flexibility with a prescan in my parser.

At the moment I load any junk that resembles a game and then check afterwards that the moves are legal. Perhaps suggesting that the user first clean the PGN with pgn-extract is sufficient.
Last edited by Dave_N on Wed Mar 07, 2012 9:34 am, edited 1 time in total.
Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: looking for a complex PGN file for testing PGN parser

Post by Dave_N »

edit: accidental double post.

Good place to add this ... I started with a complicated PGN created in a simple PGN editor. I made variations on moves of both colors, with sub-variations on moves of both colors ... then variations of a single move. Then I added comments. Once that worked I started working on legal moves and badly formatted SAN strings.
nkg114mc
Posts: 74
Joined: Sat Dec 18, 2010 5:19 pm
Location: Tianjin, China
Full name: Chao M.

Re: looking for a complex PGN file for testing PGN parser

Post by nkg114mc »

Hi, you can try this huge game, which always gets my parser into trouble.

I'm sure it is correct. This game can be parsed by most stable PGN readers, such as the reader in Arena.

Try this, and good luck :wink: ~

Code:

[Event "Tan Chin Nam"]
[Site "Beijing"]
[Date "1998.06.13"]
[Round "6"]
[White "Zhu Chen"]
[Black "Kudrin, Sergey"]
[Result "0-1"]
[Annotator "Stohl"]
[BlackElo "2565"]
[ECO "A48"]
[EventDate "1998.06.08"]
[PlyCount "142"]
[Source "ChessBase"]
[SourceDate "1998.09.30"]
[WhiteElo "2490"]

1. d4 { Hecht}  1... Nf6 2. Nf3 g6 3. Bg5 Bg7 4. c3 O-O 5. Nbd2 d6 6. e4 c5 7.
dxc5 dxc5 8. Bc4 Qc7 ( 8... Nc6 9. O-O ( 9. h3 b6 ( 9... Qc7 $5 { -8...Qc7}  )
10. O-O Bb7 11. Qe2 Nd7 12. Rad1 Qc7 13. Bd5 Nf6 14. Bb3 h6 15. Be3 Rad8 16.
Nh4 Na5 17. Bc2 e5 $6 18. Rfe1 Nh5 19. Nhf3 a6 20. Nc4 Nc6 21. a4 Rfe8 22. Rxd8
$1 Rxd8 23. Rd1 $36 { ><b6,e5,Miles,A-Blocker,C/Philadelphia op/1987/ A good
example of what Black should avoid in this line.}  ) 9... Qc7 10. Re1 h6 11.
Bh4 Nh5 12. Qe2 g5 13. Bg3 Nxg3 14. hxg3 $11 { 1/2,Tyomkin,D-Kudrin,S/North Bay
op/1998/}  ) 9. Qe2 Nc6 10. h3 $6 { White tries to preserve her Bg5 and avois
lines as in the game quoted above. But this costs time and gives Black tactical
chances, which Kudrin nicely exploits.}  ( 10. O-O $142 h6 ( 10... Na5 11. Bd3
h6 12. Bh4 Nh5 13. Qe3 Rd8 14. Bc2 g5 15. Bg3 Nxg3 16. fxg3 $5 Be6 17. e5 Qc6
18. Ne4 Nc4 19. Qxc5 Qxc5+ 20. Nxc5 Bd5 21. Bb3 Rac8 22. Nd3 Ne3 23. Bxd5 Nxd5
24. Rfd1 Ne3 25. Rd2 Nc4 26. Rdd1 g4 27. Nd4 e6 28. Re1 Rd5 29. Re4 $14 {
Zilberman,N-Yurtaev,L/Frunze/1989/}  ) 11. Bxf6 $5 ( 11. Bh4 Nh5 12. Qe3 g5 $5
( 12... b6 13. Rfe1 g5 14. Bxg5 hxg5 15. Qxg5 Nf4 $1 ( 15... Nf6 16. e5 Ng4 17.
Bd5 Rb8 18. e6 f5 19. Nf1 Nf6 20. Ng3 Nh7 21. Qg6 Kh8 22. Nxf5 Rg8 23. Ng5 {
1-0,Speelman,J-Howell,J/Calcutta op/1996/}  ) 16. e5 Ne6 17. Qg3 $2 ( 17. Qh5
Nf4 $11 ) 17... Rd8 18. Re4 b5 19. Bxb5 Nxe5 20. Nxe5 Rxd2 21. Bc4 Qd6 22. h3
Bb7 23. Rg4 Bd5 24. Qe3 Bxc4 25. Nxc4 Rd1+ $17 { Kanstler,B-Peker,O/Petach
Tikva/1996/}  ) 13. Bg3 Nxg3 14. hxg3 b6 15. Rfe1 Bb7 16. Qe2 Rad8 17. Nf1 Ne5
18. Nxe5 Qxe5 19. Rad1 Qc7 20. Nh2 Rxd1 21. Rxd1 Rd8 22. Rxd8+ Qxd8 23. Qd3
Qxd3 24. Bxd3 e6 25. Kf1 Kf8 26. Ke2 Ke7 $15 { Prakash,G-Saravanan,V/Goodricke
Calcutta/1997/}  ) 11... exf6 ( 11... Bxf6 12. Qe3 Ne5 13. Nxe5 Bxe5 14. Qxh6
b5 $1 15. Bd5 Bg7 16. Qe3 Rb8 17. a4 e6 18. Bb3 c4 19. Bd1 b4 20. Be2 bxc3 21.
bxc3 Ba6 22. Rfb1 Rfd8 23. Bf1 Rbc8 $14 { <=>,Delemarre,J-Tseitlin,M/Groningen
op/1997/}  ) 12. Nh4 Ne5 13. Bb3 Rd8 14. f4 Bg4 15. Qe3 Rd3 16. Qf2 Rad8 17.
Bd5 R3xd5 18. exd5 Nd3 19. Qe3 Rxd5 20. h3 Bd7 21. c4 f5 22. Nhf3 Rd6 23. Ne5
Nxb2 24. Ndf3 Be6 25. Qb3 Rb6 26. Qc2 Rb4 27. a3 Rxc4 28. Nxc4 Nxc4 29. Rae1
Qxf4 $13 { Stefanova,A-Makropoulou,M/Balkaniad Varna/1994/}  ) 10... h6 11. Be3
( 11. Bxf6 exf6 { /\f5}  ( 11... Bxf6 12. Qe3 Na5 13. Qxh6 $14 ) 12. Nh4 Ne5
13. Bd5 f5 14. f4 Bf6 $36 { ><Ke1}  ) 11... Nh5 $1 { Starting the
aforementioned sacrificial sequence.}  12. Nh2 $6 { Another strange move.}  (
12. Bxc5 $6 Nf4 13. Qf1 Na5 14. Be3 Nxg2+ 15. Qxg2 Nxc4 $17 ) ( 12. g3 $5 b6
$11 ( 12... Na5 13. Bd3 Be6 14. e5 $1 $36 { /\g4}  ) ) 12... b5 $1 ( 12... Nd4
13. Qd3 b5 14. Bd5 ( 14. cxd4 cxd4 15. Bd5 ( 15. Bxd4 bxc4 16. Qc3 Bxd4 17.
Qxd4 Nf4 $15 ) 15... dxe3 16. Bxa8 Nf4 $1 $40 ) 14... c4 15. Qb1 Nc6 16. a4 $13
) 13. Bxb5 $8 Nd4 $1 14. cxd4 { After other moves Black's compensation more
than outweighs his material disadvantage.}  ( 14. Qd3 Nxb5 15. Qxb5 Rb8 16.
Qxc5 Qxc5 17. Bxc5 Rxb2 18. Ba3 ( 18. Bd4 Bxd4 19. cxd4 Ba6 $40 ) 18... Rc2 19.
Kd1 Rxc3 20. Bxe7 Re8 $44 ) ( 14. Bxd4 cxd4 $40 ) 14... cxd4 15. O-O ( 15. Bxd4
Bxd4 $40 { /\}  16. O-O Ng3 $19 ) 15... dxe3 16. Qxe3 Rb8 $17 17. Rac1 ( 17.
Bc4 Rxb2 18. Rac1 Qf4 $36 ) 17... Qf4 18. Qxf4 $8 Nxf4 19. Bc4 Rxb2 20. Nhf3
Be6 21. a3 Rc8 22. g3 $5 { It's not easy to recommend anything else.}  ( 22.
Bxe6 $2 Rxc1 23. Rxc1 Ne2+ $19 ) 22... Nxh3+ 23. Kg2 Bd7 $1 24. e5 Ng5 25. Nxg5
hxg5 26. Ne4 Bxe5 ( 26... Bc6 { runs into}  27. Bd5 $1 Rb6 28. Rc5 e6 29. Bxc6
Rbxc6 30. Ra5 $15 { <=>}  ) 27. Nxg5 e6 28. Rfd1 Rc7 $2 { After this Black has
to win the game all over again.}  ( 28... Bc6+ 29. Kg1 Kg7 $5 $19 { /\Rh8->}  (
29... Bd5 30. Bxd5 Rxc1 31. Rxc1 exd5 $17 ) ) 29. Bxe6 $1 Bc6+ ( 29... Rxc1 30.
Bxf7+ Kg7 31. Rxc1 Bd4 ( 31... Bf6 $6 32. Rc7 $14 ) 32. Ne6+ Bxe6 33. Bxe6 $11
) ( 29... Bxe6 $2 30. Rd8+ Kg7 31. Nxe6+ fxe6 32. Rxc7+ Bxc7 33. Rd7+ $16 { and
suddenly White will be a pawn up.}  ) 30. Kg1 Rb6 ( 30... fxe6 $2 31. Rd8+ Kg7
32. Nxe6+ $18 ) 31. Bd5 ( 31. Ba2 $143 Bb2 $36 ) 31... Bxd5 32. Rxc7 Bxc7 33.
Rxd5 Rd6 34. Rc5 $2 { Difficult to understand. Black's edge is only nominal
after}  ( 34. Rxd6 Bxd6 35. a4 Kf8 36. Kf1 Ke7 37. Ke2 $11 { White should be
able to draw the endgame easily. Now she gets into trouble again.}  ) 34... Bb6
35. Rc4 ( 35. Rc3 Rd2 36. Nh3 ( 36. Ne4 Re2 37. Rc4 f5 $19 ) 36... Kf8 $17 )
35... Rd3 { /\Ra3,Rg3}  36. Ne4 Rxa3 { The _|_ play of both sides is not devoid
of technical faults, with still more to come.}  ( 36... f5 $5 $142 37. Nc5 Rxa3
$19 ) 37. g4 $5 Rd3 38. g5 $17 { White has created potential for <=> against
Black's K and ><f7.}  38... Kf8 39. Rc8+ Rd8 40. Rc6 Ke7 41. Kg2 Rd5 ( 41...
Rb8 { /\a5}  42. Nf6 Bd4 $17 ) 42. f4 Rd3 43. Nf6 Rd6 44. Rc8 Ke6 45. Ne4 Rd3
$6 { In the R_|_ White increases her drawing chances.}  ( 45... Rd1 $5 $17 $142
{ /\}  46. Kf3 Rf1+ $1 ) 46. Nc5+ $1 $15 Bxc5 47. Rxc5 Ra3 ( 47... Rd5 48. Rc6+
Rd6 49. Rc7 a6 50. Ra7 { <=>><f7}  ) 48. Rc6+ Ke7 49. Rc7+ Ke6 50. Rc6+ Kd7 51.
Rf6 Ke7 52. Kf2 ( 52. f5 $2 Ra5 $19 ) 52... a5 53. Ra6 $6 ( 53. f5 gxf5 54.
Rxf5 a4 55. Ra5 $11 { should be sufficient for a draw. Now White risks losing
again.}  ) 53... a4 { # Hecht: Nach einigem hin und her ist nun also der weisse
Turm hinter den a-Freibauern gelangt, den der schwarze Turm von v o r n
verteidigt. Was passiert, wenn Weiss mit Kg2 abwartet und den Turm auf a6
belaesst? a )  Schwarz rueckt den Bauern nach a2 vor und greift mit dem Koenig
f4 an; b )  Schwarz beorderrt sofort den Koenig zum Damenfluegel und gibt
zumindest f7 auf.}  54. Ra7+ $2 ( 54. Kg2 { Hecht Analyse zu a )}  54... Ra1
55. Kh2 a3 56. Kg2 ( 56. Ra7+ Ke6 57. Ra6+ Kf5 ( 57... Kd5 { ist auch gut. So
oder so machen die weissen Schachs keinen Sinn.}  ) 58. Rf6+ Kg4 59. Rxf7 Rb1
60. Ra7 Rb2+ 61. Kg1 a2 $19 ) 56... a2 ( 56... Kd7 { Hecht}  57. Kh2 Kc7 58.
Rf6 Kb7 59. Rxf7+ Kb6 60. Rf8 Kb5 61. f5 $132 { Allerdings kann Schwarz
rechtzeitig zum zuerst skizzierten Gewinnplan zurueckkehren.}  ) 57. Kh2 Kd7
58. Kg2 Kc7 59. Kh2 Kb7 60. Ra3 Kb6 61. Kg2 Kb5 62. Ra8 Kb4 63. Rb8+ Kc4 64.
Ra8 Kd4 65. Ra7 Ke4 66. Ra4+ Kf5 $1 { Zugzwang?}  ( 66... Ke3 { Zugzwang?}  67.
Ra7 { Nein!}  67... Kxf4 ( 67... Re1 68. Rxa2 Re2+ 69. Rxe2+ Kxe2 70. Kg3 Ke3
71. Kg4 Kf2 72. f5 Ke3 73. fxg6 fxg6 74. Kg3 $11 ) ( 67... Rd1 68. Rxa2 Rd7 (
68... Rd4 69. Kg3 $11 Rxf4 $4 70. Ra3+ Ke4 71. Ra4+ ) 69. Kg3 $11 ) 68. Rxf7+
Kxg5 69. Ra7 $11 { Der g-Bauer ist nutzlos.}  ) 67. Ra7 { Nein,aber...}  67...
Kg4 $1 { Mit dieser gewaltigen Koenigsposition kann Schwarz leichten Herzens
den a-Bauern aufgeben und sich die weissen Bauern einverleiben.}  68. Ra4 Re1
69. Rxa2 Re4 70. Ra3 Rxf4 71. Rg3+ Kh5 72. Kh3 Rh4+ 73. Kg2 Rg4 $19 ) 54... Ke6
55. Ra6+ ( 55. Ke2 Ra1 56. Ra6+ Kf5 57. Rf6+ Ke4 58. Rxf7 a3 $19 ) ( 55. Kg2 $5
$17 { /\}  55... Ra1 56. Ra6+ Kf5 57. Rf6+ Ke4 58. Rxf7 a3 59. f5 $1 a2 60. Ra7
Kxf5 61. Ra5+ $11 { draws even without the Pg5.}  ) 55... Kf5 56. Rf6+ Kg4 57.
Rxf7 Rf3+ $19 58. Kg2 Rxf4 59. Ra7 Kxg5 { Now with two extra pawns and an
active R the _|_ is trivially won. # Hecht: Vor dem Hintergrund der o.a.
Analysen ist die weisse Aktion zwar verstaendlich, fuehrt aber lediglich zu
einer anderen Verluststellung.}  60. Kg3 Rb4 61. Ra5+ Kh6 62. Ra8 Kh5 ( 62...
g5 $19 { /\g4,K-<<}  ) 63. Ra7 Rb3+ 64. Kg2 a3 65. Ra5+ g5 66. Ra4 g4 67. Ra8
Rc3 68. Kf2 Kg5 69. Ra7 Kf5 70. Ra4 g3+ 71. Kg2 Ke5 { Black's K just walks over
to the o^a3. Hecht: Der Koenig wandert in aller Ruhe zum a-Bauern, waehrend der
weisse abgeschnitten ist. Exitus!}   0-1

trojanfoe
Posts: 65
Joined: Sun Jul 31, 2011 11:57 am
Location: Waterlooville, Hampshire, UK

Re: looking for a complex PGN file for testing PGN parser

Post by trojanfoe »

nkg114mc wrote:Hi, you can try this huge game, which always gets my parser into trouble.

I'm sure it is correct. This game can be parsed by most stable PGN readers, such as the reader in Arena.

Try this, and good luck :wink: ~
Seems to parse OK for me. What bit does your parser not like?
hgm
Posts: 27829
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: looking for a complex PGN file for testing PGN parser

Post by hgm »

What goes wrong in WinBoard is that the very long comment before the 0-1 result is taken as the "result details", because a PGN result is usually accompanied by something like "{black resigns}" or "{draw agreed}". I guess I should add a filter there to weed out comments that are too long for that purpose.
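The kind of filter meant here could look roughly like the following Python sketch. It is not WinBoard code: `result_details`, the regular expression, and the 40-character cutoff are all assumptions chosen for illustration.

```python
import re

# A brace comment immediately followed by the game-termination marker.
RESULT_WITH_COMMENT = re.compile(r"\{([^{}]*)\}\s*(1-0|0-1|1/2-1/2|\*)\s*$")


def result_details(movetext, max_len=40):
    """Return the final comment if it is short enough to be a result note."""
    match = RESULT_WITH_COMMENT.search(movetext)
    if not match:
        return None
    detail = " ".join(match.group(1).split())  # collapse line breaks
    # Long comments are treated as annotation, not as result details.
    return detail if len(detail) <= max_len else None


print(result_details("40. Qh7# {White mates} 1-0"))
print(result_details("71. Kg2 Ke5 {" + "x" * 200 + "} 0-1"))
```

A short note such as "White mates" is returned as the detail, while a long annotation like the closing comment of the game above gives `None` and would stay an ordinary comment.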
Dave_N
Posts: 153
Joined: Fri Sep 30, 2011 7:48 am

Re: looking for a complex PGN file for testing PGN parser

Post by Dave_N »

The game tree loads fine; it's a pretty good example though. I also tried exporting a merged PGN from a large list of games as a big test.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: looking for a complex PGN file for testing PGN parser

Post by bob »

hgm wrote:That is actually faulty PGN. According to the standard, braces do not nest, and the first } closes the comment, even if there were 20 unclosed { before it. When you write

1. e4 { e5 { 2. Nf3 Nf6 } e6 } 2. c4

then e6 is part of the game, } is an unexpected garbage character.
I understand, in theory. But in reality, if one doesn't handle that, then there are a few PGN games that will cause problems. At one point, ChessBase was a guilty party to producing such PGN very early on, supposedly. At least that was the claimed source for the first such game I saw.

I think from an esthetic perspective, matching { and } characters seems natural, so that a comment that has two opening { characters but only one closing } looks wrong superficially...
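The two readings can be put side by side in a small Python sketch. It is only an illustration and not code from any program in this thread: `strip_comments` and its `nested` flag are hypothetical, with `nested=False` following the rule quoted above (the first } closes the comment) and `nested=True` being the tolerant behavior that counts matching braces.

```python
def strip_comments(movetext, nested=False):
    """Remove brace comments; `nested` selects the tolerant reading."""
    out, depth = [], 0
    for ch in movetext:
        if ch == "{":
            # Strict reading: a "{" inside a comment is ordinary comment text.
            if depth == 0 or nested:
                depth += 1
        elif ch == "}" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)  # outside any comment, keep the character
    return " ".join("".join(out).split())


example = "1. e4 { e5 { 2. Nf3 Nf6 } e6 } 2. c4"
print(strip_comments(example))               # 1. e4 e6 } 2. c4
print(strip_comments(example, nested=True))  # 1. e4 2. c4
```

In the strict reading e6 lands in the game and the stray } is left over as garbage, exactly as described in the quoted post; the tolerant reading swallows the whole span.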
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: looking for a complex PGN file for testing PGN parser

Post by bob »

casaschi wrote:
bob wrote:
trojanfoe wrote: That is actually a very useful file Bob - I found several bugs in my PGN scanner/parser by testing against it. Many thanks!
as did I. Dann Corbit created the file years ago...
Thanks to all that replied.

I was hoping someone had already put together a manually crafted list of things likely to break a PGN parser.

So far I was surprised by empty variations, like

Code:

1. e4 e6 () 2. d4 d5
and by variations before the mainline, like

Code:

(1. d4 d5) 1. e4 e6 2. d4 d5
Both illegal, I suppose, but you want your parser to cope as much as possible.

Unfortunately, brute-force testing against enormous.pgn does not work for me: my PGN parser is in JavaScript, and even with the best browser it's probably not a good idea to try a gigabyte of PGN :-)
You might be surprised. I can create a book from that file in way less than a minute... It is a rugged test for sure.