WTB PGN parser test suite

markboylan
Posts: 4242
Joined: Wed Mar 08, 2006 9:48 pm
Location: The Twilight Zone

WTB PGN parser test suite

Post by markboylan »

Hi everyone!

I'm trying to make my parser as forgiving as possible while still being correct (and fast). So I'm looking for big PGN files with lots of poorly recorded games, preferably generated by an assortment of different programs (or people).

If you have any large-ish (the bigger the better) PGN files that push the correctness envelope of the PGN input specification, or know where I can find any, please let me know.

Thanks!
There's a fine line between a post and a signature.
Rémi Coulom
Posts: 438
Joined: Mon Apr 24, 2006 8:06 pm

Re: WTB PGN parser test suite

Post by Rémi Coulom »

This is a conforming game record that I wrote to test my own parser:

[Event "This is an event with a \""]
[Site "This site has \\\\ xxx"]
[Date "1996.08.15"][Round "-"][White "White"][Black "Black"]
[Result "1/2-1/2"][SetUp "1"];I love comments
[FEN "1nbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/1NBQKBNR w KQkq - "]e4(d4;comment
d5!!2.c4$50(Nf3?))e5 Nf3{Comment !}Nc6 Nc3 Nf6 Bc4 Bc5 O-O O-O
% This is a line with an escape character !
@@@æææ {unexpected characters are skipped}
<> {These symbols are reserved}
1/2-1/2
Rémi
jdart
Posts: 4367
Joined: Fri Mar 10, 2006 5:23 am
Location: http://www.arasanchess.org

Re: WTB PGN parser test suite

Post by jdart »

Unfortunately there is a plethora of badly recorded and/or mangled game data. You could start with

ftp://ftp.pitt.edu/group/student-activities/chess/PGN/

or there is Dann Corbit's huge PGN collection at:

http://cap.connx.com/

although I think most of those are not as bad.

If I remember right, Rybka Aquarium has an interesting take on PGN in its database, most notably a non-standard date format.

--Jon
Dann Corbit
Posts: 12542
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: WTB PGN parser test suite

Post by Dann Corbit »

Almost for sure, there's something in here to bust your parser:
http://cap.connx.com/chess-engines/new- ... ns.pgn.bz2

Everything in the file is also valid, so you should be able to read it without any errors.
markboylan
Posts: 4242
Joined: Wed Mar 08, 2006 9:48 pm
Location: The Twilight Zone

Re: WTB PGN parser test suite

Post by markboylan »

Thanks guys! These files helped.

Looks like I need to accept escaped quotes inside tag strings :p
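
For anyone following along, here is a rough sketch of reading a PGN string token with the backslash escapes that Rémi's Event and Site tags exercise (`\"` for a quote, `\\` for a backslash). The function name `read_pgn_string` and the choice to keep a stray backslash literally are my own assumptions, not anything taken from an existing parser.

```python
def read_pgn_string(text, start=0):
    """Read a PGN string token that begins at text[start].

    Returns (value, index_just_past_the_closing_quote).
    Hypothetical helper for illustration only.
    """
    if start >= len(text) or text[start] != '"':
        raise ValueError("expected opening quote")
    out = []
    i = start + 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in '"\\':
            # Escaped quote or escaped backslash: keep the escaped character.
            out.append(text[i + 1])
            i += 2
        elif ch == '"':
            # Unescaped quote closes the token.
            return "".join(out), i + 1
        else:
            # Forgiving choice (an assumption): a lone backslash is kept as-is.
            out.append(ch)
            i += 1
    raise ValueError("unterminated string")


if __name__ == "__main__":
    value, end = read_pgn_string(r'"This is an event with a \""')
    print(repr(value), end)
```

Scanning character by character, rather than searching for the next quote, is what lets the escaped quote in the Event tag pass through without ending the token early.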
There's a fine line between a post and a signature.
jshriver
Posts: 1342
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Re: WTB PGN parser test suite

Post by jshriver »

Does anyone know what the rules are concerning Unicode in PGN?

I noticed while looking at Dann's annotations.pgn that it has some Unicode characters in it.
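
As far as I can tell, the PGN standard itself specifies the 8-bit ISO 8859-1 (Latin-1) character set, so anything else is technically outside the spec. For a forgiving reader, one possible approach is to try UTF-8 first and fall back to Latin-1. The sketch below assumes the file is one of those two encodings, and `decode_pgn_bytes` is just a name I made up.

```python
def decode_pgn_bytes(raw: bytes) -> str:
    """Decode raw PGN bytes, preferring UTF-8 and falling back to Latin-1.

    Hypothetical helper; assumes the input is either UTF-8 or ISO 8859-1.
    """
    # Drop a UTF-8 byte order mark if some editor added one.
    if raw.startswith(b"\xef\xbb\xbf"):
        raw = raw[3:]
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a character, so this cannot fail.
        return raw.decode("latin-1")


if __name__ == "__main__":
    print(decode_pgn_bytes("Rémi".encode("utf-8")))
    print(decode_pgn_bytes("Rémi".encode("latin-1")))
```

Trying UTF-8 first works because typical Latin-1 text with accented letters is rarely valid UTF-8, while the reverse order would silently turn UTF-8 into mojibake.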