I was trying to write, by hand, a PGN parser, and quickly realized that it's hellishly more complicated than it looks. Already the lexer is heavily context sensitive...
Does anyone have a Lex and a Yacc file, so I don't start from scratch ?
Lex / Yacc for PGN
Moderator: Ras
-
lucasart
- Posts: 3243
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Lex / Yacc for PGN
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
Henk
- Posts: 7251
- Joined: Mon May 27, 2013 10:31 am
Re: Lex / Yacc for PGN
lucasart wrote: I was trying to write, by hand, a PGN parser, and quickly realized that it's hellishly more complicated than it looks. Already the lexer is heavily context sensitive...
Does anyone have a Lex and a Yacc file, so I don't start from scratch ?
Of course I'm not answering your question.
But keep the scanner simple. Context should be handled in the parser not the scanner.
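To make the "simple scanner" idea concrete, here is a minimal sketch in C of a context-free PGN tokenizer. It is not taken from GNU Chess or XBoard; the names pgn_token, pgn_tok_kind and pgn_next_token are made up for illustration. It only splits the input into brackets, strings, brace comments, NAGs and generic "symbol" words, and leaves it to the parser to decide whether a symbol is a tag name, a move number, a SAN move or a result. As simplifying assumptions, '/', '!' and '?' are accepted inside symbols (so "1/2-1/2" and "Nf3!" come out as one token), and ';' comments and '%' escape lines are not handled.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the names are illustrative only. */
typedef enum {
    TOK_EOF, TOK_LBRACKET, TOK_RBRACKET, TOK_LPAREN, TOK_RPAREN,
    TOK_STRING, TOK_COMMENT, TOK_NAG, TOK_PERIOD, TOK_STAR,
    TOK_SYMBOL, TOK_ERROR
} pgn_tok_kind;

typedef struct {
    pgn_tok_kind kind;
    const char *start;  /* first character of the token text */
    size_t len;         /* length of the token text, delimiters excluded */
} pgn_token;

/* Scan a delimited run ("..." or {...}) whose opening delimiter is at p.
   Returns the position just past the closing delimiter. */
static const char *scan_delimited(const char *p, char close, int escapes,
                                  pgn_token *t, pgn_tok_kind kind)
{
    t->start = ++p;                          /* skip the opening delimiter */
    while (*p && *p != close) {
        if (escapes && *p == '\\' && p[1])   /* skip a backslash-escaped char */
            p++;
        p++;
    }
    t->len = (size_t)(p - t->start);
    t->kind = *p ? kind : TOK_ERROR;         /* unterminated run is an error */
    return *p ? p + 1 : p;
}

/* Scan one token starting at *pp and advance *pp past it. */
pgn_token pgn_next_token(const char **pp)
{
    const char *p = *pp;
    pgn_token t;

    while (isspace((unsigned char)*p))
        p++;
    t.start = p;
    t.len = 1;                               /* default for one-character tokens */

    switch (*p) {
    case '\0': t.kind = TOK_EOF; t.len = 0; break;
    case '[':  t.kind = TOK_LBRACKET; p++; break;
    case ']':  t.kind = TOK_RBRACKET; p++; break;
    case '(':  t.kind = TOK_LPAREN; p++; break;
    case ')':  t.kind = TOK_RPAREN; p++; break;
    case '.':  t.kind = TOK_PERIOD; p++; break;
    case '*':  t.kind = TOK_STAR; p++; break;
    case '"':  p = scan_delimited(p, '"', 1, &t, TOK_STRING); break;
    case '{':  p = scan_delimited(p, '}', 0, &t, TOK_COMMENT); break;
    case '$':                                /* numeric annotation glyph, e.g. $1 */
        p++;
        while (isdigit((unsigned char)*p))
            p++;
        t.len = (size_t)(p - t.start);
        t.kind = t.len > 1 ? TOK_NAG : TOK_ERROR;
        break;
    default:
        if (isalnum((unsigned char)*p)) {
            /* A symbol: the parser decides later what it means. */
            while (*p && (isalnum((unsigned char)*p) || strchr("_+#=:-/!?", *p)))
                p++;
            t.len = (size_t)(p - t.start);
            t.kind = TOK_SYMBOL;
        } else {
            t.kind = TOK_ERROR;
            p++;
        }
    }
    *pp = p;
    return t;
}
```

Calling pgn_next_token in a loop until it returns TOK_EOF yields the token stream; for "1.e4" that is a symbol "1", a period, and a symbol "e4", with no chess knowledge needed in the scanner.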
-
Jim Ablett
- Posts: 2391
- Joined: Fri Jul 14, 2006 7:56 am
- Location: London, England
- Full name: Jim Ablett
Re: Lex / Yacc for PGN
lucasart wrote: I was trying to write, by hand, a PGN parser, and quickly realized that it's hellishly more complicated than it looks. Already the lexer is heavily context sensitive...
Does anyone have a Lex and a Yacc file, so I don't start from scratch ?
Gnuchess has the Lex stuff I think.
Jim.
-
Sven
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Lex / Yacc for PGN
Jim Ablett wrote:
lucasart wrote: I was trying to write, by hand, a PGN parser, and quickly realized that it's hellishly more complicated than it looks. Already the lexer is heavily context sensitive...
Does anyone have a Lex and a Yacc file, so I don't start from scratch ?
Gnuchess has the Lex stuff I think.
Jim.
Yes, in Gnuchess 6.0.3 there is "src/frontend/lexpgn.l". It has some context references like Game[], GameCnt, MakeMove(), ValidateMove(), or ParseEPD(). It also includes a "common.h" file that defines these functions/variables along with lots of other basic definitions, like the whole Board data structure, which are not related to PGN scanning/parsing. So it can't simply be extracted and used in a different context without some modification (it was probably never meant to be). It should be possible, though, to create a modified version that has a minimal interface to the surrounding environment. E.g. accessing Game[] and GameCnt could be encapsulated through functions so that it would no longer be necessary to include a file like "common.h".
Sven
-
Michel
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Lex / Yacc for PGN
If you want to recycle the pgn parser in GNU Chess it is probably best to use GNU Chess 5.50 as a base.
It doesn't have a common.h.
This is the interface as defined in pgn.h
void PGNSaveToFile (const char *file, game_t *game, const char *resultstr);
void PGNReadFromFile (game_t *game, const char *file);
void PGNIterInit(pgn_iter_t *pgn_iter);
int PGNIterStart(pgn_iter_t *pgn_iter, const char *file);
void PGNIterClose(pgn_iter_t *pgn_iter);
int PGNIterNext (pgn_iter_t *pgn_iter, game_t *game);
-
Sven
- Posts: 4052
- Joined: Thu May 15, 2008 9:57 pm
- Location: Berlin, Germany
- Full name: Sven Schüle
Re: Lex / Yacc for PGN
Michel wrote: If you want to recycle the pgn parser in GNU Chess it is probably best to use GNU Chess 5.50 as a base.
It doesn't have a common.h.
Oops, sorry, I was looking up the 6.x version.
Right, your 5.50 "lexpgn.l" has no "common.h" but instead it uses some other header files. At a first glance it seems to be a bit more decoupled from the engine, though.
Sven
-
Michel
- Posts: 2292
- Joined: Mon Sep 29, 2008 1:50 am
Re: Lex / Yacc for PGN
BTW, XBoard also has a PGN parser, and it does not depend on lex/yacc.
-
lucasart
- Posts: 3243
- Joined: Mon May 31, 2010 1:29 pm
- Full name: lucasart
Re: Lex / Yacc for PGN
Thanks for the help. I do realize that the GNU parsing code has some dependencies, which in turn bring in more dependencies, etc. I just wanted to have one as an example to follow, and as a kind of tutorial (so that when I write my own Lex file, and I ask myself, "how do you do this or that in Lex?" I have the answers in it). I'm sufficiently fluent in regular expressions, but still barely a Padawan in Lex & Yacc.
The Xboard PGN parser is surely a good place to start from too. I'll have a look, and if it's easy to extract from the rest, I'll take it. I'm just a bit worried that the Xboard source code, due to its generality (all the chess variants supported), may present a steep learning curve.
The third solution is to continue sweating on my crappy hand written lexer/parser, but in doing so I'm not really learning anything. At least Lex & Yacc are very powerful tools to learn, and can be reused in another context later.
Theory and practice sometimes clash. And when that happens, theory loses. Every single time.
-
hgm
- Posts: 28452
- Joined: Fri Mar 10, 2006 10:06 am
- Location: Amsterdam
- Full name: H G Muller
Re: Lex / Yacc for PGN
Old XBoard versions used to have a lex-generated parser (parser.l). I kicked it out, because it was difficult to maintain. So newer XBoard versions contain a hand-written parser.c.
The support of variants does not hugely affect the parser, as XBoard uses SAN in all variants. It just means that the file ID is not necessarily limited to a-h, board ranks can be double digits, and the piece ID can be any letter, rather than only PNBRQK. It does check whether the square coordinates and the piece are valid for the current variant. You would have to supply the functions CharToPiece and PieceToChar to do this checking, and take care of the piece encoding used by the software you want to interface it with.
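As a rough illustration of what variant-sized coordinates mean for the parsing code (this is not XBoard's actual code), a square reader could look like the sketch below. The function name parse_square and its parameters are hypothetical, and it assumes ranks are numbered from 1 and that the board has at most 26 files.

```c
#include <ctype.h>

/* Parse a square such as "e4" or "j10" for a board with the given number
   of files and ranks. On success, store the 0-based file and rank and
   return the number of characters consumed; return 0 if the text is not
   a valid square for this board size. Hypothetical helper, assuming ranks
   start at 1 and files run from 'a' upwards. */
int parse_square(const char *s, int files, int ranks, int *file, int *rank)
{
    int n = 1;   /* index of the first rank digit */
    int r = 0;

    if (s[0] < 'a' || s[0] >= 'a' + files)
        return 0;                       /* file letter outside the board */
    if (!isdigit((unsigned char)s[n]))
        return 0;                       /* no rank digits at all */
    while (isdigit((unsigned char)s[n]) && n < 3) {
        r = r * 10 + (s[n] - '0');      /* accept one or two rank digits */
        n++;
    }
    if (r < 1 || r > ranks)
        return 0;                       /* rank outside the board */
    *file = s[0] - 'a';
    *rank = r - 1;
    return n;
}
```

With files = ranks = 8 this accepts the usual a1-h8; with a 10x10 board, "j10" is also accepted.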
A trickier issue is that it depends on XBoard's move-generation code, through the routines TestLegality and Disambiguate. Both of these generate all legal moves for a given position. TestLegality checks if a fully-specified input move is amongst those. Disambiguate does the same, but allows 'wild cards' for those items that were not specified in the move. It counts the legal moves that match the items that were specified, and returns the match as a fully specified move (and warns if there was more than one match). The XBoard versions of these routines (in the file moves.c) depend very much on XBoard's internal representation of Chess positions, which is likely very different from what you would want.
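The wild-card matching idea can be sketched independently of any board representation. The following C fragment is not XBoard's Disambiguate; the move_t layout, the ANY wildcard value and the function name disambiguate are assumptions for illustration, and the caller is assumed to supply the list of legal moves from its own move generator.

```c
#include <stddef.h>

#define ANY (-1)   /* hypothetical wildcard: item not given in the SAN text */

/* Hypothetical fully specified move; not XBoard's representation. */
typedef struct {
    int piece;                 /* piece letter, e.g. 'N' or 'P' */
    int from_file, from_rank;  /* 0-based origin square */
    int to_file, to_rank;      /* 0-based destination square */
    int promo;                 /* promotion piece letter, or 0 for none */
} move_t;

/* A pattern field matches if it is a wildcard or equals the move's value. */
static int field_matches(int pattern, int value)
{
    return pattern == ANY || pattern == value;
}

/* Count the legal moves that match the partially specified pattern and
   copy the first match to *out. Returns 0 if no legal move matches,
   1 if the move is unique, and more than 1 if the input is ambiguous. */
int disambiguate(const move_t *pattern, const move_t *legal, size_t n,
                 move_t *out)
{
    int matches = 0;

    for (size_t i = 0; i < n; i++) {
        const move_t *m = &legal[i];
        if (field_matches(pattern->piece, m->piece) &&
            field_matches(pattern->from_file, m->from_file) &&
            field_matches(pattern->from_rank, m->from_rank) &&
            field_matches(pattern->to_file, m->to_file) &&
            field_matches(pattern->to_rank, m->to_rank) &&
            field_matches(pattern->promo, m->promo)) {
            if (matches == 0)
                *out = *m;     /* remember the first fully specified match */
            matches++;
        }
    }
    return matches;
}
```

For "Nf3" the parser would fill in piece 'N' and the destination square and leave the origin as ANY. A legality test for a fully specified move is the same call with no wildcards, checking that the result is 1.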