Help with pgn-extract

asanjuan
Posts: 214
Joined: Thu Sep 01, 2011 5:38 pm
Location: Seville, Spain

Post by asanjuan »

I'm looking for a way to filter out the games that ended with the comment:
"Draw by fifty moves rule"

Games that end this way are "garbage" for the tuning algorithm.

I can't find how to filter these games out of a large PGN file.

Any ideas? I can't find the proper option in the help. Or is there another tool?

Thanks in advance.
Still learning how to play chess...
Knights move in an "L" shape, right?
Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Help with pgn-extract

Post by Adam Hair »

I recommend Scid vs PC or any other branch of Scid. You can definitely filter out games that contain that comment (or any other comment) in the PGN body.
asanjuan
Posts: 214
Joined: Thu Sep 01, 2011 5:38 pm
Location: Seville, Spain

Re: Help with pgn-extract

Post by asanjuan »

Ok. Scid. Thanks.
asanjuan
Posts: 214
Joined: Thu Sep 01, 2011 5:38 pm
Location: Seville, Spain

Re: Help with pgn-extract

Post by asanjuan »

In the end I couldn't filter the games using Scid. The evaluation and the outcome are stored as a comment in the PGN.
After filtering, it doesn't show any games at all.

What I want is to filter games that have a very specific comment and where the evaluation is above a given score. For example:

{+2.35/13 0.062s, Draw by fifty moves rule }

It is clear that my evaluation is missing something here.

I can find it by searching in a text editor, but the editor shows 2110 matches for "Draw by fifty moves rule", because I can't filter for a specific evaluation score.

Maybe using a regular expression?

Has anyone solved this before? There must be someone.
Ferdy
Posts: 4833
Joined: Sun Aug 10, 2008 3:15 pm
Location: Philippines

Re: Help with pgn-extract

Post by Ferdy »

For a regex that matches the {+ or {-, use:

Code:

\{\W

Since you are only interested in non-zero scores, there should be a + or - after the {. So for your example above you can use:

Code:

\{\W\d*\.\d*/\d*\s\d*\.\d*s,\sDraw by fifty moves rule\W*\}

I searched for this pattern in Notepad++ and it works fine. I used PGN files generated by cutechess-cli v0.5.1 and v0.6.0.

I haven't seen this anomaly in Deuterium so far - a non-zero score in a draw by fifty :) .
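
The same pattern can also be run outside an editor. What follows is only a minimal sketch, assuming Python's standard `re` module. It uses a slightly tightened variant of the pattern above: `\W` is replaced by `[+-]` and the score digits by `\d+`, and a capture group is added, so that the score can be read back as a number. Those changes, and the function name `fifty_move_scores`, are assumptions of this sketch and not part of the original pattern.

```python
import re

# Tightened variant of the pattern above (an assumption of this sketch):
# [+-] instead of \W, \d+ instead of \d* for the score, plus a capture group.
FIFTY_MOVE_COMMENT = re.compile(
    r"\{([+-]\d+\.\d+)/\d*\s\d*\.\d*s,\sDraw by fifty moves rule\W*\}"
)


def fifty_move_scores(pgn_text):
    """Return the signed scores found in 'Draw by fifty moves rule' comments."""
    return [float(m.group(1)) for m in FIFTY_MOVE_COMMENT.finditer(pgn_text)]


sample = "74. Kg2 {+2.35/13 0.062s, Draw by fifty moves rule } 1/2-1/2"
print(fifty_move_scores(sample))
```

A comment with an unsigned score such as `0.00/20` is not matched, which is the behaviour wanted here, since only non-zero scores are of interest.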
asanjuan
Posts: 214
Joined: Thu Sep 01, 2011 5:38 pm
Location: Seville, Spain

Re: Help with pgn-extract

Post by asanjuan »

Thanks for the hint.
Ferdy wrote:I don't have this anomalies in Deuterium so far - non-zero score in draw by fifty :) .
Rhetoric doesn't have any knowledge of the 50-move rule. I chose not to implement it some time ago, because there are rook endgames that take more than 50 moves to force a win.

But that is another subject. The point is that if my engine is scoring +2 or +3, or even +7 as I've seen, there must be a serious advantage for one side, but the engine is failing to find a winning path, most likely because it cannot transform the position into a simpler endgame. It then makes aimless moves until it reaches the 50-move rule.
This is an evaluation issue that I want to solve.

If it had the 50-move rule implemented, the game would last even longer, and the problem would still be there.

At the same time, if I keep these positions in the learning set for the tuning algorithm, Rhetoric can learn wrong positional values, because the outcome of the game is very far from the current evaluation, so these positions are just noise.
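
One way to drop such games from the tuning set is sketched below. It is only an illustration under several assumptions: each game in the PGN file starts with an `[Event ` tag, the final comment is not wrapped across lines, and the comment has the cutechess-cli shape shown earlier in the thread. The names `split_games`, `is_noisy` and `drop_noisy_games`, and the default threshold of 2.0, are hypothetical choices for this sketch.

```python
import re

# Pattern for a signed score followed by the fifty-move draw comment.
# The [+-] sign and the capture group are assumptions of this sketch.
DRAW_50 = re.compile(
    r"\{([+-]\d+\.\d+)/\d*\s\d*\.\d*s,\sDraw by fifty moves rule\W*\}"
)


def split_games(pgn_text):
    """Split PGN text into games, assuming every game starts with an [Event tag."""
    games, current = [], []
    for line in pgn_text.splitlines(keepends=True):
        if line.startswith("[Event ") and current:
            games.append("".join(current))
            current = []
        current.append(line)
    if current:
        games.append("".join(current))
    return games


def is_noisy(game, threshold=2.0):
    """True if the game was drawn by the fifty-move rule with |score| >= threshold."""
    m = DRAW_50.search(game)
    return m is not None and abs(float(m.group(1))) >= threshold


def drop_noisy_games(pgn_text, threshold=2.0):
    """Return the PGN text without the games flagged by is_noisy."""
    return "".join(g for g in split_games(pgn_text) if not is_noisy(g, threshold))
```

The threshold is in the same units as the score in the comment, so `drop_noisy_games(text, threshold=2.0)` would discard the `+2.35` example above while keeping fifty-move draws that were scored close to equal.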

Now that I have adopted your regex, the work is easier.

Thanks a lot, Ferdinand.
asanjuan
Posts: 214
Joined: Thu Sep 01, 2011 5:38 pm
Location: Seville, Spain

Re: Help with pgn-extract

Post by asanjuan »

Now tuning again with the new game sample. Let's see if I can release an updated version.
Almost every parameter is changing...
:D