Shredder FEN


Adam Hair
Posts: 3226
Joined: Wed May 06, 2009 10:31 pm
Location: Fuquay-Varina, North Carolina

Re: Shredder FEN

Post by Adam Hair »

MrEdCollins wrote:Hi Adam,

Forgive me if I don't quite understand your problem.

You have a PGN file, a database of many Fischer Random games, and you are interested in knowing how often each side castles on the same side, on opposite sides, etc. Yes?

A year or two ago I wrote a little utility for myself called "PGN Cleanup" which, among other things, gives a bunch of statistics on the file it processes. And yes, among those statistics are a few castling statistics.

For example, below is the partial output from a PGN file that I downloaded and then processed:


Win / Loss Statistics:
----------------------
Number of games White won: 20,306 (38.4%)
Number of games Black won: 15,628 (29.6%)
Number of draws: 16,924 (32.0%)

Castling Statistics:
--------------------
White castled Kingside and Black castled Kingside: 35,932 (68.0%)
White castled Kingside and Black castled Queenside: 1,878 (3.6%)
White castled Kingside and Black did not castle: 3,920 (7.4%)

White castled Queenside and Black castled Kingside: 3,566 (6.7%)
White castled Queenside and Black castled Queenside: 1,020 (1.9%)
White castled Queenside and Black did not castle: 1,504 (2.8%)

White did not castle and Black castled Kingside: 2,694 (5.1%)
White did not castle and Black castled Queenside: 568 (1.1%)
Neither player castled: 1,774 (3.4%)



Even though your file is a file of FR games, with a Shredder FEN tag to distinguish the opening position, the castling notation within the PGN remains the same, does it not? O-O for Kingside Castling and O-O-O for Queenside castling? If so, my program wouldn't care if the games were FR games or normal games... it would generate a similar set of statistics for your file.
Hi Ed,

You understand my question. And yes, the castling notation in the PGNs is O-O and O-O-O. So they should be compatible with your program. Is there a link to the utility on your chess page?

Adam
Adam Hair

Re: Shredder FEN

Post by Adam Hair »

stevenaaus wrote:
Adam Hair wrote:Unfortunately, the information required is about castling.

Scid and PGN Extract do not understand the rook location token(?) (HAha, HBhb, ... GAga, ..., etcetera).
There is a scid960 project. Unfortunately, the last I heard from the person was that he had found bugs and was going to do some more work on it, but I have had no follow-up.

I have a working patch for Scid vs PC, if anyone's interested in testing it and telling me what works and what doesn't. Perhaps we can make it work properly? Shredder FEN seems to be working OK.
At first glance, it appears that scid960 is working correctly with regard to extracting the castling information.

I am willing to test your patch to Scid vs PC, especially since you are the current active developer for Scid.

Adam
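For readers unfamiliar with the rook location token mentioned above: in Shredder FEN the castling field names the files of the rooks that may still castle, uppercase for White and lowercase for Black, instead of the usual KQkq. A minimal Python sketch of reading that field follows. The function name `parse_shredder_castling` is hypothetical and is not taken from Scid, scid960, or PGN Extract.

```python
def parse_shredder_castling(field):
    """Split a Shredder-FEN castling field (e.g. "HAha") into rook files per side.

    Hypothetical helper for illustration only.
    """
    rights = {"white": [], "black": []}
    if field == "-":          # no castling rights left for either side
        return rights
    for ch in field:
        if ch.lower() not in "abcdefgh":
            # A standard "KQkq" field ends up here, which is the kind of
            # mismatch a tool that only knows standard FEN would hit.
            raise ValueError(f"unexpected castling character: {ch!r}")
        side = "white" if ch.isupper() else "black"
        rights[side].append(ch.lower())
    return rights


print(parse_shredder_castling("HAha"))
print(parse_shredder_castling("GAga"))
```

A tool that expects only K, Q, k and q in this field has no rule for letters such as H or A, which is consistent with the problem described in the quote.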
MrEdCollins
Posts: 59
Joined: Tue May 03, 2011 12:12 am
Location: Southern California

Re: Shredder FEN

Post by MrEdCollins »

Adam Hair wrote: Hi Ed,

You understand my question. And yes, the castling notation in the PGNs is O-O and O-O-O. So they should be compatible with your program. Is there a link to the utility on your chess page?

Adam
No, I never released the utility to the public. In fact, I didn't write it with anyone else in mind. It was simply something for my own personal use. When I run it, I run it from the compiler GUI that I wrote it in, specifying right inside the source code which file to load, the output file name, etc.

I suppose I COULD take the time to make it so that anyone could use it, but right now I don't have the time to do that. But if you wish to e-mail me your PGN file, I won't have a problem in processing it for you.
Adam Hair

Re: Shredder FEN

Post by Adam Hair »

MrEdCollins wrote:
Adam Hair wrote: Hi Ed,

You understand my question. And yes, the castling notation in the PGNs is O-O and O-O-O. So they should be compatible with your program. Is there a link to the utility on your chess page?

Adam
No, I never released the utility to the public. In fact, I didn't write it with anyone else in mind. It was simply something for my own personal use. When I run it, I run it from the compiler GUI that I wrote it in, specifying right inside the source code which file to load, the output file name, etc.

I suppose I COULD take the time to make it so that anyone could use it, but right now I don't have the time to do that. But if you wish to e-mail me your PGN file, I won't have a problem in processing it for you.
The file is slightly too big to send by email. So I uploaded it to Mediafire: http://www.mediafire.com/?hf18gvvm8833nj8

If you prefer that I send it by email, let me know and I will trim unnecessary PGN tags to get it under 25 MB (so that Gmail will let me send it).
MrEdCollins

Re: Shredder FEN

Post by MrEdCollins »

I had no problems in downloading it or processing it.

I hope this information is of some use to you:

The original input file contains 2,862,901 lines of text.
The original input file contains 107,100 games.
The new 'bad' output file contains 0 bad games.

The following statistics are for the new, cleaned up PGN output file:
The new output file contains 2,569,915 lines of text.
The new output file contains 107,100 games.

Win / Loss Statistics:
----------------------
Number of games White won: 45,026 (42.0%)
Number of games Black won: 40,394 (37.7%)
Number of draws: 21,680 (20.2%)

Castling Statistics:
--------------------
White castled Kingside and Black castled Kingside: 26,787 (25.0%)
White castled Kingside and Black castled Queenside: 6,088 (5.7%)
White castled Kingside and Black did not castle: 12,679 (11.8%)

White castled Queenside and Black castled Kingside: 5,577 (5.2%)
White castled Queenside and Black castled Queenside: 11,422 (10.7%)
White castled Queenside and Black did not castle: 10,142 (9.5%)

White did not castle and Black castled Kingside: 10,735 (10.0%)
White did not castle and Black castled Queenside: 9,030 (8.4%)
Neither player castled: 14,640 (13.7%)
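
The nine cells above can be folded into per-side totals with a few lines of Python. This is only a sketch that re-adds the figures listed in the table. The dictionary layout and the helper name `marginal` are made up for illustration and are not part of the PGN Cleanup utility.

```python
# (White's castling, Black's castling) -> number of games, copied from the table above.
# "K" = kingside, "Q" = queenside, "-" = did not castle.
counts = {
    ("K", "K"): 26787, ("K", "Q"): 6088,  ("K", "-"): 12679,
    ("Q", "K"): 5577,  ("Q", "Q"): 11422, ("Q", "-"): 10142,
    ("-", "K"): 10735, ("-", "Q"): 9030,  ("-", "-"): 14640,
}
total = sum(counts.values())


def marginal(side_index, choice):
    """Sum every cell where the given side (0 = White, 1 = Black) made `choice`."""
    return sum(n for key, n in counts.items() if key[side_index] == choice)


for side_index, side in enumerate(("White", "Black")):
    for choice, label in (("K", "kingside"), ("Q", "queenside"), ("-", "not at all")):
        n = marginal(side_index, choice)
        print(f"{side} castled {label}: {n:,} ({100 * n / total:.1f}%)")

same_side = counts[("K", "K")] + counts[("Q", "Q")]
opposite = counts[("K", "Q")] + counts[("Q", "K")]
print(f"Both castled, same side: {same_side:,}")
print(f"Both castled, opposite sides: {opposite:,}")
```

The cells add up to the 107,100 games reported for the file, so the table accounts for every game exactly once.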
Adam Hair

Re: Shredder FEN

Post by Adam Hair »

Thank you, Ed. It does not answer every question about the FRC database that has come up recently, but it is more than what we had :)
stevenaaus
Posts: 608
Joined: Wed Oct 13, 2010 9:44 am
Location: Australia

Re: Shredder FEN

Post by stevenaaus »

Adam Hair wrote:
stevenaaus wrote:
Adam Hair wrote:Unfortunately, the information required is about castling.

Scid and PGN Extract do not understand the rook location token(?) (HAha, HBhb, ... GAga, ..., etcetera).
There is a scid960 project. Unfortunately, the last I heard from the person was that he had found bugs and was going to do some more work on it, but I have had no follow-up.

I have a working patch for Scid vs PC, if anyone's interested in testing it and telling me what works and what doesn't. Perhaps we can make it work properly? Shredder FEN seems to be working OK.
At first glance, it appears that scid960 is working correctly with regard to extracting the castling information.

I am willing to test your patch to Scid vs PC, especially since you are the current active developer for Scid.

Adam
It'd be nice to have this working. Could you please test with scid960 and tell me what doesn't work?

I've contacted Ben to see how he's going, but I think he's hit a little technical hurdle, which I'll document here.

This standard FEN


rn1qnk2/pp2p2P/3p4/2pp4/8/2P5/P1P2PP1/R1BbKB1R w KQ - 0 16
for which the best move is h8=Q+ (or h7h8q in UCI notation), stuffs up the analysis widget for UCI engines. You will see an apparently OK line, but it is garbage and not what the engine is sending (have a look at the engine log). Xboard engines work fine and show the queening move. [EDIT: actually, Crafty fails badly with scid960 but works fine with scidvspc960. Other Xboard engines seem fine in both GUIs.]
The issue is internal to Scid960, and I'm examining the best solution.
Adam Hair

Re: Shredder FEN

Post by Adam Hair »

There is a discrepancy between the castling numbers that Ed's program produced and those that Scid960 produced. I expected a small difference. From the questions that Ed asked me, I assume his program locates 'O-O' and 'O-O-O' in the PGN, so it should give the exact number of castling moves. I expected Scid960 to give slightly higher numbers because Scid looks for positions, not moves. However, it appears that Scid960 is missing some positions. For example, Scid960 finds 4530 positions that correspond to White castling O-O and Black castling O-O-O, whereas Ed Collins' program finds that it occurred 6088 times.

I have done a bit of investigating, but I have not found an example of Scid960 missing a castle as of yet.
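
Counting castling straight from the movetext, which is how Ed's program is assumed to work here, could look roughly like the Python sketch below. It is not Ed's utility (that was never released), and every name in it is hypothetical. It assumes White makes the first move of each game, that castling is written with the capital letter O, and that no comment line starts with "[".

```python
import re
from collections import Counter

COMMENT = re.compile(r"\{[^}]*\}|;[^\n]*")  # {...} and ;-to-end-of-line comments
NAG = re.compile(r"\$\d+")                  # numeric annotation glyphs such as $1
MOVE_NUMBER = re.compile(r"\d+\.+")         # "12." or "12..."
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def movetexts(pgn_text):
    """Yield the movetext of each game; tag lines ("[...]") separate games."""
    current = []
    for line in pgn_text.splitlines():
        if line.startswith("["):
            if current:
                yield "\n".join(current)
                current = []
        elif line.strip():
            current.append(line)
    if current:
        yield "\n".join(current)


def strip_variations(text):
    """Drop everything inside (possibly nested) parentheses."""
    depth, kept = 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(0, depth - 1)
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)


def castling_of(movetext):
    """Return (white, black), each 'kingside', 'queenside' or 'none'."""
    text = strip_variations(COMMENT.sub(" ", movetext))
    text = MOVE_NUMBER.sub(" ", NAG.sub(" ", text))
    moves = [tok for tok in text.split() if tok not in RESULTS]
    sides = {"white": "none", "black": "none"}
    for index, move in enumerate(moves):
        san = move.rstrip("+#!?")
        if san in ("O-O", "O-O-O"):
            side = "white" if index % 2 == 0 else "black"  # White moves first
            sides[side] = "kingside" if san == "O-O" else "queenside"
    return sides["white"], sides["black"]


def castling_table(pgn_text):
    """Tally the (white, black) castling combination over all games."""
    return Counter(castling_of(text) for text in movetexts(pgn_text))


SAMPLE = """[Event "sample 1"]

1. e4 e5 2. Nf3 Nc6 3. Bc4 Bc5 4. O-O {safe} Qe7 5. d3 d6
6. c3 Bd7 7. a4 O-O-O 1/2-1/2

[Event "sample 2"]

1. d4 d5 2. c4 e6 1-0
"""

for combination, games in castling_table(SAMPLE).items():
    print(combination, games)
```

Because this counts moves rather than positions, it cannot be fooled by a king and rook that simply walk to the castled squares, which is why a position search would be expected to come out slightly higher.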
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Shredder FEN

Post by hgm »

Just to make sure I understand how database users tackle such questions: What you do is put a white King on g1 and a Rook on f1, and then look for a position that matches that in a mode where there can be additional material? And then repeat that with a black King on c8 and a black Rook on d8, to narrow the search?

WinBoard nearly does that, except that you cannot narrow a search, so you cannot search for two positions at the same time. This does seem an enhancement that is generally useful, though. Perhaps I should make a checkbox 'narrow' that determines whether a search request starts with the full list, or only searches the current selection. Or have two separate buttons, 'find' and 'narrow'. Or I could always let it use 'narrow' mode, and only reset to the full database when you close and reopen the Game List window.
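
A rough Python sketch of the search described above: match a partial pattern (a few pieces on fixed squares, anything else allowed) against each position of a game, then narrow the hits with a second pattern. It also shows that requiring both patterns in one and the same position is a stricter test than narrowing. All names are hypothetical. This is not how WinBoard or Scid implement their searches, and each game is assumed to be available as a list of FEN piece-placement fields.

```python
FILES = "abcdefgh"


def expand(placement):
    """Turn a FEN piece-placement field into a dict of square -> piece letter."""
    board = {}
    for rank_index, row in enumerate(placement.split("/")):
        rank = 8 - rank_index          # FEN lists rank 8 first
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)  # run of empty squares
            else:
                board[FILES[file_index] + str(rank)] = ch
                file_index += 1
    return board


def matches(placement, pattern):
    """True if every piece in `pattern` is on its square; other material is ignored."""
    board = expand(placement)
    return all(board.get(square) == piece for square, piece in pattern.items())


def find(games, pattern):
    """Games (lists of placements) in which at least one position matches."""
    return [game for game in games if any(matches(p, pattern) for p in game)]


WHITE_OO = {"g1": "K", "f1": "R"}    # White king and rook after O-O
BLACK_OOO = {"c8": "k", "d8": "r"}   # Black king and rook after O-O-O

# Game A: White castles, later plays Kh1, and only then does Black castle long.
game_a = [
    "r3k2r/8/8/8/8/8/8/5RK1",
    "r3k2r/8/8/8/8/8/8/5R1K",
    "2kr3r/8/8/8/8/8/8/5R1K",
]
game_b = ["2kr3r/8/8/8/8/8/8/5RK1"]    # both patterns in the same position
game_c = ["r3k2r/8/8/8/8/8/8/R3K2R"]   # nobody has castled
games = [game_a, game_b, game_c]

narrowed = find(find(games, WHITE_OO), BLACK_OOO)        # 'find', then 'narrow'
simultaneous = find(games, {**WHITE_OO, **BLACK_OOO})    # one combined position

print(len(narrowed), len(simultaneous))
```

Game A is found by the two-step search but not by the combined one, because its two patterns never occur together. Either way a position search can overcount real castling moves, since a king and rook can also reach those squares without castling.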
Adam Hair

Re: Shredder FEN

Post by Adam Hair »

hgm wrote:Just to make sure I understand how database users tackle such questions: What you do is put a white King on g1 and a Rook on f1, and then look for a position that matches that in a mode where there can be additional material? And then repeat that with a black King on c8 and a black Rook on d8, to narrow the search?
I am an idiot. Your question made me realize that the result of my search in my previous post was the number of games in which White had Kg1 and Rf1 and Black had Kc8 and Rd8 in the same position. No wonder the number found by Scid960 was too low. The actual number found by Scid960 would be 7419 (which is a little higher than the real number of games where White castled O-O and Black castled O-O-O).

So yes, your description is most likely how a competent database user would conduct that search.
hgm wrote:WinBoard nearly does that, except that you cannot narrow a search, so you cannot search for two positions at the same time. This does seem an enhancement that is generally useful, though. Perhaps I should make a checkbox 'narrow' that determines whether a search request starts with the full list, or only searches the current selection. Or have two separate buttons, 'find' and 'narrow'. Or I could always let it use 'narrow' mode, and only reset to the full database when you close and reopen the Game List window.
'narrow' would be an excellent feature. Would the statistics for the total number of games containing the current position be available also?