SCID request bzip2

jshriver
Posts: 1358
Joined: Wed Mar 08, 2006 9:41 pm
Location: Morgantown, WV, USA

Post by jshriver »

I see Scid supports gzipped PGN files and such, but does anyone know the author and could request bzip2 support?

Since PGN is a text format, bzip2 -9 compresses it much tighter than gzip, and bzip2 is open source, so it should be fairly easy to add.

Thought I'd post :) If I get desperate I might fork it, try it myself and upload diffs. I'm working with the marcelk dataset and others, which primarily use bzip2, so I'd prefer to keep the data that way to save space.

-Josh
Ron Murawski
Posts: 397
Joined: Sun Oct 29, 2006 4:38 am
Location: Schenectady, NY

Re: SCID request bzip2

Post by Ron Murawski »

I've always gotten better compression on PGNs using open source 7-Zip with the PPMD option. I wrote an article about it a couple of years ago:
http://www.horizonchess.com/pmwiki.php? ... ompression

Maybe BZip2 compression has improved since 2006?
stevenaaus
Posts: 613
Joined: Wed Oct 13, 2010 9:44 am
Location: Australia

Re: SCID request bzip2

Post by stevenaaus »

I'd never realised bzip2 was so much better for text files. :shock: I always wondered why the fuss over bzip2, but a quick test shows it's quite good.

Coding this feature isn't really my thing though, and imho .zip file support would be more useful. ... I'd probably add bzip2 support if someone coded it (and definitely .zip ;>). And I'd ~guess~ mainline SCID would add the feature if sent as a patch.

You can find the relevant interfaces in tcl/file.tcl, where you simply have to add the .bz2 suffix (and preferably detect bzip2 support via sc_info in src/tkscid.cpp, the way the gzip check below does):

Code:

  if {$fName == ""} {
    if {[sc_info gzip]} {
      set ftype {
        { "Scid databases, PGN files" {".si4" ".si3" ".pgn" ".PGN" ".pgn.gz"} }
        { "Scid databases" {".si4" ".si3"} }
        { "PGN files" {".pgn" ".PGN" ".pgn.gz"} }
      }
    } else {
      set ftype {
        { "Scid databases, PGN files" {".si4" ".si3" ".pgn" ".PGN"} }
        { "Scid databases" {".si4" ".si3"} }
        { "PGN files" {".pgn" ".PGN"} }
      }
    }
    set fName [tk_getOpenFile -initialdir $::initialDir(base) -filetypes $ftype -title "Open a Scid file"]

and src/mfile.cpp, where the file is opened and the decompression routine is called:

Code:

    const char * suffix = strFileSuffix (name);
    if (suffix != NULL  &&  strEqual (suffix, GZIP_SUFFIX)) {
        // We can only open GZip files read-only for now:
        if (fmode != FMODE_ReadOnly) {
            return ERROR_FileOpen;
        }
#ifdef WINCE
        GzHandle = gzopen (name, "r");
#else
        GzHandle = gzopen (name, "rb");
#endif

        if (GzHandle == NULL) { return ERROR_FileOpen; }
        Type = MFILE_GZIP;
I think this is right.....
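
For anyone picking this up, here is a rough, standalone sketch of how the suffix check in the mfile.cpp excerpt might be extended to recognise .bz2 as well. It is not Scid code: FileKind, OpenMode, classifyByName, canOpen and both suffix constants are hypothetical names invented for illustration, and the assumption that bzip2 files would be read-only simply copies the existing gzip behaviour. The actual decompression (presumably through libbz2's BZ2_bzopen / BZ2_bzread, which parallel gzopen / gzread) is left out.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for Scid's MFILE_* types and FMODE_* modes.
enum class FileKind { Plain, Gzip, Bzip2 };
enum class OpenMode { ReadOnly, ReadWrite };

// Assumed suffix values; Scid's real GZIP_SUFFIX is not shown in the excerpt.
const std::string GZIP_SUFFIX_STR  = ".gz";
const std::string BZIP2_SUFFIX_STR = ".bz2";

// True if `name` ends with `suffix`.
bool hasSuffix(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Pick the file kind purely from the file name, as the gzip check does.
FileKind classifyByName(const std::string& name) {
    if (hasSuffix(name, GZIP_SUFFIX_STR))  return FileKind::Gzip;
    if (hasSuffix(name, BZIP2_SUFFIX_STR)) return FileKind::Bzip2;
    return FileKind::Plain;
}

// Assumption: compressed files may only be opened read-only,
// mirroring the "read-only for now" rule in the gzip branch.
bool canOpen(const std::string& name, OpenMode mode) {
    FileKind kind = classifyByName(name);
    if (kind != FileKind::Plain && mode != OpenMode::ReadOnly) {
        return false;
    }
    return true;
}
```

With this, classifyByName("games.pgn.bz2") yields FileKind::Bzip2, and canOpen refuses to open that file in read-write mode. The real patch would add a matching branch next to the gzopen call, and the ".pgn.bz2" entry to the file dialog lists in tcl/file.tcl.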