new junkbase
Moderator: Ras
-
- Posts: 1260
- Joined: Sat Dec 13, 2008 7:00 pm
Re: Optimal compression
I'm almost certain that the last time I tried this it was way more efficient to zip up the SCID files than to zip up the PGN.
-
- Posts: 12791
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Optimal compression
Gian-Carlo Pascutto wrote: I'm almost certain that the last time I tried this it was way more efficient to zip up the SCID files than to zip up the PGN.
Besides, the files are available both ways (bzip2-compressed SCID files and bzip2-compressed PGN).
Code: Select all
Directory of C:\dannfast\e_drive\ward-ftp\FTPRoot\pub\scid
05/20/2009 01:05 AM 581,821,018 jbase.sg3.bz2
05/20/2009 01:03 AM 208,961,026 jbase.si3.bz2
05/20/2009 01:01 AM 6,394,724 jbase.sn3.bz2
3 File(s) 797,176,768 bytes
Code: Select all
Directory of C:\dannfast\e_drive\ward-ftp\FTPRoot\pub\a-openings
05/26/2009 06:50 PM 998,832 A00.pgn.bz2
...
05/26/2009 06:55 PM 211,656 A99.pgn.bz2
542 File(s) 273,578,222 bytes
Directory of C:\dannfast\e_drive\ward-ftp\FTPRoot\pub\b-openings
05/26/2009 06:55 PM 45,867 B00$01.pgn.bz2
...
05/26/2009 07:04 PM 9,599 B99y.pgn.bz2
1177 File(s) 391,056,381 bytes
Directory of C:\dannfast\e_drive\ward-ftp\FTPRoot\pub\c-openings
05/26/2009 07:04 PM 258,702 C00$11.pgn.bz2
...
05/26/2009 07:11 PM 667,610 C99.pgn.bz2
1085 File(s) 232,974,751 bytes
Directory of C:\dannfast\e_drive\ward-ftp\FTPRoot\pub\d-openings
05/26/2009 07:11 PM 20,923 D00$05.pgn.bz2
...
05/26/2009 07:17 PM 310,115 D99.pgn.bz2
932 File(s) 225,090,299 bytes
Directory of C:\dannfast\e_drive\ward-ftp\FTPRoot\pub\e-openings
05/26/2009 07:17 PM 255,175 E00.pgn.bz2
...
05/26/2009 07:20 PM 1,759,921 E99.pgn.bz2
558 File(s) 177,728,416 bytes
The compressed PGN files take up about 1300 MB in total, against roughly 797 MB for the compressed SCID files.
So if you already use SCID, then get the SCID files. The download will be faster and you can turn it into PGN if you like. If you don't use SCID, then get the PGN files.
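For anyone who wants to check the arithmetic, here is a small sketch that adds up the byte totals from the two directory listings above. The variable names are mine, and the per-directory PGN figures are the "File(s)" summary lines, not a re-scan of the FTP site.

```python
# Byte counts copied from the directory listings in this post.
scid_bytes = {
    "jbase.sg3.bz2": 581_821_018,
    "jbase.si3.bz2": 208_961_026,
    "jbase.sn3.bz2": 6_394_724,
}
pgn_bytes = {
    "a-openings": 273_578_222,
    "b-openings": 391_056_381,
    "c-openings": 232_974_751,
    "d-openings": 225_090_299,
    "e-openings": 177_728_416,
}

scid_total = sum(scid_bytes.values())
pgn_total = sum(pgn_bytes.values())
ratio = pgn_total / scid_total  # how much larger the PGN download is

print(f"SCID: {scid_total:,} bytes")
print(f"PGN:  {pgn_total:,} bytes")
print(f"PGN / SCID = {ratio:.2f}")
```

By these numbers the bzip2-compressed PGN set is a bit more than one and a half times the size of the bzip2-compressed SCID set.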
Re: Optimal compression
I ran a test. I compressed the three files jbase.sg3, jbase.sn3 and jbase.si3 with NanoZip and ended up with a 631 MB file instead of 760 MB. It took one hour.
-
- Posts: 6081
- Joined: Fri Mar 10, 2006 11:14 pm
- Location: Munster, Nuremberg, Princeton
Re: new junkbase
Third part of my test:
The top program with the most games in the junk is...
SHREDDER vers. 8!!!
With approximately 176,000 games.
Booot and Bright also account for a large share of the games. So does Fritz 8, or its Deep version.
Forget about RYBKA. It appears with only some 10,000 games for some versions.
---------------------------------------------------
So besides the title of junk, we can conclude that it's biased junk.
It's comparable to the long tradition in the CB-supporting forum CSS in Germany, where for years Fritz and others won all "private" tournaments while Rybka wasn't even mentioned.
It's so sad to see.
It goes without saying that I have already made sure that no games from my own career are in that junkbase. Sorry.
-Popper and Lakatos are good but I'm stuck on Leibowitz
-
- Posts: 12791
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: new junkbase
There are 81392 Zappa games
There are 22065 ZapChess games
There are 156677 Naum games
There are 215323 Rybka games
There are 569977 Shredder games
There are 670012 Fritz games
I think that the counts are mostly a function of how long the programs have been around.
But the biggest problem with the database is the inconsistency in naming of the players (IMO).
For instance, there are 6545 distinct player names containing the substring 'shredder'.
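To illustrate the naming problem, here is a rough sketch of how one could list the distinct player names that contain a given substring. It is not the query I actually ran; it assumes plain PGN text with standard `[White "..."]` and `[Black "..."]` tags, and the sample games and the helper names (`player_names`, `names_containing`) are made up for the example.

```python
import re
from collections import Counter

# Matches only the White and Black tags (not WhiteElo, BlackElo, etc.).
TAG_RE = re.compile(r'^\[(White|Black)\s+"(.*)"\]$')

def player_names(pgn_text):
    """Yield the value of every White/Black tag in the PGN text."""
    for line in pgn_text.splitlines():
        match = TAG_RE.match(line.strip())
        if match:
            yield match.group(2)

def names_containing(pgn_text, substring):
    """Count appearances of each distinct name containing substring (case-insensitive)."""
    needle = substring.lower()
    return Counter(n for n in player_names(pgn_text) if needle in n.lower())

# Hypothetical sample data, only to show the effect of inconsistent naming.
SAMPLE_PGN = """\
[White "Shredder 8"]
[Black "Rybka 2.3.2a"]

[White "Deep Shredder 11 x64"]
[Black "Shredder 8"]

[White "SHREDDER 8.0"]
[Black "Fritz 8"]
"""

shredders = names_containing(SAMPLE_PGN, "shredder")
for name, count in shredders.most_common():
    print(count, name)
```

Even in this tiny sample the same engine family shows up under three different spellings, which is why a per-engine game count depends so heavily on how the names are grouped.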