PGN Database updated

Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

PGN Database updated

Post by Norm Pollock »

My "Grand" collection of PGN databases has been substantially updated. It is a free download. It features human-versus-human games played over the board at long time controls, separated into time periods. Filtered out (as best as possible, based on information in the tags) are short time-control, blindfold, correspondence, simultaneous, exhibition, and playoff games. Duplicate and near-duplicate games, and games of 25 moves or fewer, were filtered out using utility programs. Player names have been consolidated where it seemed reasonable that two similar names referred to the same player.

http://www.hoflink.com/~npollock/chess.html

-Norm
James Constance
Posts: 358
Joined: Wed Mar 08, 2006 8:36 pm
Location: UK

Re: PGN Database updated

Post by James Constance »

Nice work! Thanks.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: PGN Database updated

Post by sje »

I took a look at the GrandLE file. Symbolic's opening classifier detected 463 different common name openings (the Opening PGN tag) and 1,136 of the 10,000 possible SOC openings.
ArmyBridge

Re: PGN Database updated

Post by ArmyBridge »

Thanks a lot, Norman. Very nice work 8-)
Regards
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: PGN Database updated

Post by Norm Pollock »

sje wrote:I took a look at the GrandLE file. Symbolic's opening classifier detected 463 different common name openings (the Opening PGN tag) and 1,136 of the 10,000 possible SOC openings.
Can you tell me more about the "Symbolic" opening classifier?

I don't know exactly what this means, or whether what you say is good or bad. But I can tell you GrandLe is a very small database. It contains only 6158 games, from 1831 through 1890. And of course this is well before "modern" openings began.

The ECO codes in this file were re-determined by SCID and they use the extended SCID ECO code system (3 or 4 characters). SCID takes transpositions into account, whereas some others look at openings in a very literal way.
sje
Posts: 4675
Joined: Mon Mar 13, 2006 7:43 pm

Re: PGN Database updated

Post by sje »

"SOC" is Standard Opening Code, a four-digit opening classification system based on analysis of millions of high-level games. It is not specific to any program, is not proprietary, and has numerous technical advantages over ECO and NIC. It does not include language-specific names of openings, although these could be added.

Features:

1) High resolution with 10,000 distinct variations.

2) Variations are lexicographically ordered by SAN sequence, so the translation dictionary is easily compressible while retaining fast access.

3) Easy to code look-up.

4) Each variation that contains subvariations has a precalculated index span that includes all subvariations. This makes the scheme one big happy move tree.

5) Can be used as an effective compressor for PGN game scores.

6) Move order is significant; no confusing transposition mapping.

7) Constructed automatically, so free from error and bias.

8) Data is freely available and can be freely hosted and posted. Just email me and the 448 KB ASCII table is on its way to you.

Sample:

[0000:9999]
[0001:0033] Nc3
[0002:0004] Nc3 Nf6
[0003:0004] Nc3 Nf6 e4
[0004:0004] Nc3 Nf6 e4 e5
[0005:0012] Nc3 c5
[0006:0010] Nc3 c5 Nf3
[0007:0010] Nc3 c5 Nf3 Nc6
[0008:0010] Nc3 c5 Nf3 Nc6 d4
[0009:0010] Nc3 c5 Nf3 Nc6 d4 cxd4
[0010:0010] Nc3 c5 Nf3 Nc6 d4 cxd4 Nxd4
[0011:0012] Nc3 c5 e4
[0012:0012] Nc3 c5 e4 Nc6
[0013:0023] Nc3 d5
[0014:0015] Nc3 d5 d4
[0015:0015] Nc3 d5 d4 Nf6
[0016:0023] Nc3 d5 e4
[0017:0020] Nc3 d5 e4 d4
[0018:0020] Nc3 d5 e4 d4 Nce2
[0019:0020] Nc3 d5 e4 d4 Nce2 e5
[0020:0020] Nc3 d5 e4 d4 Nce2 e5 Ng3
[0021:0022] Nc3 d5 e4 dxe4
[0022:0022] Nc3 d5 e4 dxe4 Nxe4
[0023:0023] Nc3 d5 e4 e6
[0024:0030] Nc3 e5
[0025:0029] Nc3 e5 Nf3
[0026:0029] Nc3 e5 Nf3 Nc6
[0027:0029] Nc3 e5 Nf3 Nc6 d4
[0028:0029] Nc3 e5 Nf3 Nc6 d4 exd4
[0029:0029] Nc3 e5 Nf3 Nc6 d4 exd4 Nxd4
[0030:0030] Nc3 e5 e4
[0031:0031] Nc3 e6
[0032:0033] Nc3 g6
[0033:0033] Nc3 g6 e4
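
A minimal sketch of how the look-up and the index spans might be used, in Python. It assumes each table row has the form "[code:span_end] SAN moves" as in the sample above, and it assumes that a game is assigned the code of the deepest listed variation that its moves follow; that rule is a guess for illustration, not something stated in the table itself. The function names are hypothetical.

```python
import re

# A few rows copied from the sample above.
SAMPLE = """\
[0000:9999]
[0001:0033] Nc3
[0002:0004] Nc3 Nf6
[0003:0004] Nc3 Nf6 e4
[0004:0004] Nc3 Nf6 e4 e5
[0005:0012] Nc3 c5
[0006:0010] Nc3 c5 Nf3
[0011:0012] Nc3 c5 e4
[0012:0012] Nc3 c5 e4 Nc6
"""

# "[code:span_end]" followed by an optional SAN move list.
ROW = re.compile(r"\[(\d{4}):(\d{4})\]\s*(.*)")


def parse_table(text):
    """Map each SAN move sequence (as a tuple) to (code, span_end)."""
    table = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            moves = tuple(m.group(3).split())
            table[moves] = (int(m.group(1)), int(m.group(2)))
    return table


def classify(table, moves):
    """Return the code of the longest listed prefix of `moves`.

    Assumption: the deepest matching variation is the game's code.
    Returns None only if the table has no root row.
    """
    for n in range(len(moves), -1, -1):
        hit = table.get(tuple(moves[:n]))
        if hit is not None:
            return hit[0]
    return None


def is_subvariation(table, code, parent_moves):
    """True if `code` falls inside the index span of the parent variation."""
    lo, hi = table[tuple(parent_moves)]
    return lo <= code <= hi


table = parse_table(SAMPLE)
code = classify(table, ["Nc3", "c5", "e4", "Nc6", "Nf3"])  # 12
print(code, is_subvariation(table, code, ["Nc3", "c5"]))
```

Because every row carries its own span, asking whether one variation lies under another is a single range comparison and needs no tree walk.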

Another sample:

[5814:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6
[5815:5816] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be2
[5816:5816] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be2 Bg7
[5817:5823] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3
[5818:5820] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3 Bg7
[5819:5820] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3 Bg7 Nc3
[5820:5820] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3 Bg7 Nc3 Nf6
[5821:5823] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3 Nf6
[5822:5823] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3 Nf6 Nc3
[5823:5823] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Be3 Nf6 Nc3 Bg7
[5824:5849] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3
[5825:5849] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7
[5826:5848] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3
[5827:5847] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6
[5828:5839] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4
[5829:5834] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 O-O
[5830:5834] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 O-O Bb3
[5831:5831] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 O-O Bb3 a5
[5832:5834] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 O-O Bb3 d6
[5833:5834] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 O-O Bb3 d6 f3
[5834:5834] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 O-O Bb3 d6 f3 Bd7
[5835:5838] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 Qa5
[5836:5838] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 Qa5 O-O
[5837:5838] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 Qa5 O-O O-O
[5838:5838] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 Qa5 O-O O-O Bb3
[5839:5839] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Bc4 d6
[5840:5842] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Be2
[5841:5842] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Be2 O-O
[5842:5842] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Be2 O-O O-O
[5843:5845] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Nxc6
[5844:5845] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Nxc6 bxc6
[5845:5845] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 Nxc6 bxc6 e5
[5846:5847] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 f3
[5847:5847] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 Nf6 f3 O-O
[5848:5848] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Be3 d6
[5849:5849] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nc3 Bg7 Nb3
[5850:5852] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6
[5851:5852] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6 bxc6
[5852:5852] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6 bxc6 Qd4
[5853:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4
[5854:5871] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7
[5855:5871] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3
[5856:5871] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6
[5857:5871] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3
[5858:5863] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 Ng4
[5859:5863] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 Ng4 Qxg4
[5860:5863] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 Ng4 Qxg4 Nxd4
[5861:5863] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 Ng4 Qxg4 Nxd4 Qd1
[5862:5863] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 Ng4 Qxg4 Nxd4 Qd1 Ne6
[5863:5863] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 Ng4 Qxg4 Nxd4 Qd1 Ne6 Rc1
[5864:5869] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 O-O
[5865:5869] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 O-O Be2
[5866:5866] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 O-O Be2 b6
[5867:5869] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 O-O Be2 d6
[5868:5869] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 O-O Be2 d6 O-O
[5869:5869] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 O-O Be2 d6 O-O Bd7
[5870:5871] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 d6
[5871:5871] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Bg7 Be3 Nf6 Nc3 d6 Be2
[5872:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6
[5873:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3
[5874:5877] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 Nxd4
[5875:5877] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 Nxd4 Qxd4
[5876:5877] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 Nxd4 Qxd4 d6
[5877:5877] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 Nxd4 Qxd4 d6 Be2
[5878:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 d6
[5879:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 d6 Be2
[5880:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 d6 Be2 Nxd4
[5881:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 d6 Be2 Nxd4 Qxd4
[5882:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 d6 Be2 Nxd4 Qxd4 Bg7
[5883:5883] e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 c4 Nf6 Nc3 d6 Be2 Nxd4 Qxd4 Bg7 Bg5
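
One way the table could serve as a compressor for game scores, sketched in Python: store the code of the longest listed opening prefix plus only the moves that follow it. The four rows are copied from the sample above; the encode/decode functions and the use of code 0 for "no listed prefix" are assumptions made for illustration, not a defined SOC format.

```python
# Rows copied from the sample above: code -> SAN move sequence.
ROWS = {
    5814: "e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6",
    5850: "e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6",
    5851: "e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6 bxc6",
    5852: "e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6 bxc6 Qd4",
}
BY_CODE = {code: tuple(line.split()) for code, line in ROWS.items()}
BY_MOVES = {moves: code for code, moves in BY_CODE.items()}


def encode(moves):
    """Split a game into (code of longest listed prefix, remaining moves)."""
    moves = tuple(moves)
    for n in range(len(moves), 0, -1):
        code = BY_MOVES.get(moves[:n])
        if code is not None:
            return code, list(moves[n:])
    # Assumption: code 0 stands for the empty prefix (the root row).
    return 0, list(moves)


def decode(code, tail):
    """Rebuild the full move list from a code and the stored tail."""
    prefix = () if code == 0 else BY_CODE[code]
    return list(prefix) + list(tail)


game = "e4 c5 Nf3 Nc6 d4 cxd4 Nxd4 g6 Nxc6 bxc6 Qd4 Nf6 e5".split()
code, tail = encode(game)
print(code, tail)                  # 5852 ['Nf6', 'e5']
print(decode(code, tail) == game)  # True
```

Since move order is significant and no transposition mapping is applied, the code identifies exactly one move sequence, which is what makes the round trip lossless.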
Kohflote
Posts: 219
Joined: Wed Sep 19, 2007 11:07 am
Location: Singapore

Re: PGN Database updated

Post by Kohflote »

Dear Norman,

I am from Singapore but am residing in China at the moment. I cannot access the URL you gave from China. Is there an alternative, please?

Thank you.


Yours sincerely,

Kah Huat, Koh
Norm Pollock
Posts: 1056
Joined: Thu Mar 09, 2006 4:15 pm
Location: Long Island, NY, USA

Re: PGN Database updated

Post by Norm Pollock »

Kohflote wrote:Dear Norman,

I am from Singapore but am residing in China at the moment. I cannot access the URL you gave from China. Is there an alternative, please?

Thank you.


Yours sincerely,

Kah Huat, Koh
My URL just goes to a links page. It is not the site where the download files are located. The primary download site is www.orbitfiles.com and the mirror site is www.zshare.net. Maybe at least one of these is not blocked. Here are direct links for one of the download files. Let's see if one of them can get around the blockage.

http://www.orbitfiles.com/download/id3266741652.html
GrandLe (1M)

http://www.zshare.net/download/18678228a8370266/
mirror

Maybe a better idea: use this link to download, but click on "Open" instead of "Save". This should give you a copy of my links page.

http://www.orbitfiles.com/download/id3342794487.html

From this page you can download the files.