Jose is slow with 1/2 million games.
Christopher Conkie wrote: Would it be possible to use SQL like Jose?
Dann Corbit wrote: With 10 M games cobbled together by me, there is really no chance that I will find time to clean it up properly. Perhaps someone else will do it.
Edmund wrote: The question really is, what the data should be used for ..
Dann Corbit wrote: ... The collection has actually grown so large now that there are not really any tools that handle it well. ChessAssistant, ChessBase, Scid... All of them die if I feed the whole pile to them and ask the tool to do something useful. So I am not sure how you can fully utilize the data, but have fun trying....
If you want to query the games of a certain player or of a certain tournament, then the Scid format is great. But for this case the database (jbase) could be cleaned out a lot. E.g. I find a couple of games of the following type:

[Event "?"]
[Site "?"]
[Date "????.??.??"]
[Round "?"]
[White "?"]
[Black "?"]
[Result "0-1"]
[ECO "A00h"]
[Variation "Durkin"]
[Annotator ""]
[Source ""]
[Remark ""]

1. Na3 g5 2. Nc4 0-1

This looks more like general instructions for an opening book to me and has no value in a games database.
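One way such entries could be flagged automatically is sketched below. This is only a rough heuristic under stated assumptions: the names `parse_game` and `looks_like_book_stub`, the choice of tags to inspect, and the `min_plies` threshold of 10 are all made up for illustration, and the parser assumes plain movetext without comments or variations.

```python
import re

# Matches a PGN tag pair such as [White "?"]; escaped quotes are not handled.
TAG_RE = re.compile(r'\[(\w+)\s+"([^"]*)"\]')
# Matches move numbers such as "1." or "1...".
MOVE_NUMBER_RE = re.compile(r"\d+\.+")
RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}
# Hypothetical choice: tags that identify a real game.
IDENTITY_TAGS = ("Event", "Site", "White", "Black")


def parse_game(pgn_text):
    """Return (tags, moves) for a single game given as PGN text."""
    tags = dict(TAG_RE.findall(pgn_text))
    movetext = TAG_RE.sub(" ", pgn_text)           # drop the tag section
    movetext = MOVE_NUMBER_RE.sub(" ", movetext)   # drop move numbers
    moves = [tok for tok in movetext.split() if tok not in RESULTS]
    return tags, moves


def looks_like_book_stub(pgn_text, min_plies=10):
    """Heuristic: no player/event data AND only a handful of plies."""
    tags, moves = parse_game(pgn_text)
    anonymous = all(tags.get(name, "?") in ("", "?") for name in IDENTITY_TAGS)
    return anonymous and len(moves) < min_plies


sample = (
    '[Event "?"] [Site "?"] [Date "????.??.??"] [Round "?"] '
    '[White "?"] [Black "?"] [Result "0-1"] [ECO "A00h"] '
    '[Variation "Durkin"] 1. Na3 g5 2. Nc4 0-1'
)
print(looks_like_book_stub(sample))
```

A filter like this would only produce candidates for removal; short anonymous games can still be genuine, so the threshold would need tuning against the actual collection.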
However, if you want to use the database as a foundation for answering questions like "what have players played in this position before?", I would rather suggest transferring the database into another type of structure, either tree based or position based. The tree-based one is probably the most compact way of storing the database (and without any loss of data), while the position-based version catches transpositions and can also find positions similar to the current one, but in exchange it requires more space and loses some information about the games.
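To make the difference between the two structures concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `MoveTree` class, `build_position_index`, and especially `toy_position_key`, which is only a crude stand-in for a real board hash (a real implementation would replay the moves and key on something like a FEN string or a Zobrist hash).

```python
from collections import defaultdict


class MoveTree:
    """Prefix tree over move sequences; games sharing an opening share nodes."""

    def __init__(self):
        self.children = {}   # move (SAN string) -> MoveTree
        self.game_ids = []   # ids of games that end at this node
        self.count = 0       # number of games passing through this node

    def add_game(self, game_id, moves):
        node = self
        node.count += 1
        for move in moves:
            node = node.children.setdefault(move, MoveTree())
            node.count += 1
        node.game_ids.append(game_id)

    def continuations(self, moves):
        """Moves played after exactly this move order, with game counts."""
        node = self
        for move in moves:
            node = node.children.get(move)
            if node is None:
                return {}
        return {m: child.count for m, child in node.children.items()}


def toy_position_key(moves):
    # ASSUMPTION: crude stand-in for a board hash. It treats two lines as the
    # same position when each side played the same set of moves, which only
    # holds for simple move-order transpositions.
    return (tuple(sorted(moves[0::2])), tuple(sorted(moves[1::2])))


def build_position_index(games, key_fn=toy_position_key):
    """Map position key -> {next move: count}; per-game move order is lost."""
    index = defaultdict(lambda: defaultdict(int))
    for moves in games.values():
        for ply in range(len(moves)):
            index[key_fn(moves[:ply])][moves[ply]] += 1
    return index


games = {
    1: ["d4", "Nf6", "c4", "e6"],
    2: ["c4", "Nf6", "d4", "e6"],   # transposes into game 1 after 2.d4
    3: ["d4", "d5", "c4", "e6"],
}

tree = MoveTree()
for game_id, moves in games.items():
    tree.add_game(game_id, moves)
index = build_position_index(games)

line = ["d4", "Nf6", "c4"]
print(tree.continuations(line))              # sees only game 1
print(dict(index[toy_position_key(line)]))   # sees games 1 and 2
```

The tree keeps every game recoverable (the ids sit at the node where each game ends), but it treats 1.d4 Nf6 2.c4 and 1.c4 Nf6 2.d4 as unrelated branches. The position index merges them, at the cost of one entry per position reached and no record of which game or move order led there.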
I am thinking about writing my own database interface. I can't think of any other way to get what I want.