SCID5 trials: rocksdb

Fulvio · Post by **Fulvio** » Sun Jan 09, 2022 2:46 pm

Rocksdb is a C++ library that implements a database made only of key-value records.
I did some testing to see if it can be used as a successor of the SCID4 format.

For the test I used the 2021-11 lichess database:
https://database.lichess.org/standard/l ... 11.pgn.bz2

The downloaded file (lichess_db_standard_rated_2021-11.pgn.bz2) is 22.1GB and the decompressed PGN is 187GB.
It contains over 87 million games.

Converting the database to the new rocksdb format took 34 minutes.
The maximum game limit of the new format is 4 billion.
The disk usage (Samsung SSD 970 evo plus) during the conversion never went above 30%.
During the positional search however it reached 100% (200MB/s), probably because rocksdb by default use 16 threads.
The peak memory usage was 13GB.
The final database size is 32.3GB (22.5GB compressed with 7zip).

Opening the converted database takes 6 minutes and 20 seconds.
The time to search for a position is highly variable: it ranges from 12 seconds to 144 seconds.

The SCID4 format cannot hold that many games, so I used the first 3 million games to make a performance comparison.
The 3 million games PGN file size is 5.87GB (912MB compressed with 7zip).

Converted to SCID4 format:
- size: 3.92GB (554MB compressed with 7zip)
- database opening: 0.3 seconds
- search for a position: ranges from 0.034 seconds to 0.704 seconds.
The best performance with the SCID4 format is achieved by removing all comments and extra tags and then compacting the database.
Best result for SCID4 position search: ranges from 0.021 seconds to 0.115 seconds.

The best results I've been able to get with rocksdb are:
- size: 1.05GB (770MB compressed with 7zip)
- database opening: 1.250 seconds
- search for a position: ranges from 0.034 seconds to 0.918 seconds
(https://github.com/benini/scid/blob/d84 ... esults.txt)

There are many parameters to optimize rocksdb and I have tried to tune some of them.
The code and the various commit are on the rocksdb branch in my github repository:
https://github.com/benini/scid/blob/d84 ... sdb.h#L201

If there is any rocksdb expert, advice on how to better optimize it are very welcome.

Fulvio · Post by **Fulvio** » Sun Jan 16, 2022 7:05 pm

Fulvio wrote: ↑Sun Jan 09, 2022 2:46 pm Opening the converted database takes 6 minutes and 20 seconds.

The time it took to open the database seemed very long and I decided to investigate.
The results are quite interesting.
In the rocksdb database the tag values were stored as:
key: unsigned 32-bit ID
value: string

Two functions are necessary for tags:
1) get_tags (gameID) -> list of Tag pairs
Returns the tags associated with a game, for example:
get_tags (1) ->
"White": valueID
"WhiteTitle": "GM"

2) get_value (valueID) -> string
For some values that are repeated in different games, such as player names, we assign a 32-bit valueID to the string and this function is used to retrieve the value, for example
get_value (1) -> "Carlsen"

There is also another fundamental function:
3) get_ gameIDs_sorted_by (Tag, n_results) -> list of gameIDs
which returns the games sorted according to the values of a tag.
It is used for example to obtain the list of games sorted by date, elo, white player, etc...
SCID uses the two functions above to create an index in memory and implement this function efficiently.

The values in rocksdb were stored by valueID, and everything seemed ok.

However, there is another function that is handy:
4) get_values (TagID, prefix, n_results) -> list of strings
It is used to suggest the value of a tag when saving a game.
For example if the user starts typing "Carl" for the White tag we want to suggest the first values starting with "Carl". To get those values with just the above functions it would be necessary to scan the whole database.
So, only for the main tags (White, Black, Event, Site, Round) a memory index is created when the database is opened. In the index the records are ordered by tag value:
key: const char * tag_value
valueID: unsigned 32-bit ID

Profiling the program, I was surprised to find that over 80% of the time it took to open the database was spent building that index.
The problem is that, in the lichess database, for each game there is a unique tag:
Site: https://lichess.org/pAwbu5vT
In the database there are over 87 million games, a record in the index takes up 12 bytes, and the Site's index becomes over 1GB: a lot of time was spent sorting all those tags!

I tried storing in the rocksdb the tags sorted by value and the time to open the database is significantly reduced to less than a minute.

JohnS · Post by **JohnS** » Mon Jan 17, 2022 5:56 am

Fulvio wrote: ↑Sun Jan 16, 2022 7:05 pm
Profiling the program, I was surprised to find that over 80% of the time it took to open the database was spent building that index.
The problem is that, in the lichess database, for each game there is a unique tag:
Site: https://lichess.org/pAwbu5vT
In the database there are over 87 million games, a record in the index takes up 12 bytes, and the Site's index becomes over 1GB: a lot of time was spent sorting all those tags!

I tried storing in the rocksdb the tags sorted by value and the time to open the database is significantly reduced to less than a minute.

This is all due to the crazy lichess pgn format. I like the simple suggestion from https://www.hiarcs.net/forums/viewtopic ... 3&start=15 to change the "Site" tag to "Source" to deal with this artificial problem.

Ozymandias · Post by **Ozymandias** » Mon Jan 17, 2022 7:38 am

Or use Norm Pollock's tools to delete/clean tags. I suppose pgn-extract could also be used.

Fulvio · Post by **Fulvio** » Mon Jan 17, 2022 8:54 am

JohnS wrote: ↑Mon Jan 17, 2022 5:56 am This is all due to the crazy lichess pgn format. I like the simple suggestion from https://www.hiarcs.net/forums/viewtopic ... 3&start=15 to change the "Site" tag to "Source" to deal with this artificial problem.

Thanks for the link, it's a very interesting discussion.

I don't know if lichess can be blamed.
If I'm not mistaken, the developer of HCE is the one who started ChessX, which in turn is derived from SCID.
In that forum Rudolf wrote: "There are four tables: Games, Players, Events and Sites. In games table tags White, Black, Event and Site are stored as indices to Players, Events and Sites "
This is the original sin, apparently still used in all the various versions.

There is another interesting thing that I noticed while experimenting with the various compression algorithms in rocksdb.
Intuitively we expect to get a smaller database, in exchange for higher CPU usage and therefore slower read times.
It is true for example using zstd.
But decompression with lz4 is blazingly fast.
So fast that reading compressed data + decompressing is faster than reading the data already uncompressed!

SCID5 trials: rocksdb

SCID5 trials: rocksdb

Re: SCID5 trials: rocksdb

Re: SCID5 trials: rocksdb

Re: SCID5 trials: rocksdb

Re: SCID5 trials: rocksdb