SCID4 database

Discussion of chess software programming and technical issues.

Moderator: Ras

dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: SCID4 database

Post by dangi12012 »

As I told you - with SQL you get O(1) lookup performance on primary keys. You dont have to take it from me - literally anyone working in IT industry with some SQL experience can verify.

You can even have your database explain what it would do in terms of performance to optimize performance: https://www.sqlite.org/eqp.html

Looking this stuff up will cost you more than the 10 minutes it takes to define two tables and try it out yourself ;)
But creating some SCID4 binary schema yourself can be also a very satisfying - I wont post here anymore. You do your thing.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
glav
Posts: 60
Joined: Sun Apr 07, 2019 1:10 am
Full name: Giovanni Lavorgna

Re: SCID4 database

Post by glav »

Hi Fulvio. First of all I would like to thank you and your team for all the hard work you have put into scid development in all these years. I use the program quite often and I love it.
As manys user will also tell you, search after game indexing is already very fast and other improvements in this direction, though wellcome, would go probably vastly unannoticed. Along the same line, a more compact encoding, though helpful, would just be the cherry on the cake. Instead, one thing that could really make the difference in the present scenario for scid users is deep position analysis, say a la IDEA in the Aquarium package. Do you envision the possibility to implement such a feature later on? Thanks and sorry for hijacking your thread.
User avatar
phhnguyen
Posts: 1524
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: SCID4 database

Post by phhnguyen »

Sorry for jumping into your posts. Some people requested me to support SCID databases thus it is on my to-do list for a while - your topic will help me in the coming time. Thanks!

Since any binary data is always hard to understand and support by other apps. The situation is worse since we have many different formats for chess databases. I wish we have an open standard for chess game databases thus we could exchange data easily between chess GUIs and other applications.
Fulvio wrote: Wed Oct 06, 2021 12:14 pm And the main reason why a classic SQL database is not used is to obtain better performance.
dangi12012 wrote: Thu Oct 07, 2021 8:01 pm This looks a lot like what would normally be inside list of pgn files and you need the 4 relational tables from that.
Why not have a normal relational database in sqlite?

The language bindings are there for C++. C#, Java and every Language you can think of. That way any language could read the project file and interact with it.

As you said 20 years ago - there was little notion of open source and every developer had to come up with something on his own.
IMO sqlite has performance - is tested in enterprise software - and has good example code available?

When size is an issue you cann still store compressed blobs. Either way performance will be very good.

What would be your motivation to improve readability and performance of this project? - when the replacement is tested free open source enterprise grade maintained code with a lot of bindings?

I have some experience using both binary and SQL databases for chess. I knew SQL has some drawbacks as well as some advantages but IMO, the drawbacks are not serious, especially for nowadays computers, storage, the Internet speed. Thus it could play the role of an open chess database standard, where people could exchange data.

If you all don’t mind, I or someone may create a new topic to build that standard, focusing on SQL databases. I hope even some apps may not fully work with but support by exporting, importing to that kind of database.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: SCID4 database

Post by dangi12012 »

phhnguyen wrote: Wed Oct 13, 2021 6:49 am Sorry for jumping into your posts. Some people requested me to support SCID databases thus it is on my to-do list for a while - your topic will help me in the coming time. Thanks!

Since any binary data is always hard to understand and support by other apps. The situation is worse since we have many different formats for chess databases. I wish we have an open standard for chess game databases thus we could exchange data easily between chess GUIs and other applications.
Fulvio wrote: Wed Oct 06, 2021 12:14 pm And the main reason why a classic SQL database is not used is to obtain better performance.
dangi12012 wrote: Thu Oct 07, 2021 8:01 pm This looks a lot like what would normally be inside list of pgn files and you need the 4 relational tables from that.
Why not have a normal relational database in sqlite?

The language bindings are there for C++. C#, Java and every Language you can think of. That way any language could read the project file and interact with it.

As you said 20 years ago - there was little notion of open source and every developer had to come up with something on his own.
IMO sqlite has performance - is tested in enterprise software - and has good example code available?

When size is an issue you cann still store compressed blobs. Either way performance will be very good.

What would be your motivation to improve readability and performance of this project? - when the replacement is tested free open source enterprise grade maintained code with a lot of bindings?

I have some experience using both binary and SQL databases for chess. I knew SQL has some drawbacks as well as some advantages but IMO, the drawbacks are not serious, especially for nowadays computers, storage, the Internet speed. Thus it could play the role of an open chess database standard, where people could exchange data.

If you all don’t mind, I or someone may create a new topic to build that standard, focusing on SQL databases. I hope even some apps may not fully work with but support by exporting, importing to that kind of database.
The beautiful thing about SQL would be that you can use any backend. SQLite, MS SQL, Mariadb, mySQL etc..
We would only need a standard for the table layout.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: SCID4 database

Post by Fulvio »

glav wrote: Tue Oct 12, 2021 2:48 pm Instead, one thing that could really make the difference in the present scenario for scid users is deep position analysis, say a la IDEA in the Aquarium package.
Hi Giovanni,
it is a feature that I do not know.
If I understand it correctly, it starts from a position and the program automatically advances analyzing the most interesting variations.
I guess the user sets a time/depth limit and how many alternative variations to analyze.
It can be interesting.
Can I ask you to open a ticket on the site? ( https://sourceforge.net/p/scid/feature-requests/ )
That way it is tracked and if you can describe as precisely as possible how it works (or insert some links where it is explained) it would be of great help.
Fulvio
Posts: 396
Joined: Fri Aug 12, 2016 8:43 pm

Re: SCID4 database

Post by Fulvio »

phhnguyen wrote: Wed Oct 13, 2021 6:49 am Some people requested me to support SCID databases thus it is on my to-do list for a while - your topic will help me in the coming time. Thanks!
What license does Banksia have? If it is compatible you can do as chessx and use the SCID code directly:
https://github.com/Isarhamster/chessx/t ... r/dep/scid
phhnguyen wrote: Wed Oct 13, 2021 6:49 am I wish we have an open standard for chess game databases thus we could exchange data easily between chess GUIs and other applications.
In my opinion the SQL (a readable format used for exchanging data) for chess databases is PGN.
phhnguyen wrote: Wed Oct 13, 2021 6:49 am If you all don’t mind, I or someone may create a new topic to build that standard, focusing on SQL databases. I hope even some apps may not fully work with but support by exporting, importing to that kind of database.
Please, create a new topic.
Honestly it's a discussion I've had a lot of times already.
Even just looking at this thread: 3 software have been mentioned (SCID, lichess and Tarrasch Chess GUI) and none use SQL.
However, there is always someone who comes up with the final solution: use a SQL database!
And if you ask for something real that can back up their claims the reply is "the 10 minutes it takes to define two tables and try it out yourself".
As if we were all idiots and hadn't already tried: I even posted the link to a program that used sqlite and the code I tried rocksdb with.
So I personally will only be interested in a chess SQL database when someone provides verifiable evidence with similar performance to the video:
- a database with over 6 million games
- move e4: position search, stats calculated on the fly (for each next move the number of games won/drawn/ lost) and the list of top games sortable by column
- move e5: position search, stats calculated on the fly (for each next move the number of games won/drawn/ lost) and the list of top games sortable by column
etc ...
User avatar
phhnguyen
Posts: 1524
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: SCID4 database

Post by phhnguyen »

Fulvio wrote: Wed Oct 13, 2021 11:22 am
phhnguyen wrote: Wed Oct 13, 2021 6:49 am Some people requested me to support SCID databases thus it is on my to-do list for a while - your topic will help me in the coming time. Thanks!
What license does Banksia have? If it is compatible you can do as chessx and use the SCID code directly:
https://github.com/Isarhamster/chessx/t ... r/dep/scid
Thank you. However, I can't use those codes. On one hand, even I will publish my code soon, it is still a closed one at the moment thus I can't use any restricted opensource. On the other hand, I have my own data structures and many functions should link, work with any new data structures. That requires me to understand deeply. When I understand, create a new code is not hard (but the hard thing is to start ;) ).
Fulvio wrote: Wed Oct 13, 2021 11:22 am
phhnguyen wrote: Wed Oct 13, 2021 6:49 am I wish we have an open standard for chess game databases thus we could exchange data easily between chess GUIs and other applications.
In my opinion the SQL (a readable format used for exchanging data) for chess databases is PGN.
I totally agreed. However, PGN is just a baseline with many drawbacks we all have known, say, too slow and large to process, redundant, almost no structure/link between data, very hard to do any search even by names, date, results... If we are stuck on it forever, we will ignore many benefits from new and better hardware, software.
Fulvio wrote: Wed Oct 13, 2021 11:22 am
phhnguyen wrote: Wed Oct 13, 2021 6:49 am If you all don’t mind, I or someone may create a new topic to build that standard, focusing on SQL databases. I hope even some apps may not fully work with but support by exporting, importing to that kind of database.
Please, create a new topic.
Honestly it's a discussion I've had a lot of times already.
Even just looking at this thread: 3 software have been mentioned (SCID, lichess and Tarrasch Chess GUI) and none use SQL.
However, there is always someone who comes up with the final solution: use a SQL database!
And if you ask for something real that can back up their claims the reply is "the 10 minutes it takes to define two tables and try it out yourself".
As if we were all idiots and hadn't already tried: I even posted the link to a program that used sqlite and the code I tried rocksdb with.
So I personally will only be interested in a chess SQL database when someone provides verifiable evidence with similar performance to the video:
- a database with over 6 million games
I think I can take that as a challenge I can solve. Give me a few days (perhaps 1 week since I still have some other tasks to do) I will come back with a source code thus we all can discuss/continue.
Fulvio wrote: Wed Oct 13, 2021 11:22 am - move e4: position search, stats calculated on the fly (for each next move the number of games won/drawn/ lost) and the list of top games sortable by column
- move e5: position search, stats calculated on the fly (for each next move the number of games won/drawn/ lost) and the list of top games sortable by column
etc ...
IMO, searching games is always a huge challenge with any kind of database, from text, binary to SQL formats. I think we can solve it by having a high processing speed and/or a redundant index system. We can discuss and find solutions for searching later anyway.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
Rein Halbersma
Posts: 749
Joined: Tue May 22, 2007 11:13 am

Re: SCID4 database

Post by Rein Halbersma »

dangi12012 wrote: Mon Oct 11, 2021 10:38 pm As I told you - with SQL you get O(1) lookup performance on primary keys. You dont have to take it from me - literally anyone working in IT industry with some SQL experience can verify.
Lookup on indices is O(log N), not O(1).
dangi12012
Posts: 1062
Joined: Tue Apr 28, 2020 10:03 pm
Full name: Daniel Infuehr

Re: SCID4 database

Post by dangi12012 »

Rein Halbersma wrote: Wed Oct 13, 2021 2:04 pm
dangi12012 wrote: Mon Oct 11, 2021 10:38 pm As I told you - with SQL you get O(1) lookup performance on primary keys. You dont have to take it from me - literally anyone working in IT industry with some SQL experience can verify.
Lookup on indices is O(log N), not O(1).
100% Wrong.
Worlds-fastest-Bitboard-Chess-Movegenerator
Daniel Inführ - Software Developer
R. Tomasi
Posts: 307
Joined: Wed Sep 01, 2021 4:08 pm
Location: Germany
Full name: Roland Tomasi

Re: SCID4 database

Post by R. Tomasi »

dangi12012 wrote: Wed Oct 13, 2021 4:30 pm
Rein Halbersma wrote: Wed Oct 13, 2021 2:04 pm
dangi12012 wrote: Mon Oct 11, 2021 10:38 pm As I told you - with SQL you get O(1) lookup performance on primary keys. You dont have to take it from me - literally anyone working in IT industry with some SQL experience can verify.
Lookup on indices is O(log N), not O(1).
100% Wrong.
https://medium.com/@JasonWyatt/squeezin ... e175f3c346