Hello,
I was wondering what would be the best way to find a specific position in a large database (from a FEN for example).
I noticed that for example in Chessbase, it is not instant at all when you look for the games that match a specific position.
However, they have a statistics that show the percentages reached in the current position and that is instant.
I am also wondering why. (maybe they don't count positions reached from other moves combinations?)
Thanks,
Fastest way to find games that match a position in a large database
Moderators: hgm, Rebel, chrisw
-
- Posts: 72
- Joined: Sun Nov 01, 2020 5:27 pm
- Full name: Richard Porti
-
- Posts: 11585
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
Re: Fastest way to find games that match a position in a large database
Here's some "get you started" information: https://www.google.com/search?q=algorit ... ase+tables
In order to make database searches quick, databases are generally indexed. The next question then becomes, "What are the best indexes to generate for a database of chess positions?"
Some ideas (just off the top of my head):
* number of pieces
* presence of particular pieces
* players
etc.
If you're building your own database, you can index on your favourite position features - rooks on open files, or whatever it happens to be.
If you're a chess database company like Chessbase, you're probably going to spend a lot of time considering how chess players are going to want to search your database, and choose the indexes accordingly.
If you're only building a small database, you might consider just looking at every position and comparing against the query.
The possibilities are endless...
In order to make database searches quick, databases are generally indexed. The next question then becomes, "What are the best indexes to generate for a database of chess positions?"
Some ideas (just off the top of my head):
* number of pieces
* presence of particular pieces
* players
etc.
If you're building your own database, you can index on your favourite position features - rooks on open files, or whatever it happens to be.
If you're a chess database company like Chessbase, you're probably going to spend a lot of time considering how chess players are going to want to search your database, and choose the indexes accordingly.
If you're only building a small database, you might consider just looking at every position and comparing against the query.
The possibilities are endless...
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
-
- Posts: 11585
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
Re: Fastest way to find games that match a position in a large database
This is just a guess, but maybe the second question can be answered from the indexes alone, whereas to find the actual position, you'd first look in the indexes, and these might return thousands of possible matches, each of which would then have to be retrieved and looked at to see whether it's an exact match.
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
-
- Posts: 395
- Joined: Fri Aug 12, 2016 8:43 pm
Re: Fastest way to find games that match a position in a large database
The statistics are probably pre-calculated and stored into an hash table.Peperoni wrote: ↑Mon May 31, 2021 4:22 pm I was wondering what would be the best way to find a specific position in a large database (from a FEN for example).
I noticed that for example in Chessbase, it is not instant at all when you look for the games that match a specific position.
However, they have a statistics that show the percentages reached in the current position and that is instant.
I am also wondering why. (maybe they don't count positions reached from other moves combinations?)
Let's take an extreme example. A database with 1 million games of just 1 move: e4 or d4.
The hash table will contain only 2 entries and would be very fast to search into it.
On the contrary to obtain the list of matching games all the 1 million games need to be searched.
-
- Posts: 389
- Joined: Wed Sep 26, 2012 1:29 pm
- Location: Hungary
-
- Posts: 334
- Joined: Sat Feb 25, 2012 10:42 pm
- Location: Stockholm
Re: Fastest way to find games that match a position in a large database
I gave an idea of how to make a fast database in the following thread http://talkchess.com/forum3/viewtopic.php?f=7&t=75698towforce wrote: ↑Mon May 31, 2021 4:59 pm Here's some "get you started" information: https://www.google.com/search?q=algorit ... ase+tables
In order to make database searches quick, databases are generally indexed. The next question then becomes, "What are the best indexes to generate for a database of chess positions?"
Some ideas (just off the top of my head):
* number of pieces
* presence of particular pieces
* players
etc.
If you're building your own database, you can index on your favourite position features - rooks on open files, or whatever it happens to be.
If you're a chess database company like Chessbase, you're probably going to spend a lot of time considering how chess players are going to want to search your database, and choose the indexes accordingly.
If you're only building a small database, you might consider just looking at every position and comparing against the query.
The possibilities are endless...
My solution is compact and very fast (O(log n)) on both exact matches and partial matches.
If you also want to store games, you will have to add an additional table “games”, pointing to the “positions” table. I guess it would be smart making the “positions” table consist of a combined (game_id(bigint/identity/not null), position(see previous post for how to encode this)) as primary key and position also as foreign key to the games table. You will probably want to make it clustered on (position, game_id) since you want to get the different games fast for each position. The next problem will be to follow through a particular game without using too much space. One way of solving it is for a given position generate all the moves and generate new positions and for each position see if the game_id is there. This will also take O(log n) in time for each next continuation move in the game.
You will probably want to add additional information about the games in a third table, fourth … table where you store the opponents’ names, year of the match, where it took place and additional info.
-
- Posts: 11585
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
Re: Fastest way to find games that match a position in a large database
Pio wrote: ↑Tue Jun 01, 2021 1:07 amI gave an idea of how to make a fast database in the following thread http://talkchess.com/forum3/viewtopic.php?f=7&t=75698rth … table where you store the opponents’ names, year of the match, where it took place and additional info.
Interesting! Peperoni wants to write his own database from scratch. Nothing wrong with that, but I would personally use existing database software (link).
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
-
- Posts: 4367
- Joined: Fri Mar 10, 2006 5:23 am
- Location: http://www.arasanchess.org
Re: Fastest way to find games that match a position in a large database
ChessBase does have indexes, they called these "Search Boosters." See https://en.chessbase.com/post/chessbase ... -in-action.
Btw. it helps enormously to have your ChessBase databases on a SSD, preferably an M.2 SSD.
Btw. it helps enormously to have your ChessBase databases on a SSD, preferably an M.2 SSD.
-
- Posts: 11585
- Joined: Thu Mar 09, 2006 12:57 am
- Location: Birmingham UK
Re: Fastest way to find games that match a position in a large database
jdart wrote: ↑Thu Jun 03, 2021 4:46 pm ChessBase does have indexes, they called these "Search Boosters." See https://en.chessbase.com/post/chessbase ... -in-action.
Btw. it helps enormously to have your ChessBase databases on a SSD, preferably an M.2 SSD.
I must remember to stop using the word "index" and instead use "search booster"!
Nice video, though, showing some of Chessbase's capabilities. There's no article text to speak of - just go straight to the article video below (6:46):
Writing is the antidote to confusion.
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
It's not "how smart you are", it's "how are you smart".
Your brain doesn't work the way you want, so train it!
-
- Posts: 41
- Joined: Tue Oct 29, 2019 8:33 pm
- Location: French Polynesia
- Full name: Roger C.
Re: Fastest way to find games that match a position in a large database
Hi,Peperoni wrote: ↑Mon May 31, 2021 4:22 pm Hello,
I was wondering what would be the best way to find a specific position in a large database (from a FEN for example).
I noticed that for example in Chessbase, it is not instant at all when you look for the games that match a specific position.
However, they have a statistics that show the percentages reached in the current position and that is instant.
I am also wondering why. (maybe they don't count positions reached from other moves combinations?)
Thanks,
Try this online database : clic on "input FEN", enter the FEN and "OK" : you will have the answer. More than 13 billions analysed positions sor far.
You won't have the games but you'll have the variants to play and their scores.
https://www.chessdb.cn/queryc_en/
Best regards.