In the project code there is a utility script that exports data into plain-text.
Rebel wrote: ↑Sun Jul 28, 2019 10:04 am
By accident, can you offer those 3 billion in EPD with SF score and depth, or a util that converts your database to EPD?
noobpwnftw wrote: ↑Sat Jul 27, 2019 11:54 pm
For those who want to probe my database locally or for other unspecified reasons, here is a full database snapshot of my book project as of today:
ftp://ftp.chessdb.cn/pub/chessdb/data-s ... 190728.tar
The database contains about 3 billion unique chess positions, mostly connected to startpos, analyzed by Stockfish to no less than 22 plies at terminal nodes with a very wide multi-PV exploration. The scores have been back-propagated using a weighted averaging function, and for most of the positions there is a special field (encoded as 'a0a0') marking the known shortest distance of the position from startpos.
Using this database snapshot is as simple as putting the data files under your database folder and launching the server. Still, I'd recommend using the online API and making feature requests if you need any, since it is updated constantly and I have no plans to make snapshots like this very frequently (while waiting for a contributor to make incremental snapshots possible).
This database snapshot is released into the public domain.
Database snapshot
Moderators: hgm, Rebel, chrisw
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Database snapshot
Ah, nice. A problem I've been having is that after all lines have been refuted to 0.00, moves like 1.e4, 1.c4 and 1.Nf3 look the same, even though it's much harder for Black to equalize against 1.e4 and there are considerably fewer 0.00 lines after 1.e4 than after the others. If those lines get counter-refuted, 1.e4 is much more likely to end with a positive score for White, so it would make sense for 1.e4 to show a better score than the others in the starting position.
noobpwnftw wrote: ↑Sun Jul 28, 2019 4:09 pm
I have applied penalties to a 0.00 score in back-propagation, maybe that caused it.
The problem with penalties for a 0.00 score (like Houdini 6 does) is that they'd hide real unpenalized variations. In this case, if the penalized mainline for d4 is 0.04 but there's a line that is 0.03, the database would show as best the line that goes to 0.00 but is displayed as 0.04, even though the line that goes to 0.03 is better.
A solution for this is to reserve some scores for penalized variations (...like Houdini 6 does). Say scores from -0.10 to 0.10 always indicate a penalized 0.00 score. For the rest of the positions, if Black has the edge, 0.10 is subtracted from the score, and if White does, 0.10 is added to the score. That way, if the mainline of 1.d4 leads to an unpenalized 0.03, the database would give it a 0.13 score, and the user would know at a glance that the score comes from an actual advantage and not a penalized draw.
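To make the reserved-band idea concrete, here is a minimal sketch in Python. The function names and the 10-centipawn band width are assumptions taken from the example numbers above, not anything the database or Houdini is confirmed to implement.

```python
# Hypothetical encoding of the reserved-band idea; all names are illustrative.
BAND_CP = 10  # displayed scores in [-10, +10] centipawns mean "penalized draw"


def encode_score(raw_cp, penalized_draw):
    """Map a back-propagated score (centipawns, White's view) to a displayed score."""
    if penalized_draw:
        # A penalized 0.00 line is kept inside the reserved band.
        return max(-BAND_CP, min(BAND_CP, raw_cp))
    if raw_cp > 0:
        return raw_cp + BAND_CP  # real White edge: shifted above the band
    if raw_cp < 0:
        return raw_cp - BAND_CP  # real Black edge: shifted below the band
    return 0  # an exact, unpenalized 0 has no side to shift toward


def is_real_advantage(shown_cp):
    """True when a displayed score lies outside the reserved band."""
    return abs(shown_cp) > BAND_CP


print(encode_score(3, penalized_draw=False))  # unpenalized 0.03 is shown as 13
print(encode_score(4, penalized_draw=True))   # penalized draw stays at 4
```

With this mapping the 0.03 line from the 1.d4 example is displayed above the penalized 0.04 line, so the ordering problem described earlier goes away.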
My suggestion would be to add granularity to the scores by showing sub-0.01 values. That way you can reserve -0.009 to 0.009 for penalized draws and subtract or add that amount to everything else. An advantage is that you could also use this reserved part of the score to differentiate lines that have the same back-propagated score, denoting how hard or easy it is for one side to maintain it. Say a 0.14 position that is very sharp, where refuting it would lead to a lower score, could be shown as 0.140, while another that has many lines reaching 0.14, so that you'd need to refute them all to reduce the score, could be shown as 0.149. The interval 0.140-0.149 would be used for intermediate cases.
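As a rough illustration of the third-decimal idea, the sketch below formats a centipawn score with an extra digit that carries a robustness estimate. The function name and the 0-to-1 `robustness` input are hypothetical, and how such an estimate would be computed from the tree is left open. It only covers the 0.140 to 0.149 part of the proposal, not the reserved draw interval.

```python
def with_robustness_digit(score_cp, robustness):
    """Format a centipawn score in pawns with a third decimal for robustness.

    robustness is a hypothetical value in [0, 1]: 0 means very sharp (one
    refutation lowers the score), 1 means many lines hold the same score.
    """
    if not 0.0 <= robustness <= 1.0:
        raise ValueError("robustness must be between 0 and 1")
    digit = round(robustness * 9)          # third decimal, 0..9
    sign = "-" if score_cp < 0 else ""
    pawns, cents = divmod(abs(score_cp), 100)
    return f"{sign}{pawns}.{cents:02d}{digit}"


print(with_robustness_digit(14, 0.0))  # sharp 0.14 position: 0.140
print(with_robustness_digit(14, 1.0))  # robust 0.14 position: 0.149
```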
We still don't have an online chess tree resource where people can look at a database and find back-propagated scores of chess engines. With a snapshot like this, hopefully it becomes possible.
Your beliefs create your reality, so be careful what you wish for.
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
My weighted averaging function also factors in the number of good reply moves, so it can tell the difference between different 0.00 lines and apply penalties accordingly. It also gradually degrades sharp scores to prevent the back-propagation from becoming fully min-maxed.
Regardless of how you want to apply those back-propagation algorithms, what matters are the scores of all terminal nodes and the shape of the tree, both of which take much effort to construct and are now available.
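The actual weighting function is not given in this thread. Purely as an illustration of the two properties described (the number of good replies matters, and sharp scores are degraded instead of being fully min-maxed), here is one possible rank-weighted average. The names `node_score` and `backprop` and the parameters `rank_weight` and `gap_cap_cp` are all assumptions.

```python
def node_score(move_scores, rank_weight=0.25, gap_cap_cp=30):
    """Weighted average of move scores (centipawns, side to move's view).

    Hypothetical scheme: moves are ranked best-first, the move at rank i gets
    weight rank_weight**i, and each score is clamped to at most gap_cap_cp
    below the best move so that a single blunder cannot dominate.
    """
    if not move_scores:
        raise ValueError("node has no scored moves")
    ranked = sorted(move_scores, reverse=True)
    floor = ranked[0] - gap_cap_cp
    weights = [rank_weight ** i for i in range(len(ranked))]
    total = sum(w * max(s, floor) for w, s in zip(weights, ranked))
    return total / sum(weights)


def backprop(tree):
    """Back-propagate a tree given as nested lists with numeric leaves.

    A leaf is a terminal score from the view of the side to move there;
    an inner node is a list of child subtrees, one per move.
    """
    if isinstance(tree, (int, float)):
        return tree
    # A move is worth to the mover the negation of the child's own score.
    return node_score([-backprop(child) for child in tree])


sharp = node_score([0, -60, -80])  # only one move holds 0.00
broad = node_score([0, 0, -2])     # several moves hold 0.00
print(sharp < broad)               # the sharp 0.00 ends up lower
```

Setting `rank_weight=0` would reduce this to plain min-max, which is the behaviour the averaging is meant to soften.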
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Database snapshot
Thanks. Have you documented how you do it somewhere? It might also be helpful for people to apply these weighted averages/penalties to their own databases, to get an accurate score at the root (along with correspondence chess players using these concepts to choose moves, when even engines give 0.00 scores to non-transposing lines).
noobpwnftw wrote: ↑Sun Jul 28, 2019 6:35 pm
My weighted averaging function also factors in the number of good reply moves so that it can tell the difference between different 0.00 lines and apply penalties accordingly, also gradually degrade sharp scores to prevent the back-propagation becoming fully min-maxed.
-
- Posts: 101
- Joined: Sun Nov 14, 2010 9:36 pm
- Location: U.S.
Re: Database snapshot
Nice work, Noob. Sounds like you've got 35-40 million games in that database. Any word on the composition, i.e. % human, % engine?
This work would be greatly improved with a custom GUI that included your evaluation data, the empirical WLD data (flawed though it may be) and information on who played moves first. But I'm sure you've already thought of that!
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
It is not built from any existing games, rather, it is built recursively by self-play, 100% engine.
Nelson Hernandez wrote: ↑Sun Jul 28, 2019 11:56 pm
Nice work, Noob. Sounds like you've got 35-40 million games in that database. Any word on the composition, i.e. % human, % engine?
This work would be greatly improved with a custom GUI that included your evaluation data, the empirical WLD data (flawed though it may be) and information on who played moves first. But I'm sure you've already thought of that!
Any GUI that is capable of probing my API would also become a data source, providing information about what needs to be explored, in addition to the automated process.
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Database snapshot
Your sample probing code is very helpful, I was able to probe book moves using Python.
noobpwnftw wrote: ↑Sat Jul 27, 2019 11:54 pm
For those who want to probe my database locally or for other unspecified reasons, here is a full database snapshot of my book project as of today:
For others who are interested, here is a Python script. You need to install the requests package via pip install requests.
Code: Select all
#!/usr/bin/env python3
"""
book_probe.py

Probe book moves in ChessDB

Requirements:
    * Python 3
    * Requests
      pip install requests
"""

import json
import time

import requests


def probe_book(fen):
    data = []

    # A full FEN has six space-separated fields.
    fields = fen.split()
    pieces, turn, castle, ep = fields[0], fields[1], fields[2], fields[3]
    hmvc, fmvn = fields[4], fields[5]

    # Spaces between the FEN fields are sent URL-encoded as %20.
    sep = '%20'
    base = 'https://www.chessdb.cn/cdb.php?action=queryall&json=1&board='
    url = base + pieces + sep + turn + sep + castle + \
        sep + ep + sep + hmvc + sep + fmvn

    r = requests.get(url)
    d = json.loads(r.text)

    if d['status'] != 'unknown':
        # Collect [SAN move, score, win rate] for every book move.
        for n in d.get('moves', []):
            data.append([n['san'], int(n['score']), float(n['winrate'])])
    else:
        print('Position is not found!')

    for san, score, winrate in data:
        print('{:6s} {:<5d} {:0.2f}'.format(san, score, winrate))


def main():
    while True:
        fen = input('enter fen? ')
        probe_book(fen)
        time.sleep(2)  # pause between queries


if __name__ == '__main__':
    main()
Sample run
Code: Select all
enter fen? rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1
Nf6    -5    49.62
d5     -6    49.55
e6     -9    49.32
c6     -20   48.49
g6     -38   47.12
a6     -38   47.12
d6     -40   46.97
f5     -51   46.14
c5     -65   45.09
Nc6    -66   45.02
h6     -75   44.34
a5     -79   44.04
b6     -81   43.89
Na6    -100  42.48
b5     -106  42.04
Nh6    -107  41.96
h5     -118  41.15
e5     -128  40.42
f6     -155  38.47
g5     -227  33.45
enter fen?
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Database snapshot
I take it there's currently no chess GUI that is able to probe the database (without knowing programming or how to use scripts)?
-
- Posts: 4833
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Database snapshot
Here is an exe file to probe opening book moves in ChessDB. It will just ask for a FEN.
https://drive.google.com/file/d/1_KM89V ... sp=sharing
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Database snapshot
Thanks, I downloaded your exe, ran it in some directory and pasted a fen. I got this error:
Traceback (most recent call last):
File "book_probe2.py", line 55, in <module>
File "book_probe2.py", line 50, in main
File "book_probe2.py", line 26, in probe_book
IndexError: list index out of range
[172] Failed to execute script book_probe2
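The source of book_probe2.py is not shown here, so the cause is a guess. If it indexes the six FEN fields the same way as the script posted earlier in the thread, pasting a shortened FEN (for example a four-field EPD-style string without the move counters) would raise exactly this IndexError. A small, hypothetical `normalize_fen` helper that pads the missing fields with default values could look like this:

```python
def normalize_fen(fen):
    """Return a six-field FEN, padding missing trailing fields with defaults.

    Hypothetical helper: the defaults below are only placeholders for the
    side to move, castling rights, en passant square and move counters.
    """
    fields = fen.split()
    if not fields or len(fields) > 6:
        raise ValueError("expected between 1 and 6 FEN fields")
    if fields[0].count("/") != 7:
        raise ValueError("piece placement must describe 8 ranks")
    defaults = ["w", "-", "-", "0", "1"]  # fields 2 to 6 of a FEN
    fields += defaults[len(fields) - 1:]
    return " ".join(fields)


short = "rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq -"
print(normalize_fen(short))
# rnbqkbnr/pppppppp/8/8/3P4/8/PPP1PPPP/RNBQKBNR b KQkq - 0 1
```

Calling such a helper on the input before splitting it into fields would turn the crash into either a valid query or a readable error message.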