Database snapshot
For those who want to probe my database locally, or for any other reason, here is a full database snapshot of my book project as of today:
ftp://ftp.chessdb.cn/pub/chessdb/data-s ... 190728.tar
The database contains about 3 billion unique chess positions, mostly connected to startpos. They were analyzed by Stockfish to no less than 22 plies at terminal nodes, with very wide multi-PV exploration, and the scores have been back-propagated using a weighted averaging function. For most positions there is also a special field (encoded as 'a0a0') marking the known shortest distance of the position from startpos.
Using this database snapshot is as simple as putting the data files under your database folder and launching the server. Still, I'd recommend using the online API and making feature requests if you need anything, since it is updated constantly and I have no plans to make such snapshots very frequently (while waiting for a contributor to make incremental snapshots possible).
This database snapshot is released into the public domain.
-
- Posts: 12554
- Joined: Wed Mar 08, 2006 8:57 pm
- Location: Redmond, WA USA
Re: Database snapshot
noobpwnftw wrote: ↑Sat Jul 27, 2019 11:54 pm
For those who want to probe my database locally or for other unspecified reasons, here is a full database snapshot of my book project as of today:
ftp://ftp.chessdb.cn/pub/chessdb/data-s ... 190728.tar
...

Please leave it online for a while; I am on vacation and cannot download it right now.
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Database snapshot
Thanks for sharing.
I tried to probe from startpos with the following result.
[d]rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
Code: Select all
Move Score Rank Note winrate%
0 e2e4 15 (8) 2 ! (20-04) 50.61
1 d2d4 15 (4) 2 ! (20-03) 50.30
2 g1f3 15 (2) 2 ! (20-04) 50.15
3 g2g3 10 (2) 2 ! (20-07) 50.15
4 c2c4 10 (2) 2 ! (20-04) 50.15
5 d2d3 0 1 * (20-12) 50.00
6 c2c3 0 1 * (20-08) 50.00
7 e2e3 0 1 * (20-10) 50.00
8 b2b3 0 1 * (20-10) 50.00
9 b1c3 0 1 * (20-04) 50.00
10 a2a3 0 1 * (20-09) 50.00
11 h2h3 -1 1 * (20-09) 49.92
12 f2f4 -4 0 ? (20-14) 49.70
13 a2a4 -5 0 ? (20-11) 49.62
14 b2b4 -6 0 ? (20-11) 49.55
15 g1h3 -41 0 ? (20-05) 46.90
16 b1a3 -51 0 ? (20-01) 46.14
17 h2h4 -57 0 ? (20-01) 45.69
18 f2f3 -82 0 ? (20-01) 43.82
19 g2g4 -103 0 ? (20-01) 42.26
In this line:
Code: Select all
0 e2e4 15 (8) 2 ! (20-04) 50.61
What is (8)?
Why rank 2 and not rank 1?
What is (20-04)?
For the other position, there is no (value) under the Score column.
[d]rnbqk2r/pppnbppp/4p3/3pP1B1/3P3P/2N5/PPP2PP1/R2QKBNR b KQkq - 0 6
Code: Select all
Move Score Rank Note winrate%
0 e7g5 -31 2 ! (07-01) 47.65
1 h7h6 -52 0 ? (06-01) 46.07
2 e8g8 -68 0 ? (13-01) 44.87
3 c7c5 -77 0 ? (05-01) 44.19
4 b8c6 -78 0 ? (03-01) 44.12
5 a7a6 -89 0 ? (08-01) 43.30
6 c7c6 -107 0 ? (05-01) 41.96
7 f7f6 -121 0 ? (05-01) 40.93
8 b7b6 -123 0 ? (14-01) 40.79
9 d7b6 -126 0 ? (17-01) 40.57
10 d7f8 -161 0 ? (19-02) 38.04
11 g7g6 -170 0 ? (18-01) 37.40
12 f7f5 -208 0 ? (09-01) 34.74
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
Code: Select all
e2e4 15 (8) 2 ! (20-04) 50.61
This reads:
Code: Select all
<Notation of the move> <adjusted score>(<real score>) <rank> <rank mark> (<# of known reply moves>-<# of good reply moves>) <winrate>
For rank, 2 > 1 > 0, where rank=2 means a preferred move, rank=1 means a good alternative, and rank=0 means a bad move (also used when the position itself is bad).
The adjusted score only applies to startpos, mainly to normalize the above calculations.
Scores range within ±10000; anything beyond that is a known mate score, with a mated score at ±30000.
All these calculations are done at the API front-end; the raw database just maps position keys to a set of moves, which in turn map to their eval scores.
Position keys are binary-encoded FENs with white-black symmetry (of the two equivalent encodings, the one whose hex string is smaller is used).
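For anyone who wants to post-process the probe output, the line format described above could be parsed roughly like this. This is only a sketch: the names `ProbeRow`, `parse_probe_line` and `is_mate_score` are made up for illustration, and the regular expression assumes the layout of the sample rows in this thread (an optional leading row index, an optional real score in parentheses, and rank marks limited to `!`, `*` and `?`).

```python
import re
from typing import NamedTuple, Optional

# Hypothetical pattern derived from the sample rows in this thread.
_LINE = re.compile(
    r"^\s*(?:\d+\s+)?"                           # optional row index
    r"(?P<move>[a-h][1-8][a-h][1-8][qrbn]?)\s+"  # move in coordinate notation
    r"(?P<score>-?\d+)"                          # first score column
    r"(?:\s+\((?P<real>-?\d+)\))?\s+"            # optional (real score)
    r"(?P<rank>[012])\s+"                        # rank: 2 > 1 > 0
    r"(?P<mark>[!*?])\s+"                        # rank mark
    r"\((?P<known>\d+)-(?P<good>\d+)\)\s+"       # (known replies-good replies)
    r"(?P<winrate>\d+(?:\.\d+)?)\s*$"            # winrate in percent
)


class ProbeRow(NamedTuple):
    move: str
    score: int                 # adjusted score when a real score is also given
    real_score: Optional[int]  # None when no parenthesised value is present
    rank: int
    mark: str
    known_replies: int
    good_replies: int
    winrate: float


def parse_probe_line(line: str) -> ProbeRow:
    """Parse one row of probe output into its fields."""
    m = _LINE.match(line)
    if m is None:
        raise ValueError(f"unrecognised probe line: {line!r}")
    real = m.group("real")
    return ProbeRow(
        move=m.group("move"),
        score=int(m.group("score")),
        real_score=int(real) if real is not None else None,
        rank=int(m.group("rank")),
        mark=m.group("mark"),
        known_replies=int(m.group("known")),
        good_replies=int(m.group("good")),
        winrate=float(m.group("winrate")),
    )


def is_mate_score(score: int) -> bool:
    """Scores outside +-10000 are described above as known mate scores."""
    return abs(score) > 10000
```

Usage would be along the lines of `parse_probe_line("0 e2e4 15 (8) 2 ! (20-04) 50.61")`, which yields the move, both scores, the rank and mark, the reply counts and the winrate as separate fields.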
-
- Posts: 4840
- Joined: Sun Aug 10, 2008 3:15 pm
- Location: Philippines
Re: Database snapshot
noobpwnftw wrote: ↑Sun Jul 28, 2019 4:57 am
Code: Select all
e2e4 15 (8) 2 ! (20-04) 50.61
This reads:
Code: Select all
<Notation of the move> <adjusted score>(<real score>) <rank> <rank mark> (<# of known reply moves>-<# of good reply moves>) <winrate>
...

Thanks, got it.
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
Binary FEN encoding has the following format:
Code: Select all
<board unit>...<board unit><turn><special unit>...<special unit>
Where each board unit has an 8-bit value of:
Code: Select all
0 = 1 empty space
1 = 2 empty spaces
2 = 3 empty spaces
3 = p
4 = n
5 = b
6 = r
7 = q
8 = unused to avoid ambiguity
9 = k
a = P
b = N
c = B
d = R
e = Q
f = K
Turn is a 1-bit flag of 0 = white, 1 = black.
The special unit representing castling and ep information has an 8-bit value of:
Code: Select all
0 = none
1 = a
2 = b
3 = c
4 = d
5 = e
6 = f
7 = g
8 = h
9 = delimiter
a = K
b = Q
c = k
d = q
and the file of the ep square is stored as-is, using its numeric value.
The output is then trailing-zero trimmed to produce the final position key.
Internally, moves are encoded as 16-bit values:
Code: Select all
<4-bit src_rank><1-bit promotion flag><3-bit src_file><4-bit dst_rank><4-bit dst_file>
Where, if the promotion flag is set, dst_rank is redefined as:
Code: Select all
0 = q
1 = r
2 = b
3 = n
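To make the 16-bit move layout concrete, here is a rough sketch of packing and unpacking it. Several things are my own assumptions and not stated above: that the fields are laid out from the most significant bit down in the order listed, that files and ranks are zero-based (a = 0, rank 1 = 0), and that on a promotion the destination rank can be inferred from the source rank. `pack_move` and `unpack_move` are hypothetical helper names.

```python
# Promotion codes stored in the dst_rank field when the promotion flag is set.
PROMO_CODES = {"q": 0, "r": 1, "b": 2, "n": 3}
PROMO_PIECES = {code: piece for piece, code in PROMO_CODES.items()}


def pack_move(move: str) -> int:
    """Pack a coordinate move such as 'e2e4' or 'e7e8q' into 16 bits.

    Assumed layout (MSB first):
    <4-bit src_rank><1-bit promotion flag><3-bit src_file><4-bit dst_rank><4-bit dst_file>
    with zero-based files and ranks (assumption).
    """
    src_file = ord(move[0]) - ord("a")
    src_rank = int(move[1]) - 1
    dst_file = ord(move[2]) - ord("a")
    dst_rank = int(move[3]) - 1
    promo = move[4:5]
    flag = 0
    if promo:
        flag = 1
        dst_rank = PROMO_CODES[promo]  # field is redefined for promotions
    return (src_rank << 12) | (flag << 11) | (src_file << 8) | (dst_rank << 4) | dst_file


def unpack_move(value: int) -> str:
    """Inverse of pack_move under the same assumptions."""
    src_rank = (value >> 12) & 0xF
    flag = (value >> 11) & 0x1
    src_file = (value >> 8) & 0x7
    dst_field = (value >> 4) & 0xF
    dst_file = value & 0xF
    src = chr(ord("a") + src_file) + str(src_rank + 1)
    dst_file_char = chr(ord("a") + dst_file)
    if flag:
        # Assumption: a pawn on the 7th rank promotes on the 8th,
        # otherwise it is a black pawn promoting on the 1st.
        dst_rank = 7 if src_rank == 6 else 0
        return src + dst_file_char + str(dst_rank + 1) + PROMO_PIECES[dst_field]
    return src + dst_file_char + str(dst_field + 1)
```

Under these assumptions `pack_move("e2e4")` gives `0x1434` and `pack_move("e7e8q")` gives `0x6C04`; the actual on-disk values may differ if the real field order or numbering base is different.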
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
In board unit above, if there are more than 3 empty spaces, the first unit is set to 8 and the next unit is the number of empty spaces minus 4.
And correction: turn is a 8-bit flag, instead of 1.
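As an illustration of the board-unit part only, here is a sketch that turns the piece-placement field of a FEN into a string of units, including the rule for runs of more than 3 empty squares. It assumes that each unit is written as one hex digit and that the rank separators `/` are simply dropped; neither is spelled out above. The turn and special units, the white-black symmetry and the trailing-zero trimming are left out because their exact layout is not fully specified in this thread. `encode_empty_run` and `encode_board_units` are hypothetical names.

```python
# Unit values for pieces, taken from the table in the post above.
PIECE_UNITS = {
    "p": "3", "n": "4", "b": "5", "r": "6", "q": "7", "k": "9",
    "P": "a", "N": "b", "B": "c", "R": "d", "Q": "e", "K": "f",
}


def encode_empty_run(count: int) -> str:
    """Encode a run of 1..8 empty squares within one rank."""
    if not 1 <= count <= 8:
        raise ValueError(f"invalid empty-square run: {count}")
    if count <= 3:
        # 0 = 1 empty space, 1 = 2 empty spaces, 2 = 3 empty spaces
        return format(count - 1, "x")
    # More than 3: first unit is 8, next unit is the count minus 4.
    return "8" + format(count - 4, "x")


def encode_board_units(placement: str) -> str:
    """Encode the first FEN field (piece placement) as a string of units.

    Assumptions: one hex digit per unit, '/' separators are skipped.
    """
    units = []
    for ch in placement:
        if ch == "/":
            continue
        if ch.isdigit():
            units.append(encode_empty_run(int(ch)))
        else:
            units.append(PIECE_UNITS[ch])
    return "".join(units)
```

With these assumptions, the startpos placement `rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR` becomes `645795463333333384848484aaaaaaaadbcefcbd`.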
-
- Posts: 7025
- Joined: Thu Aug 18, 2011 12:04 pm
- Full name: Ed Schröder
Re: Database snapshot
noobpwnftw wrote: ↑Sat Jul 27, 2019 11:54 pm
For those who want to probe my database locally or for other unspecified reasons, here is a full database snapshot of my book project as of today:
ftp://ftp.chessdb.cn/pub/chessdb/data-s ... 190728.tar
...

By any chance, can you offer those 3 billion positions in EPD with SF score and depth, or a utility that converts your database to EPD?
90% of coding is debugging, the other 10% is writing bugs.
-
- Posts: 4556
- Joined: Tue Jul 03, 2007 4:30 am
Re: Database snapshot
noobpwnftw wrote: ↑Sat Jul 27, 2019 11:54 pm
analyzed by Stockfish with no less than 22 plies at terminal node

Interesting, my private database uses depth 22 as well. Looks like we independently found it to be optimal (depth 21 having considerably lower quality, depth 23 being considerably slower)?

Ferdy wrote: ↑Sun Jul 28, 2019 4:31 am
I tried to probe from startpos with the following result.
[d]rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1
Code: Select all
Move Score Rank Note winrate%
0 e2e4 15 (8) 2 ! (20-04) 50.61
1 d2d4 15 (4) 2 ! (20-03) 50.30
2 g1f3 15 (2) 2 ! (20-04) 50.15
3 g2g3 10 (2) 2 ! (20-07) 50.15
4 c2c4 10 (2) 2 ! (20-04) 50.15
...

Surprising to see scores that high. Mine has everything at 0.00 except for 1.d4, which is 0.03 (all other White tries have been refuted to a 0.00 score).
Oh, three billion means your database is 1000 times larger than mine.
I'd wish for a way to check it online (see https://www.365chess.com/opening.php for an example).
-
- Posts: 560
- Joined: Sun Nov 08, 2015 11:10 pm
Re: Database snapshot
Ovyron wrote: ↑Sun Jul 28, 2019 12:22 pm
Interesting, my private database uses depth 22 as well, looks like we found it to be optimal (depth 21 having considerably less quality, depth 23 being considerably more slow) independently?
Surprising to see scores that high. Mine has everything at 0.00 except for 1.d4 which is 0.03 (all white tries have been refuted to a 0.00 score otherwise).
...
I'd wish for a way to check it online (see https://www.365chess.com/opening.php for an example)

Depth 22 seems to be a good balance between quality and speed.
I have applied penalties to 0.00 scores in back-propagation; maybe that caused it.
For a nice GUI like those, I hope someone will look up the data from my API so that no reinventing of the wheel is needed.