klx wrote: ↑Thu Jun 24, 2021 6:09 pm
Hi there, I have come up with a pretty revolutionary idea to
vastly reduce the size of a DTM endgame database.
DTM?
Here are the facts that lead to my discovery:
1. The vast majority of positions are won within a small number of moves. For example, from the
syzygy stats site it seems that close to 99% of positions are won in fewer than 20 plies (depending on the table; for some it is more like 99.95%).
So you mean DTZ.
2. We can trivially search to depth 20 plies for endgames with alpha-beta in a fraction of a second.
Don't be so sure. You are probably thinking of SF doing 20-ply searches in a fraction of a second, but those are fake 20-ply searches which are very much incomplete.
So, in the Emanuel Torresbase, we store a special identifier instead of the actual DTM for these "easily-won/lost" positions. During query of a position, if this special value is found, we do alpha-beta to find the outcome. In other words, we can cut out up to 99.95% of the table!
You mean like this:
https://github.com/syzygy1/tb/commit/22 ... 16d71bd137
I created that branch for marcelk who wanted to see how small DTZ could be made.
It saves some space but not a lot. I don't remember what cut-off ply he used, but it won't have been 20.
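To make the idea under discussion concrete, here is a minimal sketch on a toy game, not on chess and not the code from the branch linked above. The game (take 1 to 3 tokens from a pile, the player who cannot move loses), the `SENTINEL` marker, and every function name are assumptions made up for illustration. Short distances are replaced by a sentinel and recovered by search at probe time.

```python
# Toy illustration only: a subtraction game stands in for chess.
SENTINEL = None      # hypothetical marker: "not stored, recover by search"
MOVES = (1, 2, 3)    # a move removes 1, 2 or 3 tokens from the pile


def combine(children):
    """Signed distance in plies from the child values.

    +d means the side to move wins in d plies, -d (or 0) means it loses.
    """
    lost_for_opponent = [c for c in children if c <= 0]
    if lost_for_opponent:
        # Winning side picks the quickest win.
        return 1 + min(-c for c in lost_for_opponent)
    # Losing side delays as long as possible.
    return -(1 + max(children))


def build_full_table(max_n):
    """Full distance table, built bottom-up from the terminal position."""
    table = [0] * (max_n + 1)  # table[0] = 0: no move available, already lost
    for n in range(1, max_n + 1):
        table[n] = combine([table[n - m] for m in MOVES if m <= n])
    return table


def make_sparse(full, cutoff):
    """Replace every entry resolved within `cutoff` plies by the sentinel."""
    return [SENTINEL if abs(v) <= cutoff else v for v in full]


def probe(sparse, n):
    """Return the stored value, or search when only the sentinel is stored."""
    stored = sparse[n]
    if stored is not SENTINEL:
        return stored
    if n == 0:
        return 0
    # Exhaustive search, no pruning heuristics: every child is resolved.
    return combine([probe(sparse, n - m) for m in MOVES if m <= n])


if __name__ == "__main__":
    full = build_full_table(40)
    sparse = make_sparse(full, cutoff=6)
    dropped = sum(1 for v in sparse if v is SENTINEL)
    print(f"{dropped} of {len(sparse)} entries replaced by the sentinel")
    print(all(probe(sparse, n) == full[n] for n in range(len(full))))
```

Note that the sentinel is itself a value that has to be stored and told apart from real entries, and that the probe only stays correct because the search is exhaustive. In chess the search would be far more expensive than in this toy.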
DTZ is anyway already small enough. 6-piece DTZ is just a bit larger than 6-piece WDL, and I believe 7-piece DTZ is already smaller than 7-piece WDL. For DTZ it is fine to use a cheap mechanical drive, since you need to probe it only at the root. Within the search you need quick access to WDL, and replacing that by search is of course defeating the purpose of TBs (plus you'd still need to store some information which can be distinguished from win/loss, i.e. your TB is probably going to be bigger in size in addition to being nearly useless).
The Emanuel Torresbase reduces the size of existing databases and paves the way for 8-men database.
Syzygy 7 men: 16.7 TiB
Minus 99.95%: 8.5 GiB
Estimated 8 men size: 8.5 GiB * (16.7 TiB / 149.2 GiB) = 974 GiB
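The arithmetic in the quoted estimate can be reproduced with a few lines. This is only a check of the numbers as given; it assumes 1 TiB = 1024 GiB and that the 149.2 GiB figure is the 6-men total, and it says nothing about whether the 99.95% reduction is achievable.

```python
# Recompute the quoted estimate; all inputs are the figures from the post.
GIB_PER_TIB = 1024

seven_men_gib = 16.7 * GIB_PER_TIB   # 16.7 TiB expressed in GiB
six_men_gib = 149.2                  # assumed to be the 6-men total
kept_fraction = 1 - 0.9995           # what remains after cutting 99.95%

reduced_seven_gib = seven_men_gib * kept_fraction
growth_factor = seven_men_gib / six_men_gib

# The post rounds the reduced 7-men size to 8.5 GiB before scaling up.
eight_men_gib = 8.5 * growth_factor

print(f"reduced 7-men size: {reduced_seven_gib:.2f} GiB")
print(f"6-men to 7-men growth factor: {growth_factor:.1f}")
print(f"estimated 8-men size: {eight_men_gib:.0f} GiB")
```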
So you haven't really thought it through at all and probably lack the fundamental understanding to do so. I bet that you are thinking that TBs are stored as a list of FENs.
Come back when you have something that works.