Forget Syzygy -- Presenting the Emanuel Torresbase!

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

yurikvelo wrote: Fri Jun 25, 2021 8:24 am your special identifier consume the same amount of space as Syzygy identifier.
Not in the Emanuel Torresbase. There are two approaches here:
  1. Don't even store these easily won positions. We only store the "hard" positions as say index + DTM and use hash or binary search.
  2. Store the easily won positions with a special identifier and use RLE or LZ to compress. For example, instead of storing DTMs [7, 2, 1, 9, 5, 5, 17, 8, 26, 9, 1, 2, 4] we would store [0, 0, 0, 0, 0, 0, 0, 0, 26, 0, 0, 0, 0], where 26 is the only hard position, which could be compressed as [8x0, 26, 4x0].
[Moderation warning] This signature violated the rule against commercial exhortations.
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

phhnguyen wrote: Fri Jun 25, 2021 9:54 am I think you have a wrong assumption here. Syzygy uses the metric DTZ50 - somewhat is kind of DTC (deep to conversion). It means when it says 20 plies for a given position, you need to search 20 plies to have a conversion (capture, push pawn, mate...). If it is not a mate, you may not know the result yet and your search function can't stop after that 20th ply but has to search beyond.
Oh, hmm, I have based my research on the Syzygy endgame stats page, the comments there say that the stats are "# of positions losing in 2" etc, not conversion. Emanuel Torresbase uses distance to mate, not conversion.
phhnguyen wrote: Fri Jun 25, 2021 9:54 am For some 6 men endgames, Nalimov EGTB has to use 2 bytes for its DTM. It means a search function has to work over depth 256 to complete. That is why I have said (above) that some endgames require hours or even longer of searching.
To me this means that Nalimov is inefficient. If we need to depth 512 then use 9 bits or better yet variable bit length, not 2 bytes.
[Moderation warning] This signature violated the rule against commercial exhortations.
User avatar
yurikvelo
Posts: 710
Joined: Sat Dec 06, 2014 1:53 pm

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by yurikvelo »

klx wrote: Fri Jun 25, 2021 6:06 pm [*]Don't even store these easily won positions. We only store the "hard" positions as say index + DTM and use hash or binary search.
Engine will have no chance to reach that "won position":
- no TB hits
- no bonus for PV leading to won position
- no ELO gain due to EGTB support
klx wrote: Fri Jun 25, 2021 6:06 pm Store the easily won positions with a special identifier and use RLE or LZ to compress. For example, instead of storing DTMs [7, 2, 1, 9, 5, 5, 17, 8, 26, 9, 1, 2, 4] we would store [0, 0, 0, 0, 0, 0, 0, 0, 26, 0, 0, 0, 0], where 26 is the only hard position, which could be compressed as [8x0, 26, 4x0].
Syzygy store much less information. Instead of storing DTMs it stores only Win/Draw (1/0)
It is 7 times as small as Gaviota for 5 men, 8 times as small as Nalimov for 6 men, 8 times as small as Lomonosov for 7 men.
User avatar
j.t.
Posts: 239
Joined: Wed Jun 16, 2021 2:08 am
Location: Berlin
Full name: Jost Triller

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by j.t. »

klx wrote: Fri Jun 25, 2021 3:27 am We can do safe pruning. If you look at this page, the effective branching factor we can expect around 2, 2^20 = 1 million, and we should be able to analyze more than 1 million nodes per second.
If your branching factor is 2, you are very likely doing lots of unsafe pruning. It is even mentioned on the webpage you linked:
In Alpha-Beta, assuming good move ordering, the branching factor is reduced to about square root of the average branching factor
Assuming ~35 legal moves in each position, you get an average branching factor of 6. Which means around 3,000,000,000,000,000 leave nodes after 20 moves. Sure, there are methods that might improve on this, without unsafe pruning (transposition tables, PVS, ...) but I don't think you can get below a branching factor of 3 with only these (or at least not without new/seriously improved algorithmic methods). And this would already mean more than 3 billion nodes at depth 20.
AndrewGrant
Posts: 1750
Joined: Tue Apr 19, 2016 6:08 am
Location: U.S.A
Full name: Andrew Grant

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by AndrewGrant »

klx wrote: Fri Jun 25, 2021 3:27 am
AndrewGrant wrote: Fri Jun 25, 2021 1:04 am Engines can achieve depth 20 easily, but its heavily pruned. If you want a proven tree, your AlphaBeta is very limited in its depth without unsafe pruning decisions.
I respectfully disagree. We can do safe pruning. If you look at this page, the effective branching factor we can expect around 2, 2^20 = 1 million, and we should be able to analyze more than 1 million nodes per second.

The other thing is that we know this is endgame (<= 8 pieces), and a terminal position in <= 20 plies. So I would expect the branching factor to be even less.
Probably not worth getting into it, but just as my closing remarks I suppose:

Your link sites exactly what I said. You can get a BF=2, if you prune heavily, unsafely. Otherwise, you are capped by the limits of Alpha Beta, which are no where near BF=2. I find it odd that you hop in, propose something named after yourself, and then in a post later note that you only learned about Syzygy a week ago, yet think you can outclass it massively. I'm not saying its not possible, but the "proof of concept" as you called it was far from a proof of concept. If you think the idea has potential, you should explore it and have a working example on some subset of positions as your working model. I don't think you'll convince anyone else here without it.
#WeAreAllDraude #JusticeForDraude #RememberDraude #LeptirBigUltra
"Those who can't do, clone instead" - Eduard ( A real life friend, not this forum's Eduard )
User avatar
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by Rebel »

klx wrote: Fri Jun 25, 2021 6:00 pm
Rebel wrote: Fri Jun 25, 2021 6:01 am Just start with KPK and post the size.
Alright with the Kolmogorov extension I can do KPK in 1248 bytes, with millisecond lookup time!
The Syzygy base uses 13Kb for white and black. So just 1248 bytes would be a huge progress.

And for the proof of your claim, please provide those 1248 bytes plus the source probe code.
90% of coding is debugging, the other 10% is writing bugs.
User avatar
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by hgm »

I am sure he can. But I am afraid that the performance would be similar to a tablebase of 0 bytes requiring 0 lines of probing code...

Of course you can get a somewhat satisfactory performance in KPK by just recognizing a few critical patterns (wins by the 'rule of squares', draws for the defending King in the path of the Pawn not more than 2 ranks in front of it, or 3 ranks with opposition), and search on (or default to a win in a leaf) in all other positions. That makes the search needed to recognize other drawn positions quite shallow. This is what I do in KingSlayer.
klx
Posts: 179
Joined: Tue Jun 15, 2021 8:11 pm
Full name: Emanuel Torres

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by klx »

AndrewGrant wrote: Sat Jun 26, 2021 11:30 am Your link sites exactly what I said. You can get a BF=2, if you prune heavily, unsafely. Otherwise, you are capped by the limits of Alpha Beta, which are no where near BF=2.
Hmm, this is new information. Listing EBFs with unsafe pruning is very odd if you ask me. I think we'll need to try it out and see, my gut feeling is that the transposition table will make wonders due to the few pieces on the board, and also since this is endgame there may be many forced moves, so we can keep the EBF very very low.
Rebel wrote: Sat Jun 26, 2021 1:04 pm The Syzygy base uses 13Kb for white and black. So just 1248 bytes would be a huge progress.

And for the proof of your claim, please provide those 1248 bytes plus the source probe code.
Thanks! Yes it's the not the <1% I was shooting for but <10% is certainly something. For this particular table, it turns out there was a large number of draws, so the main Emanuel Torresbase methodology would not work well, instead I used the Kolmogorov extension, it works by taking the source code for the generator (I used the "Pretty fast KPK endgame table generator") and compressing it (WinZip). The probing code would then generate the table (1 millisecond) and do a memory lookup (instant). The zipped source code is 1248 bytes and is all that is needed.

While it's a slightly different approach, I think this proves that by just thinking outside the box, vast betterments can be achieved.
[Moderation warning] This signature violated the rule against commercial exhortations.
User avatar
Rebel
Posts: 6991
Joined: Thu Aug 18, 2011 12:04 pm

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by Rebel »

klx wrote: Sat Jun 26, 2021 5:18 pm
AndrewGrant wrote: Sat Jun 26, 2021 11:30 am Your link sites exactly what I said. You can get a BF=2, if you prune heavily, unsafely. Otherwise, you are capped by the limits of Alpha Beta, which are no where near BF=2.
Hmm, this is new information. Listing EBFs with unsafe pruning is very odd if you ask me. I think we'll need to try it out and see, my gut feeling is that the transposition table will make wonders due to the few pieces on the board, and also since this is endgame there may be many forced moves, so we can keep the EBF very very low.
Rebel wrote: Sat Jun 26, 2021 1:04 pm The Syzygy base uses 13Kb for white and black. So just 1248 bytes would be a huge progress.

And for the proof of your claim, please provide those 1248 bytes plus the source probe code.
Thanks! Yes it's the not the <1% I was shooting for but <10% is certainly something. For this particular table, it turns out there was a large number of draws, so the main Emanuel Torresbase methodology would not work well, instead I used the Kolmogorov extension, it works by taking the source code for the generator (I used the "Pretty fast KPK endgame table generator") and compressing it (WinZip). The probing code would then generate the table (1 millisecond) and do a memory lookup (instant). The zipped source code is 1248 bytes and is all that is needed.

While it's a slightly different approach, I think this proves that by just thinking outside the box, vast betterments can be achieved.
Can you share the source code?
90% of coding is debugging, the other 10% is writing bugs.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Forget Syzygy -- Presenting the Emanuel Torresbase!

Post by syzygy »

klx wrote: Thu Jun 24, 2021 6:09 pm Hi there, I have come up with a pretty revolutionary idea to vastly reduce the size of a DTM endgame database.
DTM?
Here are the facts that lead to my discovery:

1. The vast majority of positions are won within a few number of moves. For example, from the syzygy stats site it seems that towards 99% of positions are won in less than 20 plies (depending the table, some are a lot more like 99.95%).
So you mean DTZ.
2. We can trivially search to depth 20 plies for endgames with alpha-beta in a fraction of a second.
Don't be so sure. You are probably thinking of SF doing 20-ply searches in a fraction of a second, but those are fake 20-ply searches which are very much incomplete.
So, in the Emanuel Torresbase, we store a special identifier instead of the actual DTM for these "easily-won/lost" positions. During query of a position, if this special value is found, we do alpha-beta to find the outcome. In other words, we can cut out up to 99.95% of the table!
You mean like this: https://github.com/syzygy1/tb/commit/22 ... 16d71bd137
I created that branch for marcelk who wanted to see how small DTZ could be made.

It saves some space but not a lot. I don't remember what cut-off ply he used, but it won't have been 20.

DTZ is anyway already small enough. 6-piece DTZ is just a bit larger than 6-piece WDL, and I believe 7-piece DTZ is already smaller than 7-piece WDL. For DTZ It is fine to use a cheap mechanical drive, since you need to probe it only at the root. Within the search you need quick access to WDL and replacing that by search is of course defeating the purpose of TBs (plus you'd still need to store some information which can be distinguished from win/loss, i.e. your TB is probably going to be bigger in size in addition to be nearly useless).
The Emanuel Torresbase reduces the size of existing databases and paves the way for 8-men database.

Syzygy 7 men: 16.7 TiB
Minus 99.95%: 8.5 GiB

Estimated 8 men size: 8.5 GiB * (16.7 TiB / 149.2 GiB) = 974 GiB
So you haven't really thought it through at all and probably miss the fundamental understanding to do so. I bet that you are thinking that TBs are stored as a list of FENs.

Come back when you have something that works.