Felicity EGTB generators and probers

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

User avatar
phhnguyen
Posts: 1474
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Attempt 15: calculate index spaces for more men for chess

Post by phhnguyen »

hgm wrote: Thu Jun 27, 2024 4:29 pm
phhnguyen wrote: Thu Jun 27, 2024 1:24 pmGenerating an EGTB using on-fly large data stored on hard disks may slow down 5-10 times, compared with one using RAM only. Instead of waiting for a few months, we may have to wait for years!
But is stead of hours, you might have to wait a day. It probably would not be a very good idea from an economic point of view to spend many tens of thousands dollars on memory so that you could finish the job in a day instead of in a week, never having to use the memory again.

It all depends on what your final goal is, in terms of gigabyte-hours. If you have to work for a year to earn the money to buy a computer that does it in a day instead of a month with the computer you already have, it in fact took you a year instead of a month...
I think that is the art of balance: we need to balance what we want with our money (to buy computers, maintain, to pay electric bills…). As I have mentioned, we could save a lot of money by buying old workstations with huge RAM. That may be better than waiting for years and paying huge electric bills or buying expensive computers. Or we may ask someone to help (wait for heroes ;) ). Or just wait until the next time to upgrade our computers with enough RAM.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
phhnguyen
Posts: 1474
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

syzygy wrote: Thu Jun 13, 2024 11:41 pm Let's consider a 9-man ending with maximum symmetry such as KNNNNvKRRR. If we place the Ks togethers in 462 ways using the board's symmetry, then we have Bin(62,4) ways to place the NNNN and Bin(58,3) ways to place the RRR. With 1 byte per position, this would requires 2 * 462 * (62*61*60*59*58*57*56) / (24 * 6) bytes = 14.5 TB. Bourzutschky and Konoval can apparently at least do this ending with 1.5TB, so they do something "smarter".

And apparently they computed KQRBvKQRN, which would need 2 * 462 * 62*61*60*59*58*57 bytes = 37.2 TB.

I guess they don't store those tables, and they don't do pawns. The full 8-men set probably needs about 1 petabyte for compressed WDL alone. It is possible but not at all practical.

Code: Select all

  1)              knk                        36'096
  2)              krk                        36'096
  3)             krkn                     2'310'144
  4)             knnk                     1'137'024
  5)             krrk                     1'137'024
  6)            krknn                    72'769'536
  7)            krrkn                    72'769'536
  8)            knnnk                    23'498'496
  9)            krrrk                    23'498'496
 10)           krknnn                 1'503'903'744
 11)           krrknn                 2'292'240'384
 12)           krrrkn                 1'503'903'744
 13)           knnnnk                   358'352'064
 14)          krknnnn                22'934'532'096
 15)          krrknnn                47'372'967'936
 16)          krrrknn                47'372'967'936
 17)         krrknnnn               722'437'761'024
 18)         krrrknnn               979'041'337'344
 19)        krrrknnnn            14'930'380'394'496
My krrrknnnn has an index space of about 14.9T, quite close to yours. However, I need to use two arrays for two sides, 1 byte for each item, 14.9 x 2 x 1 byte = 29.8 TB
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
phhnguyen
Posts: 1474
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

Rein Halbersma wrote: Fri Jun 14, 2024 4:34 pm
With the Wu & Beal algo, they only need 1 bit in RAM (doing lots of sequential I/O from and to disk), so they would need only 0.9 TB in RAM.
And apparently they computed KQRBvKQRN, which would need 2 * 462 * 62*61*60*59*58*57 bytes = 37.2 TB.
I think they can do 1 bishop color at a time which means 18.6 TB on disk for that endgame before compression.

Doing 1-bit in RAM and a lot of sequential I/O (Wu & Beal algo) they need 1.16 TB of RAM for the generation of that table.
I guess they don't store those tables, and they don't do pawns. The full 8-men set probably needs about 1 petabyte for compressed WDL alone. It is possible but not at all practical.
If they do the Wu & Beal 1-bit RAM algo, it should take a lot of I/O back and from disk. A 1 TB bitmap from a HDD @ 80 MB/s takes 3,5 hours. With an SSD @200 MB/s it still takes 1,5 hours. *Per iteration*, so times 400+, should take about a month per tablebase. Perhaps they have very fast SSDs and manage not to trash them with all the I/O.
Wu & Beal’s work is so ambiguous, IMHO. They said they used only one bit for each position but then they wrote they still have full information such as DTC/DTM which need at least 8 bits (1 byte). Perhaps they used 1 byte (8 bits) when generating but stored into only 1 bit on hard disks? If it was true, they still need huge RAM for generating!

BTW, suppose it is 1 bit, their EGTB used a bitbase metric in the hardest form (1 bit only vs 2 bits). The most disadvantage of bitbases is that for some endgames we cannot make the progress for searching. It is true for chess and more than true for Xiangqi, from my own experience! The only public bitbase EGTB so far is Scorpio, supported by a complicated library to help progress. Even Scorpio EGTB is much smaller than Syzygy one but so far the number of chess engines supporting is so small. Not sure why. Perhaps, the way to integrate its library is too complicated!?

I guess Wu & Beal bitbase EGTB may be the same: it wouldn’t be a practical one even if they or someone else invested enough time to create a highly sophisticated supporting library code and then publish it.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
Rein Halbersma
Posts: 748
Joined: Tue May 22, 2007 11:13 am

Re: Felicity EGTB generators and probers

Post by Rein Halbersma »

phhnguyen wrote: Sat Jun 29, 2024 12:21 pm Wu & Beal’s work is so ambiguous, IMHO. They said they used only one bit for each position but then they wrote they still have full information such as DTC/DTM which need at least 8 bits (1 byte). Perhaps they used 1 byte (8 bits) when generating but stored into only 1 bit on hard disks? If it was true, they still need huge RAM for generating!
The Wu & Beal paper is really clear: they repeatedly do IO on bitmaps from disk, and they also keep a 1-byte per position DTM file. But that file is never brought into RAM, but only sequentially updated from bitmaps with the latest iteration's information. Their final format is not a bitbase, but full DTM information. It's a variatoin on Thompson's algorithm (that algorithm even works with less memory, as he doesn't even brings a single bitmap fully into memory).
User avatar
phhnguyen
Posts: 1474
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

Rein Halbersma wrote: Sat Jun 29, 2024 1:49 pm
phhnguyen wrote: Sat Jun 29, 2024 12:21 pm Wu & Beal’s work is so ambiguous, IMHO. They said they used only one bit for each position but then they wrote they still have full information such as DTC/DTM which need at least 8 bits (1 byte). Perhaps they used 1 byte (8 bits) when generating but stored into only 1 bit on hard disks? If it was true, they still need huge RAM for generating!
The Wu & Beal paper is really clear: they repeatedly do IO on bitmaps from disk, and they also keep a 1-byte per position DTM file. But that file is never brought into RAM, but only sequentially updated from bitmaps with the latest iteration's information. Their final format is not a bitbase, but full DTM information. It's a variatoin on Thompson's algorithm (that algorithm even works with less memory, as he doesn't even brings a single bitmap fully into memory).
I got it. Thanks for the clarification!

That means their EGTB was a “standard” DTM/DTC, 8 bits per item, stored completely on a hard disk when generating. The generator reads the whole data from the hard disk for each lap, which is very slow. The bit array plays the role of bit flags to do some tricks to speed up, say, reduce the number of readings but I guess still being slow (since seek-read is a very slow action).

Recently, I have been using a similar bit flag array, but 2 bits per item and all data is loaded into memory to gain maximum speed.

When working with huge endgames I may divide data by trunks which could fit into memory. I guess that may be faster.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
hgm
Posts: 27945
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Felicity EGTB generators and probers

Post by hgm »

The bitmap-based algorithm I have tried would generate a 'newly lost' bitmap in every iteration, which contains the positions with DTM equal to the iteration number. This can be saved on disk. (And for many iterations it would be very sparse, and thus highly compressible.) Afterwards you can combine these single-DTM bitmaps to a chunk-wise compressed byte-encoded DTM EGT. Since all files use the same indexing scheme this can be done sequentially, and you would never need to have more than one compression chunk in memory. The byte-encoded EGT is just written once. The much more highly compressed DTM bitmaps are written once, and then read once, and discarded.
noobpwnftw
Posts: 579
Joined: Sun Nov 08, 2015 11:10 pm

Re: Felicity EGTB generators and probers

Post by noobpwnftw »

Practically one would use various in RAM bitmaps to track "to do" positions and memory map the fat ply counting ones backed with files on disk, random access is likely inevitable but not too heavy with the help of those bitmaps.

Overengineering the indexing scheme is of little value, as we are already ahead of our times in terms of feasibility with common hardware. Nothing much has changed to make building 7-man chess tablebases any easier compared to a few years ago, while for xiangqi I'm still looking at a few years of compute into generating tablebases before available RAM runs out(there are much more combinations to work on).

Dealing with xiangqi rules involves multiple extra labeling passes during genration, it was quite an adventure: trial and error over multiple years to come across some bugs or corner cases. Code has been available for your reference, just FYI.
User avatar
phhnguyen
Posts: 1474
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

noobpwnftw wrote: Mon Jul 01, 2024 12:15 pm Practically one would use various in RAM bitmaps to track "to do" positions and memory map the fat ply counting ones backed with files on disk, random access is likely inevitable but not too heavy with the help of those bitmaps.

Overengineering the indexing scheme is of little value, as we are already ahead of our times in terms of feasibility with common hardware. Nothing much has changed to make building 7-man chess tablebases any easier compared to a few years ago, while for xiangqi I'm still looking at a few years of compute into generating tablebases before available RAM runs out(there are much more combinations to work on).
That is a surprise information. As I know, you started generating your Xiangqi EGTB about the time of completing Syzygy 7 men. Thus you have been generating it for about 6 years. Is that correct? Another a few years to add that before stopping? So long period and you are amazing so patient!
noobpwnftw wrote: Mon Jul 01, 2024 12:15 pm Dealing with xiangqi rules involves multiple extra labeling passes during genration, it was quite an adventure: trial and error over multiple years to come across some bugs or corner cases. Code has been available for your reference, just FYI.
Thank you a lot, Bojun, for your contribution to the chess community in general, and for your offer to help/code for this project in particular.

Definitely, I will ask for your help/code, just a bit later. I have been chasing and verifying some of my ideas on the rule issues for Xiangqi EGTBs. I have been trying to compound them into a method/solution by coding and descriptions. Then I will look into other codes to understand their solutions and get some comparisons.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
syzygy
Posts: 5590
Joined: Tue Feb 28, 2012 11:56 pm

Re: Attempt 15: calculate index spaces for more men for chess

Post by syzygy »

phhnguyen wrote: Thu Jun 27, 2024 1:24 pm
hgm wrote: Sat Jun 01, 2024 11:05 am Of course there are cheaper methods of data storage than RAM. Disk drives of 1TB are sort of standard.
syzygy wrote: Mon Jun 10, 2024 10:26 pm Generating the 7-piece tables on a big PC with at least 1TB of RAM is possible, too.
...
16 GB of RAM is enough to generate the 6-piece tables "in RAM".
I agreed we could use some computers with small memory sizes to generate endgames with multi-times larger sizes. However, the question is if it is worth doing that.

Generating an EGTB using on-fly large data stored on hard disks may slow down 5-10 times, compared with one using RAM only. Instead of waiting for a few months, we may have to wait for years!
Only a few passes of sequential disk writes and reads would be needed. I am not talking about disk-based generation but about RAM-based generation.
User avatar
phhnguyen
Posts: 1474
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Attempt 19: more compact for chess index space

Post by phhnguyen »

Attempt 19: more compact for chess index space

We reduced index space for chess. For 5 men chess EGTB, the index space sank about 13%. The EGTB size becomes 7.5 GB, a 15.5% reduction (compared with the previous size of 8.88 GB). However, the time to generate is about 29% longer (22 hours vs 17 hours). We are still not clear yet why/where the speed was lost.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager