disk-based 8-piece TB generator

Discussion of chess software programming and technical issues.

Moderator: Ras

syzygy
Posts: 6007
Joined: Tue Feb 28, 2012 11:56 pm

disk-based 8-piece TB generator

Post by syzygy »

viewtopic.php?t=85908&start=60
Nordlandia wrote:How feasible is undertaking 8-man tables or partly about let's say 2030 around. Is the equipment up to snuff, or do we need something like Lomonosov II SuperComputer contraption or deemed obsolete by 2030 to commence 8-man EGTBs generation.
I don't see this happening before 2030 is over, but at least I have now written a generator that in principle can do this on "normal" hardware (and today I have added a README):
https://github.com/syzygy1/tb8gen

The generator not only generates the tablebase but also compresses it, essentially using the same method as used for the regular Syzygy TBs. However, the 7- and 8-piece tablebase files are internally split into (many) more separately compressed tables, which means the existing Syzygy probing code cannot read those files. (On the plus side, this new internal layout gives an improved compression ratio.) New probing code is available:
https://github.com/syzygy1/probetool/tree/new_format

Hardware requirements are (1) lots of disk space, and (2) 128 GB of RAM for the largest pawnless 8-piece tablebases (those without any identical pieces of the same color; there are not so many of them among the 8-piece tablebases) and 64 GB or less once there are identical pieces of the same color. For tables with one pawn, 32 GB RAM should be enough.

The generator is Linux-only for now and will need an x86-64 CPU with fast pext/pdep instructions (so Zen3+ or Intel).

The amount of disk space needed during generation is much less than I had originally feared (thanks to regular data compression), but for 8-piece tables it will probably still be in the many terabytes.

The very largest compressed 8-piece TB files will probably be around 5 TB in size, the complete set (WDL+DTZ) somewhere between 1.5 and 2 PB. Generation of a single pawnless table will probably take multiple weeks if not months, and pawnful tables are much worse. So realistically this is not going to happen anytime soon, if ever.

While the generator is running, it can be interrupted at any time using CTRL-C (or simply killing it). Restarting the generator will then pick up where it left off.

Things still to be improved:
- The threading code should be tuned. It now splits the K-slices in RAM into too few work blocks, and it splits the data written/read from/to disk in blocks that are too small.
- Some performance improvement should be possible by overlapping disk I/O with computation, in particular when calculating predecessors and verifying successors.
- For very sparse K-slices written to disk, the generator should compress the indices of the 1-bits instead of the bits themselves (with various speed benefits).
- I still have to write a RAM-based generator for 8-piece TBs with 2+ pawns.
syzygy
Posts: 6007
Joined: Tue Feb 28, 2012 11:56 pm

Re: disk-based 8-piece TB generator

Post by syzygy »

I have now generated two 8-piece tablebases.

The first covers the somewhat unlikely case of KNNNNNNvK.
Longest win for white according to the DTZ metric is 21 ply:
[pgn][Event ""]
[FEN "N1N5/1N6/N7/8/N7/8/N7/1K1k4 b - - 0 1"]
1...Ke1 2.Na8b6 Kd1 3.Nc4 Ke1 4.Nd2 Ke2 5.Nf3 Kd1 6.Ne1 Kd2 7.Nbc5 Ke2 8.N4c3+
Kf1 9.Ne2 Kf2 10.Ka1 Kf1 11.Ncd3 Kxe2 12.Nc3+ Kf1 13.Ne2 Kxe2 14.Kb1 Kf1 15.Nf3
Ke2 16.Nd2 Kd1 17.Nb6 Ke2 18.Nd5 Kd1 19.N5f4 Kxd2 20.Nac5 Ke3 21.Nfe6 Kd2
22.Ne5 Kd1 23.Ne4 Ke2 24.Kc1 Ke3 25.Ng3 Kf2 26.Nf5 Ke1 27.Nf4 Kf1 28.Ned3 Kg1
29.Ng3 Kh2 30.Nge2 Kh1 31.Ne1 Kh2 32.Nf3+ Kh1 33.Ng3#[/pgn]
So in this sequence of moves, black's aim is to maximize (and white's to minimize) the number of moves to either mate or a capture.

The WDL file is 47,975,056 bytes = 45.75 MB. The DTZ file is 1,761,580,304 bytes = 1.64 GB.
This includes a 6!=720-fold reduction in size thanks to the 6 white knights (but with 8 pieces, any pawnless KX..vK table will have at least 2 pairs of identical white pieces (4x reduction) or a triple of of identical white pieces (6x reduction)).

WDL is much smaller than DTZ in this case because the material is very unbalanced, which means win/draw/loss is very predictable (-> WDL compresses better) and almost all positions have a DTZ value, only the draws being "don't care" (-> DTZ compresses worse).

The second is the more interesting (even though unlikely to ever happen in a real game) KNNNNvKRR.
Longest win for white is 48 ply:
[pgn][Event ""]
[FEN "r5N1/8/4N2k/5N2/4N3/7r/8/2K5 b - - 0 1"]
1...Kh7 2.Nge7 Kh8 3.Ng6+ Kh7 4.Ne5 Kh8 5.Nf7+ Kg8 6.Nfg5 Rh1+ 7.Kb2 Kh8 8.Nf7+
Kg8 9.Ne5 Kh8 10.Ng6+ Kg8 11.Nge7+ Kf7 12.N6g5+ Kf8 13.Ng6+ Ke8 14.Nf6+ Kd8
15.Ne6+ Kc8 16.Nge7+ Kb8 17.Nd7+ Ka7 18.Nd6 Rh2+ 19.Kc1 Rh1+ 20.Kd2 Rh2+ 21.Ke3
Rc2 22.Kd3 Rc1 23.Kd2 Ra1 24.Nc6+ Ka6 25.Nec5#[/pgn]
Longest win for black is 63 ply:
[pgn][Event ""]
[FEN "N7/8/3N4/8/6r1/NN1r4/3k4/1K6 b - - 0 1"]
1...Kc3 2.Na1 Rb4+ 3.Kc1 Rh3 4.Kd1 Rd4+ 5.Ke1 Re3+ 6.Kf2 Kd3 7.Nb3 Rf4+ 8.Kg2
Ke2 9.Nc1+ Kd1 10.Ndc4 Re7 11.Nb2+ Kd2 12.Nb1+ Kc2 13.Na3+ Kc3 14.Nb1+ Kd4
15.Nb3+ Ke3 16.Nd1+ Ke2 17.Ndc3+ Ke1 18.Nc5 Rf2+ 19.Kh3 Rh7+ 20.Kg3 Rg7+ 21.Kh3
Rfg2 22.N3e4 Rg1 23.Kh2 R7g2+ 24.Kh3 Rg8 25.Kh2 R1g2+ 26.Kh3 R2g6 27.Nd3+ Kd1
28.Nbc3+ Kc2 29.Ne1+ Kb2 30.Nd3+ Ka1 31.Nf4 Rh6+ 32.Nh5 Rhxh5#[/pgn]
The statistics of the table are identical to those generated by Marc Bourzutschky (except that I keep statistics per ply and he per move).

For KNNNNvKRR, the WDL file is 20,363,170,448 bytes = 18.96 GB and the DTZ file is 5,118,495,312 = 4.77 GB.
This includes a 24x2 = 48-fold size reduction resulting from the NNNN and RR symmetries.

DTZ is relatively small because there are many draws:

Code: Select all

wtm:
59408961325 positions are wins.
221896147707 (97316801998) positions are draws.
13372865143 positions are losses.

btm:
74472858997 positions are wins.
193049671544 (142464119987) positions are draws.
4397957478 positions are losses.
The "(97316801998)" and "(142464119987)" are draws by an immediate capture. These can be encoded by the compression algorithm as a loss in the WDL table. In general that is not a win here because draws including capture draws are more dominant over losses (43.9x), than capture-draws plus losses over non-capture-draws (2.9x). (Still the compression algorithm may have encoded some capture-draws as losses where this happened to create a long run of losses.)

Considering the size of KNNNNvKRR.rtbw, a somewhat balanced pawnless table without any identical pieces of the same color could be expected to take up about 48 x 20 = 960 GB. This is only a very rough estimate because all tables are different.

Total generation time for KNNNNvKRR was about 5 hours 40 minutes, where I used a RAM disk to store all intermediary files. This fit in 256 GB, though with very little room to spare. Larger tables will clearly need disk I/O, which will slow things down, but probably not to a crawl even if a HDD is used. And an SSD will probably survive a fair number of tablebase generations (contrary to my initial expectations).

While the generation was running, I realised that the temporary files holding the "potential losses" can be dropped more quickly if the pass that calculates btm losses-in-(n+1) from wtm wins-in-n alternates between finding potential losses and verifying potential losses for every of the 10 positions of the white king. This reduces the maximum number of potential loss files on disk from 462 to 58. With just 58 it should be possible to hold them all (compressed) in RAM, at least for the majority of iterations.

Anothing thing that became clear is that there should be quite a bit to gain from tuning the compression of intermediate files (and adding a new compression method for very sparse bitmaps), in particular during later iterations. And earlier iterations could perhaps benefit from a forward pass instead of a retro pass.

Roughly half the generation time went into the final compression phase. I expect this phase to scale well for larger tables (and this phase only involves very little I/O). Still it would be nice if I could speed this up.

If I can speed up the later iterations by working on the compression of intermediate files, then 6 hours * 50 = 12.5 days per largest pawnless table would seem reasonable (on a system like mine, 9950x3d, 256 GB). Perhaps some days have to be added if an HDD is used. And perhaps improvements can still shave quite a bit of this time.

So this is already looking quite doable, if we ignore the major problem of storing (and distributing) all the generated files.

For the largest pawnful tables (with 1 pawn), the generation time might be about a month.

Tables with 2+ pawns can be generated in RAM (per PP-slice or PvP-slice), which will be faster. But the compression phase will take about as long as for tables with 1 pawn. (Of course tables with two pawns of the same color start are about half the size, and pawns in general have fewer squares to stand on, which also helps.)
TommyTC
Posts: 39
Joined: Thu Mar 30, 2017 8:52 am

Re: disk-based 8-piece TB generator

Post by TommyTC »

[d]r5N1/8/4N2k/5N2/4N3/7r/8/2K5 b - - 0 1

Hmmm. This position can never be reached from a legal game of chess. What was White's last move?
User avatar
Ajedrecista
Posts: 2246
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Disk-based 8-piece TB generator.

Post by Ajedrecista »

Hello Tommy:
TommyTC wrote: Thu May 28, 2026 7:04 pm [d]r5N1/8/4N2k/5N2/4N3/7r/8/2K5 b - - 0 1

Hmmm. This position can never be reached from a legal game of chess. What was White's last move?
Good catch: two knights (f5 and g8) can not give check simultaneously in a legal position. This does not blur Ronald's exceptional work over the years and I encourage him once again! :-)

Regards from Spain.

Ajedrecista.
syzygy
Posts: 6007
Joined: Tue Feb 28, 2012 11:56 pm

Re: disk-based 8-piece TB generator

Post by syzygy »

Yes, I had also noticed that this position was illegal.
It would be possible to set all positions with the king being checked by two knights as illegal, but it would mean that the statistics would deviate from those of Bourzutschky (and probably from the NULP legal position counts computed by Kiril Kryukov), and it is helpful to be able to compare results. Also, where do you stop once you start removing them...
chrisw
Posts: 4967
Joined: Tue Apr 03, 2012 4:28 pm
Location: Anywhere but the Western Empire
Full name: Christopher Whittington

Re: disk-based 8-piece TB generator

Post by chrisw »

syzygy wrote: Fri May 29, 2026 12:05 am Yes, I had also noticed that this position was illegal.
It would be possible to set all positions with the king being checked by two knights as illegal, but it would mean that the statistics would deviate from those of Bourzutschky (and probably from the NULP legal position counts computed by Kiril Kryukov), and it is helpful to be able to compare results. Also, where do you stop once you start removing them...
That’s fun!
Is this exhaustive rule set ….
Triple check impossible
For double checks ….
One check by N (including promotions) other check by QRB
One check by pawn (including ep) other check by QR on file, or, if ep, by QR on rank
One check by B, other check by QR on manhattan
One check by R, other check by QB on diagonal
One check by Q, covered by other cases.
I think this can be shrunk more if the check piece is the moving piece.
Koistinen
Posts: 32
Joined: Sun May 23, 2021 10:05 pm
Location: Stockholm, Sweden
Full name: Urban Koistinen

Re: disk-based 8-piece TB generator

Post by Koistinen »

Nice this is going again. I did some estimation for how long it would take using 150 * 24TB hdds for just white wins and estimate 5 hours per 8-man.
Still takes about 3 years just to compute them all! (5*4795h is about 3 years)

Code in Odin but main result included so you don't have to compile/run it:

Code: Select all

package eg_counter

import "core:fmt"

P:=[10]rune{'Q','R', 'B', 'N', 'P', 'q', 'r', 'b', 'n', 'p',}

main :: proc() {
  for n := 1; n <= 30; n += 1 {
    count : int
    eg: [10]int
    eg[0] = n
    for eg[5] != n {
      /*
      for i:=0;i<10;i+=1 {
        for k:=0;k<eg[i];k+=1 do fmt.print(P[i])
      }
      fmt.println()
      */
      count += 1
      i := 9
      for eg[i] == 0 do i -= 1
      if i == 9 {
        m := eg[i]
	eg[i] = 0
	for eg[i] == 0 do i -= 1
	eg[i+1] += m
      }
      eg[i] -= 1
      eg[i+1] += 1
    }
    fmt.println(n+2, ": ", count)
  }
}

// 3 :  5
// 4 :  40
// 5 :  185
// 6 :  645
// 7 :  1876
// 8 :  4795
// 9 :  11110
// 10 :  23815
// 11 :  47905
// 12 :  91377
// 13 :  166595
// 14 :  292110
// 15 :  495040
// 16 :  814130
// 17 :  1303628
// 18 :  2038130
// 19 :  3118565
// 20 :  4679510
// 21 :  6898045
// 22 :  10004379
// 23 :  14294500
// 24 :  20145125
// 25 :  28031250
// 26 :  38546625
// 27 :  52427505
// 28 :  70580055
// 29 :  94111815
// 30 :  124367660
// 31 :  162970720
// 32 :  211868756
My plans are to give it a go this summer and if I fail I'll try to pass it on for someone else to do.
User avatar
Ajedrecista
Posts: 2246
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Disk-based 8-piece TB generator.

Post by Ajedrecista »

Hello Urban:
Koistinen wrote: Fri May 29, 2026 4:15 pm Nice this is going again. I did some estimation for how long it would take using 150 * 24TB hdds for just white wins and estimate 5 hours per 8-man.
Still takes about 3 years just to compute them all! (5*4795h is about 3 years)

Code in Odin but main result included so you don't have to compile/run it:

Code: Select all

package eg_counter

import "core:fmt"

P:=[10]rune{'Q','R', 'B', 'N', 'P', 'q', 'r', 'b', 'n', 'p',}

main :: proc() {
  for n := 1; n <= 30; n += 1 {
    count : int
    eg: [10]int
    eg[0] = n
    for eg[5] != n {
      /*
      for i:=0;i<10;i+=1 {
        for k:=0;k<eg[i];k+=1 do fmt.print(P[i])
      }
      fmt.println()
      */
      count += 1
      i := 9
      for eg[i] == 0 do i -= 1
      if i == 9 {
        m := eg[i]
	eg[i] = 0
	for eg[i] == 0 do i -= 1
	eg[i+1] += m
      }
      eg[i] -= 1
      eg[i+1] += 1
    }
    fmt.println(n+2, ": ", count)
  }
}

// 3 :  5
// 4 :  40
// 5 :  185
// 6 :  645
// 7 :  1876
// 8 :  4795
// 9 :  11110
// 10 :  23815
// 11 :  47905
// 12 :  91377
// 13 :  166595
// 14 :  292110
// 15 :  495040
// 16 :  814130
// 17 :  1303628
// 18 :  2038130
// 19 :  3118565
// 20 :  4679510
// 21 :  6898045
// 22 :  10004379
// 23 :  14294500
// 24 :  20145125
// 25 :  28031250
// 26 :  38546625
// 27 :  52427505
// 28 :  70580055
// 29 :  94111815
// 30 :  124367660
// 31 :  162970720
// 32 :  211868756
My plans are to give it a go this summer and if I fail I'll try to pass it on for someone else to do.
Glad to see you posting here! I will warn with my best intention about something that I do not understand:

I understand your count of 5 endgames for 3 pieces (KQk, KRk, KBk, KNk and KPk), but I do not understand the counts for more pieces. I am absolutely sure that you know about 'Number of Unique Legal Positions (NULP) in chess endgames' project. There is a research about counting chess endgames and those results do not match with yours. Looking into 4 pieces, there are two counts on NULP: 30 by endgame and 55 by endgame + side to move (just open the links and search a hyphen). I can work out 5*2 + 30 = 40, so a kind of accumulated count, with the *2 being {KQk, KRk, KBk, KNk, KPk} and swapping colours {Kkq, Kkr, Kkb, Kkn, Kkp}.

Going up to 6 pieces, your result is 645 while NULP's is 365 by endgame (link). Going to the Syzygy EGTB server on Lichess, there are 365 files in either DTZ, DTZ-NR and WDL metrics (just search .r in each folder). Again, I can work out 185*2 + 365 = 645. However, with the same logic, I can not work out your count for 8 pieces: 1876*2 + 1001 = 4753 < 4795. Could you explain your reasoning and/or check your code, please? Thank you in advance.

Since I am not a programmer, I hardly understand your code: I understand that you start listing each piece by type and colour except the kings, because there are always one per side ('K' and 'k', fine for me); you loop from 1 to 30, this is, without kings, as before (fine for me again); and fireworks start for me, probably with something commented between /* and */; more fireworks after the block comment; and printing each count for n+2 (including both kings, as seen before, which is fine for me). NULP's result is 2520 for 8 pieces. If that is confirmed, your forecast will be down to 12600 hours ~ 17 to 18 months, if you keep the insane-to-me average of 5 hours/endgame (only 5 hours to generate a whole 8-man EGTB? :shock:). 18 months could be reduced to only 3 if six people collaborate and split in blocks of endgames... OK, I know, don't count your chickens before they hatch.

Other issues raise: electricity bills while generating the EGTB, storage (probably around 2 PB without possible intermediate files) and distribution (here is the story about the generation and distribution of some 8-man EGTB). After all these obvious things have been said... I wish you very good luck! Please keep up your great work. :-)

Regards from Spain.

Ajedrecista.
Koistinen
Posts: 32
Joined: Sun May 23, 2021 10:05 pm
Location: Stockholm, Sweden
Full name: Urban Koistinen

Re: Disk-based 8-piece TB generator.

Post by Koistinen »

Ajedrecista wrote: Fri May 29, 2026 8:24 pm Hello Urban:
Koistinen wrote: Fri May 29, 2026 4:15 pm // 3 : 5
// 4 : 40
// 5 : 185
Glad to see you posting here! I will warn with my best intention about something that I do not understand:

I understand your count of 5 endgames for 3 pieces (KQk, KRk, KBk, KNk and KPk), but I do not understand the counts for more pieces. I am absolutely sure that you know about 'Number of Unique Legal Positions (NULP) in chess endgames' project. There is a research about counting chess endgames and those results do not match with yours. Looking into 4 pieces, there are two counts on NULP: 30 by endgame and 55 by endgame + side to move (just open the links and search a hyphen). I can work out 5*2 + 30 = 40, so a kind of accumulated count, with the *2 being {KQk, KRk, KBk, KNk, KPk} and swapping colours {Kkq, Kkr, Kkb, Kkn, Kkp}.

Going up to 6 pieces, your result is 645 while NULP's is 365 by endgame (link). Going to the Syzygy EGTB server on Lichess, there are 365 files in either DTZ, DTZ-NR and WDL metrics (just search .r in each folder). Again, I can work out 185*2 + 365 = 645. However, with the same logic, I can not work out your count for 8 pieces: 1876*2 + 1001 = 4753 < 4795. Could you explain your reasoning and/or check your code, please? Thank you in advance.

Since I am not a programmer, I hardly understand your code: I understand that you start listing each piece by type and colour except the kings, because there are always one per side ('K' and 'k', fine for me); you loop from 1 to 30, this is, without kings, as before (fine for me again); and fireworks start for me, probably with something commented between /* and */; more fireworks after the block comment; and printing each count for n+2 (including both kings, as seen before, which is fine for me). NULP's result is 2520 for 8 pieces. If that is confirmed, your forecast will be down to 12600 hours ~ 17 to 18 months, if you keep the insane-to-me average of 5 hours/endgame (only 5 hours to generate a whole 8-man EGTB? :shock:). 18 months could be reduced to only 3 if six people collaborate and split in blocks of endgames... OK, I know, don't count your chickens before they hatch.

Other issues raise: electricity bills while generating the EGTB, storage (probably around 2 PB without possible intermediate files) and distribution (here is the story about the generation and distribution of some 8-man EGTB). After all these obvious things have been said... I wish you very good luck! Please keep up your great work. :-)

Regards from Spain.

Ajedrecista.
I only do W/DL.
This simplifies each separate calculation but means I need to do Qr and Rq as two separate computations.

The commented out code is good for debugging as it prints out each class to be computed.

The 5h/computation is a very rough guesstimate mainly based on i/o dominated by the wtm and winning bitmap being read and written 3 times per iteration. (.3GB/s per hdd)

My new flat does not meter electricity! (Cost of the hdds dominate anyway.)
syzygy
Posts: 6007
Joined: Tue Feb 28, 2012 11:56 pm

Re: disk-based 8-piece TB generator

Post by syzygy »

Koistinen wrote: Fri May 29, 2026 4:15 pm Nice this is going again. I did some estimation for how long it would take using 150 * 24TB hdds for just white wins and estimate 5 hours per 8-man.

Still takes about 3 years just to compute them all! (5*4795h is about 3 years)
5 hours per 8-man on average is definitely way too optimistic. KNNNNvKRR didn't take too long, but it is 48x smaller than some others.

But 4795 seems to be overcounting it.

The total number of tables for 3-8 pieces:

Code: Select all

3-piece		5
4-piece		30
5-piece		110
6-piece		365
7-piece		1001
8-piece		2520
The 8-piece tables can be nicely split into:
- 868 pawnless tables
- 792 tables with exactly one pawn
- 860 tables with 2+ pawns.

The 868 pawnless tables are now the "easy" ones.
The 792 tables with one pawn are the hard ones (although they need the least RAM to generate). They will take up the most space.
The 860 tables with 2+ pawns can be generated in reasonable RAM, which should mean that they can be generated relatively quickly.

All the 8-men pawnless tables can be generated independently and thus in parallel (you just need to download the right 7-men WDL tables, which isn't a big deal).
When that is done, all 8-men tables with 1 pawn can be generated independently (but you need to download the right 8-men WDL tables, which starts to be an issue).
What that is done, all 8-men tables with 2 pawns, then 3 pawns, etc.

The number of 8-men subtables you need to generate a specific pawnful table isn't too large, but if each table is multiple TBs, it is a lot of data. Still, this seems within reach if you don't live in a backward country like Germany.

The real problem is: who wants to host 2 PB of data.