disk-based 8-piece TB generator

Koistinen
Posts: 34
Joined: Sun May 23, 2021 10:05 pm
Location: Stockholm, Sweden
Full name: Urban Koistinen

Re: disk-based 8-piece TB generator

Post by Koistinen »

syzygy wrote: Fri May 29, 2026 11:43 pm
Koistinen wrote: Fri May 29, 2026 4:15 pm Nice this is going again. I did some estimation for how long it would take using 150 * 24TB hdds for just white wins and estimate 5 hours per 8-man.

Still takes about 3 years just to compute them all! (5*4795h is about 3 years)
5 hours per 8-man on average is definitely way too optimistic. KNNNNvKRR didn't take too long, but it is 48x smaller than some others.

The I/O estimate looks about right.
The cost of looking at all moves in each iteration looks marginally right, though possibly slower on a 32-core Threadripper.
Then there is compression using RLE of the zeroes between ones for sparse tables, which looks like it could be more expensive than I thought. (I did not think much about it.)
Also, the estimate only covers computing the blessed wins/cursed losses results, not the more distant wins/losses.
To recover, I need to learn how to use GPUs, as they seem to be better than CPUs for these operations. (32GB AMD compute card)
This opens up the possibility of using GPU-friendly compression to compress the large white-to-move-and-wins table, which might reduce the I/O cost.
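
For readers unfamiliar with the idea, here is a minimal sketch of run-length encoding the zeroes between ones in a sparse bit table. It is only an illustration under my own assumptions: the function names `encode_gaps` and `decode_gaps` are hypothetical, the table is a plain Python list of bits, and no entropy coding of the gap lengths is attempted. It is not the code used by any generator discussed in this thread.

```python
from typing import Iterable, List


def encode_gaps(bits: Iterable[int]) -> List[int]:
    """Return the length of the zero run before each 1, then the trailing zero run."""
    gaps = []
    run = 0
    for b in bits:
        if b:
            gaps.append(run)  # zeroes seen since the previous 1
            run = 0
        else:
            run += 1
    gaps.append(run)  # zeroes after the last 1 (possibly 0)
    return gaps


def decode_gaps(gaps: List[int]) -> List[int]:
    """Inverse of encode_gaps: rebuild the bit list from the zero-run lengths."""
    bits = []
    for g in gaps[:-1]:
        bits.extend([0] * g)
        bits.append(1)
    bits.extend([0] * gaps[-1])
    return bits


table = [0, 0, 1, 0, 0, 0, 0, 1, 1, 0]
gaps = encode_gaps(table)
assert decode_gaps(gaps) == table
print(gaps)
```

The sparser the table, the fewer gap values there are to store, but every bit still has to be visited once, which is where the per-iteration cost comes from.
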
syzygy
Posts: 6028
Joined: Tue Feb 28, 2012 11:56 pm

Re: disk-based 8-piece TB generator

Post by syzygy »

Koistinen wrote: Wed Jun 10, 2026 9:27 pm To recover, I need to learn how to use GPUs, as they seem to be better than CPUs for these operations. (32GB AMD compute card)
This opens up the possibility of using GPU-friendly compression to compress the large white-to-move-and-wins table, which might reduce the I/O cost.

Using the GPU for compression/decompression of intermediate files is an interesting idea.

The current compression method I use is probably too expensive (Zstd level 6), but what is best will depend on what will turn out to be the main bottleneck.

My final compressed format is rather expensive to compute, but it results in tables that are cheap to probe, which is also worth something (though not really necessary for generation, for which it will be better to decompress the compressed subtables, in particular when pawns are involved).

If this turns out to be too limiting I might have to create a "dumb" compressed block format that takes less time to generate, which could be converted at a later time to a more "probable" format.
syzygy
Posts: 6028
Joined: Tue Feb 28, 2012 11:56 pm

Re: disk-based 8-piece TB generator

Post by syzygy »

KQQQvKQQQ took 6 hours to generate and 1 hour 20 minutes to compress (where it helps a bit that the table is symmetrical).
WDL tablebase file is 4.3 GB.
DTZ tablebase file is 4.9 GB.

80510647360 positions are wins.
79671 positions are cursed wins.
17683275948 (15971007586) positions are draws.
1078 positions are blessed losses.
71869231796 positions are losses.
In agreement with Bourzutschky's numbers.

I have now written verification code that tests the consistency of WDL and verifies the statistics of the decompressed DTZ table. (I have not yet run it on this table.)

Longest win for white (loss for black):
[pgn][Event "DTZ50-optimal mating line"]
[FEN "4q3/8/6k1/Q6q/Q7/5q2/8/1QK5 b - - 0 1"]
1...Kg7 2.Qba1+ Kh7 3.Qc2+ Kg8 4.Q1a2+ Kg7 5.Qab2+ Kg8 6.Qaa2+ Qff7 7.Qg2+ Kh7
8.Qab1+ Qfg6 9.Qgb7+ Kg8 10.Q1a2+ Qgf7 11.Q2g2+ Kh7 12.Qac2+ Kh6 13.Qcd2+ Kh7
14.Qd3+ Kh6 15.Qgd2+ Kg7 16.Qdb2+ Kg8 17.Q2g2+ Qfg6 18.Qa2+ Qgf7 19.Qg3+ Kh7
20.Qc2+ Kh6 21.Qd2+ Kh7 22.Qdd3+ Kh6 23.Qb6+ Qfg6 24.Qf4+ Kg7 25.Qbc7+ Qgf7
26.Qfg3+ Kh6 27.Qd2+ Kh7 28.Qdc2+ Kh6 29.Qgd6+ Kg7 30.Qg2+ Kh7 31.Qd3+ Kh6
32.Qgd2+ Kg7 33.Qb2+ Kg8 34.Qg2+ Qfg6 35.Qa2+ Qef7 36.Qcd8+ Kg7 37.Q3d4+ Kh6
38.Qad2+ Kh7 39.Q4h8#[/pgn]
Impressively, SF does not need long to report a mate in about 50 moves.

Longest cursed win for white:
[pgn][Event "DTZ50-optimal mating line"]
[FEN "6k1/8/q5q1/5Q2/8/4Q3/8/1KQ1q3 w - - 0 1"]
1.Qb3+ Kg7 2.Qb2+ Kh7 3.Qh2+ Kg8 4.Qb8+ Kh7 5.Qc7+ Kg8 6.Qc8+ Kh7 7.Qd7+ Kh8
8.Qxe1 Qgb6+ 9.Kc2 Qa2+ 10.Kd3 Qaa6+ 11.Ke4 Qc4+ 12.Kf3 Qbb3+ 13.Qe3 Qf1+
14.Ke4 Qg2+ 15.Qef3 Qg4+ 16.Ke5 Qb8+ 17.Kd5 Qa8+ 18.Kc5 Qa5+ 19.Kd6 Qb6+
20.Qfc6 Qb8+ 21.Kd5 Qb3+ 22.Ke5 Qb2+ 23.Ke6 Qg8+ 24.Ke7 Qbg7+ 25.Kd6 Qb8+
26.Kd5 Qb3+ 27.Qc4 Qf7+ 28.Qde6 Qfb7+ 29.Kd4 Qa7+ 30.Qfc5 Qd7+ 31.Qcd6 Qa7+
32.Kd5 Qf3+ 33.Qee4 Qff7+ 34.Qde6 Qfb7+ 35.Qec6 Qf7+ 36.Qee6 Qf3+ 37.Kd6 Qb8+
38.Kc5 Qa3+ 39.Kd4 Qf4+ 40.Qee4 Qf2+ 41.Ke5 Qe7+ 42.Kd5 Qef7+ 43.Qee6 Q7f3+
44.Kd6 Qf8+ 45.Ke5 Qe3+ 46.Q4e4 Qg5+ 47.Q6f5 Qfg7+ 48.Kd5 Qd2+ 49.Qd3 Qf7+
50.Kc5 Qa7+ 51.Kc4 Qaa2+ 52.Kd4 Qb4+ 53.Ke5 Qe7+ 54.Qfe6 Qa5+ 55.Kd4 Qeb4+
56.Qec4 Qa7+ 57.Kd5 Qf7+ 58.Qe6 Qbb7+ 59.Ke5 Qh5+ 60.Qdf5 Qh2+ 61.Qff4 Qh5+
62.Qef5 Qe7+ 63.Kd4 Qd1+ 64.Qfd3 Qa7+ 65.Kd5 Qb7+ 66.Kc5 Qh5+ 67.Qff5 Qe7+
68.Kb6 Qg6+ 69.Qc6 Qb4+ 70.Kc7 Qg3+ 71.Kd7 Qg7+ 72.Kc8 Kg8 73.Qb3+ Kh8 74.Qxb4
Qg6 75.Qh3+ Kg7 76.Qxg6+ Kxg6 77.Qf8 Kg5 78.Qff5#[/pgn]
Ajedrecista
Posts: 2251
Joined: Wed Jul 13, 2011 9:04 pm
Location: Madrid, Spain.

Re: Disk-based 8-piece TB generator.

Post by Ajedrecista »

Hello Ronald:
syzygy wrote: Sat Jun 20, 2026 3:38 pm [...]

80510647360 positions are wins.
79671 positions are cursed wins.
17683275948 (15971007586) positions are draws.
1078 positions are blessed losses.
71869231796 positions are losses.
In agreement with Bourzutschky's numbers.

[...]
Your numbers also agree with Kirill's: 80510647360 + 79671 + 17683275948 + 1078 + 71869231796 = 170063235853. :-)
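
The sum is easy to re-check. A small sketch, with the five counts copied from the quoted statistics; the dictionary name `wdl_counts` is just for illustration:

```python
# WDL statistics for KQQQvKQQQ as quoted above.
wdl_counts = {
    "wins": 80510647360,
    "cursed wins": 79671,
    "draws": 17683275948,
    "blessed losses": 1078,
    "losses": 71869231796,
}

total = sum(wdl_counts.values())
# Total number of positions reported for kqqqkqqq.
assert total == 170063235853
print(total)
```
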

------------
syzygy wrote: Sat Jun 20, 2026 3:38 pm KQQQvKQQQ took 6 hours to generate and 1 hour 20 minutes to compress (where it helps a bit that the table is symmetrical).
WDL tablebase file is 4.3 GB.
DTZ tablebase file is 4.9 GB.

[...]
Taking advantage of Kirill's data, I found the following distribution of 8-man endgames by pieces per side:

8-man endgames:

------------------------------------------
AvB        # EG      %
------------------------------------------
4v4         630 (  25.00 % =  3/12 = 1/4 )
5v3        1050 (  41.67 % =  5/12       )
6v2         630 (  25.00 % =  3/12 = 1/4 )
7v1         210 (   8.33 % =  1/12       )
------------------------------------------
SUM        2520 ( 100.00 % = 12/12 = 1/1 )
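
These counts can be reproduced by simple combinatorics, assuming (my assumption, but it matches the table) that each side has a king plus men drawn with repetition from the five types Q, R, B, N, P, and that swapping colours does not give a new endgame. A minimal sketch, with hypothetical helper names:

```python
from math import comb

PIECE_TYPES = 5  # Q, R, B, N, P; the two kings are always present


def multisets(n_types: int, k: int) -> int:
    """Number of multisets of size k drawn from n_types kinds."""
    return comb(n_types + k - 1, k)


def endgame_count(a: int, b: int) -> int:
    """Number of a-vs-b endgames (kings included in a and b), colours not distinguished."""
    side_a = multisets(PIECE_TYPES, a - 1)
    side_b = multisets(PIECE_TYPES, b - 1)
    if a == b:
        # Unordered pairs of material sets, identical sets allowed.
        return side_a * (side_a + 1) // 2
    return side_a * side_b


counts = {f"{a}v{8 - a}": endgame_count(a, 8 - a) for a in (4, 5, 6, 7)}
print(counts, sum(counts.values()))
```

The 4v4 case is the only one where both sides can hold the same material, hence the separate formula for unordered pairs.
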
By number of positions, which should correlate roughly with the size of the EGTB:

8-man endgames:

--------------------------------------
AvB        # Positions           %
--------------------------------------
4v4        1.17329e+16         30.73 %
5v3        1.77959e+16         46.62 %
6v2        0.73552e+16         19.27 %
7v1        0.12922e+16          3.38 %
--------------------------------------
SUM        3.81763e+16        100.00 %

(Beware of roundings).
Last but not least, symmetrical 8-man endgames can only happen in 4v4. Only 35 of the 630 4v4 endgames are symmetrical (1/18 ~ 5.56% of 4v4, or 1/72 ~ 1.39% of the whole set):

8-man symmetrical endgames:

------------------------------
Endgame            # Positions
------------------------------
kqqqkqqq          170063235853
kqqrkqqr         1854965830355
kqqbkqqb         2025928326072
kqqnkqqn         2157253830552
kqrrkqrr         2222255026372
kqrbkqrb         9795153736992
kqrnkqrn        10315589168492
kqbbkqbb         2616262536312
kqbnkqbn        11134285386816
kqnnkqnn         2930263172361
krrrkrrr          292976501052
krrbkrrb         2927254317292
krrnkrrn         3053674525863
krbbkrbb         3155996644052
krbnkrbn        13285404719716
krnnkrnn         3462859059723
kbbbkbbb          367877074303
kbbnkbbn         3520072792860
kbnnkbnn         3702448916268
knnnknnn          428547118433
kqqpkqqp         5170329017740
kqrpkqrp        24553459051137
kqbpkqbp        26637892584626
kqnpkqnp        28069200598364
krrpkrrp         7220504861317
krbpkrbp        31563662838963
krnpkrnp        32949517918464
kbbpkbbp         8408841201557
kbnpkbnp        35416072138858
knnpknnp         9235506139956
kqppkqpp         4139161281300
krppkrpp         4855950632029
kbppkbpp         5216820247688
knppknpp         5447420605464
kpppkppp          349857828160

That makes a total of circa 3.08653e+14 positions in symmetrical tables, around 0.8085 % of all positions in 8-man endgames. Sadly, the symmetrical tables offer little help, given their low weight in the whole set. There will be proportionally fewer and fewer symmetrical tables at higher total piece counts.

Regarding number of pawns: 20 of the symmetrical tables are pawnless, 10 have 2 pawns (1 per side), 4 have 4 pawns (2 per side) and 1 has 6 pawns (3 per side, the obvious kpppkppp).
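
The breakdown by pawns can be checked by enumerating the material of one side, assuming a king plus three men drawn with repetition from q, r, b, n, p. This is only a sketch; the variable names are mine:

```python
from collections import Counter
from itertools import combinations_with_replacement

# Every multiset of three men for one side; the other side mirrors it.
sides = list(combinations_with_replacement("qrbnp", 3))

# Table names in the same style as the list above, e.g. "kqqrkqqr".
names = ["k" + "".join(s) + "k" + "".join(s) for s in sides]

# Number of symmetrical tables keyed by pawns per side.
by_pawns = Counter(s.count("p") for s in sides)

print(len(names), dict(by_pawns))
```

Here the keys of `by_pawns` are pawns per side, so key 1 corresponds to the tables with 2 pawns in total.
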

Just to finish my analysis of Kirill's data: the symmetrical table with the fewest positions is kqqqkqqq, while the one with the most positions is kbnpkbnp. Including non-symmetrical tables as well, kqqqqqqk has the fewest positions overall and krbnpkbn has the most.

Regards from Spain.

Ajedrecista.