Felicity EGTB generators and probers

Discussion of chess software programming and technical issues.

Moderators: hgm, Rebel, chrisw

Dann Corbit
Posts: 12615
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: Felicity EGTB generators and probers

Post by Dann Corbit »

Suggestion:
The compression speed is basically irrelevant since you only do it once. However, the decompression speed is extremely relevant since you do that every time you read a page of data.

Considering:
https://fuchsia.googlesource.com/third_ ... /README.md
I suggest using LZ4 HC -9
It gets a compression ratio of 2.72:1 and decompresses at 3240 MB/s, which is about half the speed of memcpy().
Taking ideas is not a vice, it is a virtue. We have another word for this. It is called learning.
But sharing ideas is an even greater virtue. We have another word for this. It is called teaching.
User avatar
phhnguyen
Posts: 1472
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

hgm wrote: Thu May 30, 2024 7:16 pm Can your generator handle KRC-KR?
Do you mean the endgame of three attackers RC vs R without any defenders?

Yes!

At the moment the generator can run with parameters to print all sub-endgames:

Code: Select all

fegtb -n krckr -subinfo 
It prints the below:

Code: Select all

  1)              kck            4'050
  2)              krk            4'050
  3)             krkc          364'500
  4)             krkr          364'500
  5)             krck          364'500
  6)            krckr       32'805'000

Total files: 6, total size: 33'906'600
The index space is only 33 M, which is very easy and fast to generate.

However, I did not add all the necessary code for Xiangqi since that is the later task. The code for permanent chasing/checking should be reviewed and tested thoroughly.

The program can print index space for RC vs R with all configurations of defenders.

Command line:

Code: Select all

fegtb -n rc-r -subinfo 
Prints out:

Code: Select all

…
448)     krcaabbkrabb  462'550'500'000
449)     krcaabbkraab  274'104'000'000
450)    krcaabbkraabb  805'180'500'000

Total files: 450, total size: 4'648'590'071'100
The last endgame with full defenders has an index space of over 800 G. All are about 4.6T. They require very expensive computers to generate and are out of reach for almost everyone, even for downloading.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
phhnguyen
Posts: 1472
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

Dann Corbit wrote: Fri May 31, 2024 1:19 am What is the maximum number of chessmen on the board that are possible to generate a tablebase for?
I use a 64-bit integer for indexing, thus it is the current limit of the generator. Perhaps, that is enough for 10-man chess EGTBs.

For Xiangqi, it can work with 3 attackers and full defenders (13 men).
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
phhnguyen
Posts: 1472
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Re: Felicity EGTB generators and probers

Post by phhnguyen »

Dann Corbit wrote: Fri May 31, 2024 1:26 am Suggestion:
The compression speed is basically irrelevant since you only do it once. However, the decompression speed is extremely relevant since you do that every time you read a page of data.

Considering:
https://fuchsia.googlesource.com/third_ ... /README.md
I suggest using LZ4 HC -9
It gets a compression ratio of 2.72:1 and decompresses at 3240 MB/s, which is about half the speed of memcpy().
Thanks for the information!

I will have some studies/attempts about compression!
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
hgm
Posts: 27931
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Felicity EGTB generators and probers

Post by hgm »

phhnguyen wrote: Fri May 31, 2024 2:22 am
hgm wrote: Thu May 30, 2024 7:16 pm Can your generator handle KRC-KR?
...

However, I did not add all the necessary code for Xiangqi since that is the later task. The code for permanent chasing/checking should be reviewed and tested thoroughly.
This is what I wondered about. I don't have that in my generator either, but would like to implement it. But I haven't thought out a good way to do it yet. Without the perpetual ban KRC-KR would be draw due to perpetual checking. It also contains some genuine draws due to 1-check, 1-chase perpetuals.

Perhaps I could do it through 'escape counting', i.e. reserve part of the DTM coding range for indicating the number of potentially forbidden moves to a non-won position, for positions that have no such moves that are always allowed. When the normal generation stagnates, these could then be resolved retrogradely for perpetuals.
User avatar
phhnguyen
Posts: 1472
Joined: Wed Apr 21, 2010 4:58 am
Location: Australia
Full name: Nguyen Hong Pham

Attempt 15: calculate index spaces for more men for chess

Post by phhnguyen »

Attempt 15: calculate index spaces for more men for chess


From a question (@Dann Corbit) about the limit of the generator, we have checked and added all necessary code/modifications to make sure the generator can work with more men for chess endgames.

When indexing, the generator combines similar chess pieces to save their index space. For example, two different chess pieces (say, a Rook and a Knight “rn”) have index space of 64 x 64 = 4,096, but the combination of two Rooks “rr” takes only 2016, saving more than half of the space. More similar pieces, much more savings. The previous code missed combinations for more than 3 pieces thus they can’t work with more than 3 attackers (5 men). Theoretically, a chess position can have a maximum of 10 similar pieces (say, 10 white Rooks). After adding the necessary code we could run the generator for more pieces of endgames.

That doesn’t mean we can build EGTBs easily for higher numbers of men since we are hardly limited by hardware and time to generate those endgames. For example, to build a 7-man EGTB, the Lomonosov team had to use supercomputers running multiple months to build it. We should use similar computing power to build our 7-man EGTB too which is too expensive and out of reach for almost all of us. Instead, we will get some interesting information with those EGTBs (higher numbers of pieces).

The generator can receive the name of endgames from command line arguments (e.g., “-n krkp”). If we give it the number of attackers, say -n 5, it will add all valid configurations of 5 attackers, plus all valid configurations of defenders (in the case of chess, it’s two Kings only, the total is 5+2 = 7 men).

With the argument “-subinfo” the generator will list all sub-endgames with their index size.


6 men

Command line:

Code: Select all

fegtb -n 4 -subinfo
The generator prints out:

Code: Select all

  1)              knk                        36'096
  2)              kbk                        36'096
  3)              krk                        36'096
  4)              kqk                        36'096
…
175)           krbkrb                 9'462'349'824
176)           krrknn                 2'292'240'384
177)           krrkbn                 4'657'250'304
…
509)           kpppkp                 1'499'355'648
510)           kppppk                   351'411'480

Total endgames: 510, total size: 3'421'720'926'408
One of the largest endgames: krkbnp, size: 22'724'739'072
The largest endgames have an index size of about 22 G, we need at least 22 x 2 = 44 GB RAM (to allocate two arrays for two sides, 1 byte per item). We may (highly likely) need to double that size (88 GB) if each item needs to be 2 bytes. The computers with that size of RAM are not cheap but still reachable for many of us.

In Attempt 12 we generated a 5-man EGTB, the index size is 21 G, the EGTB after compressing is 8.8 GB, and we have a compress ratio of 8.8 / 21 = 0.42. This 6-man has an index size of 3421 G. If we have a similar compression ratio, the size should be 3421 G x 0.42 = 1,436.82 GB = 1.44 TB. That size is larger than 1.2 TB of Nalimov 6-man but they all are still on the same level.

That size is about 163 times larger than a 5-man one. We have built 5-man in 17 hours. If we use the same computer, the new one may take 17 hours x 163 = 2,771 hours = 115.5 days.

Building 6-men EGTB is difficult, long time and not cheap. But it’s clearly feasible.


7 men

Code: Select all

1379)          kqbppkq    		534'031'368'192
1380)          kqbnpkp  		1'090'787'475'456
...
1467)          krpppkr  		127'945'015'296
1468)          krnppkp  		400'523'526'144
...
1510)          kppppkp                16'867'751'040
1511)          kpppppk                 3'092'421'024

Total endgames: 1511, total size: 439'858'437'385'704
One of the largest endgames: kqkrbnp, size: 1'454'383'300'608
We need a computer with 1.5x2x2 = 6 TB RAM (2 sides x 2 bytes) to fit the largest endgames. The EGTB index size is about 440 T, equal to 440 T x 0.42 = 185 TB. That is significantly larger than 140 TB of Lomonosov 7-man EGTB, but the number looks reasonable and on the same level too.

Look like we cannot build 7-men without having very expensive computers.

8 men

Code: Select all

581)          kqrkrrr        96'249'839'616
582)          kqrkqnn       298'064'019'456
583)          kqrkqbn       605'590'388'736
...
1568)         kqnkrnnn     6'159'989'735'424
1569)         kqnkrbnn    19'076'097'245'184
...
1657)         krbnkrnn    19'076'097'245'184
1658)         krbnkrbn    38'757'784'879'104
1659)         krbbknnn     3'031'869'947'904
...
4030)         kpppppkp               148'436'209'152
4031)         kppppppk                22'162'350'672

Total endgames: 4031, total size: 47'556'659'223'031'800
One of the largest endgames: kqnkrbnp, size: 93'080'531'238'912
EGTB size should be around 48 P x 0.42 = 20 PB.

We can’t build this EGTB with current hardware as well as bigger ones anyway.


9 men

Code: Select all


7192)        kqbnnnpkq           946'803'528'695'808
7193)        kqbnnnnkp           225'605'528'322'048
7194)        kqbbnnpkn         1'443'111'830'028'288
…
9750)        kppppppkp             1'063'792'832'256
9751)        kpppppppk               132'974'104'032

Total endgames: 9751, total size: 4'272'270'680'220'272'472
One of the largest endgames: kqbnkrbnp, size: 5'957'153'999'290'368

EGTB size should be around 4'272 P x 0.42 = 1794 PB


10 men

Some integer variants of 64-bit became overflow. We have to change them into unsigned 64-bit.

Code: Select all

19861)       kqrpppkbnn        16'507'977'653'551'104
19862)       kqrpppkbbn        16'507'977'653'551'104
19863)       kqrpppkbbb         5'330'701'117'292'544
19864)       kqrpppkrnn        16'507'977'653'551'104
19865)       kqrpppkrbn        33'540'018'089'754'624
…
21940)       kpppppppkp             6'382'756'993'536
21941)       kppppppppk               681'492'283'164

Total endgames: 21941, total size: 4'202'946'837'452'060'244
One of the largest endgames: kqrbnkrbnp, size: 381'257'855'954'583'552
The index size is about 4203 P, smaller than 9 men. We guest the variant for the total number (it is an unsigned 64-bit integer) is overflowed. However, the number of endgames and their sizes are still valid and good for reference.

11 men
We have to wait a very long time (over 1 hour) when the program is computing for 11 men. Since this computing is very rare to run, we didn’t try to speed up the code.

Code: Select all

28952)      kqrrbkqbbbp     1'908'755'913'850'748'928
28953)      kqrrbkqrnnp     5'910'986'055'795'867'648
28954)      kqrrbkqrbnp    12'009'622'462'569'381'888
28955)      kqrrbkqrbbp     5'910'986'055'795'867'648
...
46240)      kbpppppppkp           408'496'447'586'304
46241)      krpppppppkp           408'496'447'586'304
46242)      kqpppppppkp           408'496'447'586'304
46243)      knppppppppk            43'615'506'122'496
46244)      kbppppppppk            43'615'506'122'496
46245)      krppppppppk            43'615'506'122'496
46246)      kqppppppppk            43'615'506'122'496
46247)      kpppppkpppp           601'723'282'849'920
46248)      kppppppkppp           383'320'017'222'912
46249)      kpppppppkpp           149'994'789'348'096
46250)      kppppppppkp            32'711'629'591'872
Total endgames: 46250, total size: 6'501'485'980'035'256'340
One of the largest endgames: kqrbnpkrbnp, size: 18'300'377'085'820'010'496
A gain, the total size (6501 P) is incorrect because of being overflowed. The number of endgames and their sizes are still valid.

Fixing issues of being overflowed requires some good effort. Thus we stopped here.
https://banksiagui.com
The most features chess GUI, based on opensource Banksia - the chess tournament manager
User avatar
hgm
Posts: 27931
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Felicity EGTB generators and probers

Post by hgm »

Of course there are cheaper methods of data storage than RAM. Disk drives of 1TB are sort of standard.

End-games with Pawns can be factorized in so-called P-slices, which would need only a fraction of the memory to calculate, and can be generated one after the other. So in many cases the required hardware is much smaller than what you show here; it is just the time needed for running it that goes up with the increased number of men.

E.g. a (sloppy) 8-fold symmetric 7-men EGT has 640G positions (10 essentially different positions for the white king). During generation you can represent the EGT conveniently as a set of bitmaps, each 80GB. Doing one-sided building (won for white vs non-won for white) you would need one bitmap for the white-to-move positions, and one for the black-to-move positions, on disk. The number of lost positions will grow in each iteration, and you could keep all the btm maps, to take the differences later to deduce the DTM. The early iterations are very sparsely populated with lost positions, so they can be efficiently compressed.

The EGT can be considerd as a matrix of bits. E.g. for 4-vs-3 men there are 2560K white constellations and 256K black constellations. Those could be stored as contiguous 4K x 1KB blocks. Iterations either use only columns or rows of the matrix, and to load a complete set of black constellations (so all black moves stay within the set) you need 256 blocks of 4MB, or 1GB. That fits easily in the RAM of a 4GB machine, and contains the black rows for 4K white constellations. A single row takes only 32KB, and easily fits in cache. You make all black moves in it to get retrogradely from the wtm map to the btm move before sending it back to RAM.

Because of the locality of king moves you never need more than 4 white king locations present in memory simultaneously, so 1M constellations. That requires loading of 256 (non-contiguous) blocks, which contain white partial columns for 1K black constellations. Again 1GB, nicely buffered in RAM.

So nowadays you don't really need a super-computer to do a 7-men. A 4GB machine with a 1 TB HD should do.
User avatar
hgm
Posts: 27931
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Felicity EGTB generators and probers

Post by hgm »

I think I have a viable algorithm for handling perpetuals. The basic idea is, when generating white wins, to specially mark btm positions that have no 'safe' moves to a non-won wtm position left, but still have some such unsafe moves. as candidate losses. (Where 'safe' means non-checking and not attacking any unprotected piece with non-K or non-P, and no R with H or C.) And then mark all wtm predecessors of such marked btm positions as candidate wins.

If an iteration added candidate losses, we then reaxamine all those, to see if any of their unsafe 'escapes' goes to a wtm position not marked as candidate win. If so, their status is chaned to 'cleared loss', and that of their predecessors to 'cleared wins'. This can lead to further clearing of their btm predecessors. If there is nothing left to clear, the remaining canditates are changed into wins and losses, as these correspond to checking/chasing cycles without exit. The cleared positions then revert back to candidates, and we continue with the next iteration.

I usually represent each board position as a byte, one bit for indicating won wtm positions, and 7 bits for the DTM of the lost btm positions. (That allows DTM up to 128, which is rathere generous.) But we need extra coding for the candidate and cleared marking. Wtm positions where white is checked can never be lost with btm, though, as black would capture the King. So we can use that to allow encoding more info of these wtm positions. For chasing this is less clear; if the chasing side (i.e. b.ack) had the move here he could convert to a simpler end-game by capturing the chased piece. Usually that would be to a non-lost position, as he now gained a piece, meaning there is also no DTM to encode in the btm position before the capture.

So we can reserve codes 2-201 for combinations of DTM 1-100 for btm, and the wtm won/non-won state, while codes 0 and 1 would indicate the wtm state for positions that are still undecided with btm. Codes 202-255 would then be available for marking positions that are parts of potential checking/chasing loops.

The generation would start with marking all positions where white is 'threatened', i.e. in check or has a hanging piece. The code in the 204-253 range should indicate which piece is threatened, to distinguish and allow 1-chases-many loops. During iteration this makes it easy to see whether black moves are safe by probing the EGT. Codes 202-205 are used to indicate btm candidate and cleared positions, with wtm won/non-won state. To avoid having to revert cleared positions to candidates every iteration, we swap the meaning of those codes every iteration.

If a so-far undecided btm position at some point during generation loses its last safe escape, we give it the current DTM code for candidate loss(without altering the wtm state). All wtm predecessors that were not yet won, and have threatened codes, will get those codes changed to the candidate form for the specific threat(s).

Clearing will have to be done in a way that is specific for the threatened piece. Candidate losses that have a move to a cleared win

* to be continued...
syzygy
Posts: 5587
Joined: Tue Feb 28, 2012 11:56 pm

Re: Attempt 15: calculate index spaces for more men for chess

Post by syzygy »

phhnguyen wrote: Sat Jun 01, 2024 9:59 amThat doesn’t mean we can build EGTBs easily for higher numbers of men since we are hardly limited by hardware and time to generate those endgames. For example, to build a 7-man EGTB, the Lomonosov team had to use supercomputers running multiple months to build it. We should use similar computing power to build our 7-man EGTB too which is too expensive and out of reach for almost all of us. Instead, we will get some interesting information with those EGTBs (higher numbers of pieces).
The Lomonosov team implemented a cluster-based generator (because they had a cluster to run it on).

Generating the 7-piece tables on a big PC with at least 1TB of RAM is possible, too.
The largest endgames have an index size of about 22 G, we need at least 22 x 2 = 44 GB RAM (to allocate two arrays for two sides, 1 byte per item). We may (highly likely) need to double that size (88 GB) if each item needs to be 2 bytes. The computers with that size of RAM are not cheap but still reachable for many of us.
You can avoid doubling by saving the table to disk and then reducing the range of values before you start overflowing. (You only need to save the information that you "throw away" by reducing the range.)

16 GB of RAM is enough to generate the 6-piece tables "in RAM".
syzygy
Posts: 5587
Joined: Tue Feb 28, 2012 11:56 pm

Re: Felicity EGTB generators and probers

Post by syzygy »

hgm wrote: Sat Jun 01, 2024 11:05 amSo nowadays you don't really need a super-computer to do a 7-men. A 4GB machine with a 1 TB HD should do.
I wonder if anyone has ever managed to implement this, though :-)

Perhaps the generator of Marc Bourzutschky and Yakov Konoval uses such an approach, but I don't think they can do pawnful tables.

(I am not saying it cannot be done.)