On-demand tablebase generation

Discussion of chess software programming and technical issues.

Moderator: Ras

syzygy
Posts: 5780
Joined: Tue Feb 28, 2012 11:56 pm

Re: On-demand tablebase generation

Post by syzygy »

hgm wrote:Only extremely rarely. The presence of the Pawn breaks the symmetry. With a single Pawn that is a certainty on a board of even width. With an even number of Pawns you can indeed have symmetric Pawn structures, but they are rare, and the more Pawns you have, the more exceptional they are.

So the symmetry plays no role during generation. It is just that you only have to generate for half the Pawn structures (e.g. only a-d Pawns for single-Pawn end-games). The other half you can simply reflect after the generation. (Or, probably more efficiently, reflect in the probing code.)
I would say you are making use here of the "rather simple" 2-fold symmetry that Thomas refers to.
tpetzke
Posts: 686
Joined: Thu Mar 03, 2011 4:57 pm
Location: Germany

Re: On-demand tablebase generation

Post by tpetzke »

Only extremely rarely. The presence of the Pawn breaks the symmetry
The presence of the pawn reduces the 8 fold to 2 fold. But it still saves 50% of the positions, not rarely but always.

I only consider then positions with the white king at File A - D, all positions with the WK at file E - H are mirrored vertically.
Thomas...

=======
http://macechess.blogspot.com - iCE Chess Engine
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: On-demand tablebase generation

Post by hgm »

But you cannot exploit that in a generator. The number of positions that need to be taken into account simultaneously when there is a Pawn present is just as large in the presence of left-right asymmetric pieces as it would be in their absence. With 8-fold symmetry you can really save 87% on storage requirement. But you will move out of the canonical symmetry sector all the time, and will have to remap those moves to the canonical sector all the time. With Pawns the mirror image lives a separate life. So it doesn't affect the generation whether it exists or not. Only the probing.

Basing symmetry on the Kinglocation is of course not a good thing to do. You should base it on the Pawn.
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: On-demand tablebase generation

Post by Evert »

hgm wrote:This is promising:

I get all 464 mates of KBNK (plus all 4.4M won wtm positions) in 3ms now, for completely unsymmetric generation (2MB bitmaps). My old unsymmetric generator took 160ms for this.

Now let's see how fast I can iterate. (Must debug that first. And currently it still does it the slow way, scanning the entire bitmap for set bits, rather than using a directory map to remember where the words with set bits are.)
Pretty impressive!
I haven't tried to optimise the initialisation at all (the total time will be dominated by the iterations anyway) and mine currently takes 35ms for KBNK, with 8-fold symmetry. I guess I should try to take that down a bit at some point.

Building the full tablebase (double sided) takes about a second now, but I'm hopeful that I can get that down to under a second by recording the position of the second piece in "newly won/lost" positions. I currently only do that for the first piece (so I don't have to test every position while looping over the board), which got me a nice speedup.

For 3-men my generator now takes ~10ms to build the tablebase, which at the moment I don't think I can get much faster. Something may jump out at me if I let the code lie for a bit.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: On-demand tablebase generation

Post by hgm »

I am still struggling with the iteration. The first two come out right, but I get 3192 mate-in-3 positions in stead of 3216. I went through the tedious process of printing all keys of mated-in-2 and mate-in-3 in the same format for my old, correct generator and the new one, so that with sort and diff I can figure out what is missing. Turns out 36 are missing (all parents of the same four symmetric mated-in-2 position on a1/a2 and h1/h2), while there are two groups of 8 extra.

Anyway, the (faulty) bulding completes in 0.78 sec now. When I was using clock() to time this, it was only 0.53 sec, but clock() was not precise enough to time individual iterations. So I switched to gettimeofday(), and then it suddenly took longer! I guess clock() reports user CPU time, and apparently a lot of CPU time gets lost while waiting for memory allocation.

I still hope to speed this up, by using a scan directory, and a more efficient generation of black-king retro-moves. But I am not sure how much there is to gain. Scans that hardly decide any positions are already very fast compared to those that do a lot of retro-moving (7 ms vs 50 ms).
tpetzke
Posts: 686
Joined: Thu Mar 03, 2011 4:57 pm
Location: Germany

Re: On-demand tablebase generation

Post by tpetzke »

For 3-men my generator now takes ~10ms to build the tablebase, which at the moment I don't think I can get much faster
I don't know your HW but it seems much faster than mine. Mine takes (32 bit, Laptop) about 0,06 sec for KRK and KQK and 0,09 sec for KPK.

Do you have anything in to avoid looping over all positions but knowing what positions to look for in each iteration ?

Thomas...
Thomas...

=======
http://macechess.blogspot.com - iCE Chess Engine
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: On-demand tablebase generation

Post by Evert »

tpetzke wrote:
For 3-men my generator now takes ~10ms to build the tablebase, which at the moment I don't think I can get much faster
I don't know your HW but it seems much faster than mine. Mine takes (32 bit, Laptop) about 0,06 sec for KRK and KQK and 0,09 sec for KPK.
64 bit, i7 @ 2.9GHz.
Do you have anything in to avoid looping over all positions but knowing what positions to look for in each iteration ?
I do now. I don't remember exactly what speeds I was getting before that, but I think it was something in the order of 30ms (for a 3-men). However, there's nothing really special about it: it's just the list of "new" positions found in an iteration that was mentioned earlier.

The way I implemented it is as follows:

I have two arrays of bitboards, indexed by the king positions (462 elements, for 8-fold symmetry), called "new" and "old" (not the best names). Initially both are cleared to 0.

When I store a "new" position (whether newly won or newly lost) I mark the position of the third piece on the corresponding "new" bitboard. The "old" array is not used during the first (initialisation) step.

After each iteration, I flip the two arrays around: "old" now contains a record of which positions were recently added, "new" is now empty. Now when I iterate over the positions, I iterate over the king positions as normal, but I pull the position of the third piece from the "old" array. This way I completely skip over positions where there's no work to be done anyway.

You can apply the same idea to the position of the (other) king and the fourth piece (in a 4-piece ending), but my experience so far is that you do run into diminishing returns very quickly.

I currently don't, but probably should, record a bitbase of "won" positions, to speed up detection of whether a particular position is "lost". Currently I have to probe into the tablebase to verify this.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: On-demand tablebase generation

Post by hgm »

OK, it runs. As the generator does not treat individual positions, but 'king bundles' of 64 black King positions at once, it does not automatically count how many new positions were generated in each iteration. So I have to count them afterwards, which is of course extra overhead. So I equipped it with a --verbose mode that does such counting. This allows me to verify that the counts are correct. Without counting, the wall-clock time it takes to generate KBNK is 853 msec. (2.4GHz i3 laptop).

Code: Select all

hgm@hgm-xboard:~/EGT$ ./slow4 --verbose
start = 645380
 1.      312  4476548 ( 0.008)
 2.      296     1200 ( 0.008)
 3.     1792     3216 ( 0.010)
 4.     8064    15960 ( 0.009)
 5.    11184    39752 ( 0.023)
 6.    15144    45192 ( 0.016)
 7.    11860    48396 ( 0.012)
 8.    14588    40408 ( 0.012)
 9.    13936    42720 ( 0.020)
10.    24968    56240 ( 0.011)
11.    49324    95208 ( 0.015)
12.    68520   142084 ( 0.023)
13.    84852   189592 ( 0.022)
14.    62924   190296 ( 0.025)
15.    49912   139684 ( 0.026)
16.    55312   118004 ( 0.022)
17.    54704   114200 ( 0.016)
18.    53968   119780 ( 0.022)
19.    78824   127480 ( 0.024)
20.   142764   216552 ( 0.017)
21.   236304   350136 ( 0.022)
22.   338024   484868 ( 0.032)
23.   363400   576832 ( 0.037)
24.   473768   584456 ( 0.039)
25.   651668   729040 ( 0.040)
26.   997800   949808 ( 0.039)
27.  1353824  1278384 ( 0.044)
28.  1815448  1531500 ( 0.049)
29.  2027508  1434268 ( 0.040)
30.  1484940   859400 ( 0.041)
31.   553388   261004 ( 0.030)
32.    85424    33580 ( 0.017)
33.     2960     1104 ( 0.015)
34.        0        0 ( 0.007)
total time 1.543 sec, cpu time 1.270 sec
hgm@hgm-xboard:~/EGT$ ./slow4
start = 645380
 1. ( 0.008)
 2. ( 0.012)
 3. ( 0.013)
 4. ( 0.013)
 5. ( 0.011)
 6. ( 0.028)
 7. ( 0.021)
 8. ( 0.013)
 9. ( 0.029)
10. ( 0.019)
11. ( 0.013)
12. ( 0.023)
13. ( 0.026)
14. ( 0.026)
15. ( 0.026)
16. ( 0.018)
17. ( 0.023)
18. ( 0.024)
19. ( 0.024)
20. ( 0.018)
21. ( 0.028)
22. ( 0.030)
23. ( 0.037)
24. ( 0.039)
25. ( 0.038)
26. ( 0.041)
27. ( 0.043)
28. ( 0.047)
29. ( 0.041)
30. ( 0.037)
31. ( 0.033)
32. ( 0.024)
33. ( 0.010)
34. ( 0.014)
total time 0.853 sec, cpu time 0.650 sec
What is finally produced is a series of bitmaps, win / vs non-win for white. The white-to-move positions produce a bitbase, the black-to-move positions a bitmap for each DTM indicating the positons with that DTM. These could be accumulated into a bitbase, or converted to a DTM table.

This version still scans through all lost positions on every iteration, to hunt for mated-in-N positions. But an iteration that doesn't find much takes only about 7 ms, while the iterations that produce the bulk of the decided positions takes nearly 50ms. So there doesn't seem much to gain by making the scan smarter.
User avatar
hgm
Posts: 28396
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: On-demand tablebase generation

Post by hgm »

If I make a quick hack to turn it into a 3-men generator (only filling the part of the 4-men tables that would correspond to the 4th piece on a1 in a 4-men), disabling the looping over the 4th man, skipping the initialization of its attack set and commenting out the setting of its occupied bit, it takes 30ms to do KRK:

Code: Select all

hgm@hgm-xboard:~/EGT$ ./slow3 --verbose
start = 645380
 0.      216    75236 ( 0.000)
 1.      624     1512 ( 0.000)
 2.     1948     4676 ( 0.000)
 3.      648     3852 ( 0.000)
 4.     1584     1900 ( 0.000)
 5.     3768     4848 ( 0.000)
 6.     4728     8708 ( 0.000)
 7.     5444    11320 ( 0.000)
 8.    11448    17172 ( 0.000)
 9.    13672    20088 ( 0.001)
10.    15872    19016 ( 0.000)
11.    22788    20476 ( 0.001)
12.    28732    21480 ( 0.000)
13.    33516    17824 ( 0.001)
14.    36372    16136 ( 0.001)
15.    17284     5244 ( 0.001)
16.     3056      916 ( 0.001)
total time 0.029 sec, cpu time 0.030 sec
KGK takes twice as long (due to the much longer max-DTM, no doubt).

Code: Select all

hgm@hgm-xboard:~/EGT$ ./slow3 --verbose
start = 6453a0
 0.       36    44508 ( 0.000)
 1.       26       74 ( 0.000)
 2.       66      110 ( 0.000)
 3.       82      308 ( 0.000)
 4.      148      372 ( 0.000)
 5.      220      568 ( 0.000)
 6.      176      758 ( 0.000)
 7.      180      616 ( 0.000)
 8.      240      670 ( 0.000)
 9.      276      818 ( 0.000)
10.      436      984 ( 0.000)
11.      574     1426 ( 0.000)
12.      706     1664 ( 0.000)
13.      872     1988 ( 0.001)
14.     1240     2178 ( 0.000)
15.     1500     2854 ( 0.000)
16.     2472     3482 ( 0.000)
17.     3088     5476 ( 0.000)
18.     4576     6172 ( 0.000)
19.     6544     8252 ( 0.000)
20.     9402    11140 ( 0.001)
21.    13610    13778 ( 0.000)
22.    17218    17246 ( 0.000)
23.    18914    18068 ( 0.000)
24.    18204    17426 ( 0.001)
25.    16590    17096 ( 0.000)
26.    14014    15130 ( 0.000)
27.    12684    12876 ( 0.000)
28.    10208    10698 ( 0.000)
29.     8418     7644 ( 0.000)
30.     5126     4694 ( 0.000)
31.     3508     2506 ( 0.000)
32.     2194     1498 ( 0.000)
33.     1056      896 ( 0.000)
34.      322      246 ( 0.000)
35.      104       74 ( 0.000)
36.       10        4 ( 0.000)
37.       10        2 ( 0.000)
total time 0.077 sec, cpu time 0.060 sec
[Edit] Oops! That was the verbose version, and it still counts the entire 4-men table, also the unused parts! When I run without counting, both KRK and KGK take 6 msec.
User avatar
Evert
Posts: 2929
Joined: Sat Jan 22, 2011 12:42 am
Location: NL

Re: On-demand tablebase generation

Post by Evert »

Nice!

It's somewhat reassuring that my generator at the moment is running in the same ballpark - although it uses the symmetry, which does reduce the size of the problem considerably, so I guess I'm about a factor of 10 slower really. Not too unhappy with that at this point.

I have to make some changes to my verification code; right now it tests whether forward moves from a position lead to positions with a consistent score. It should really test both forward and reverse moves. Hopefully the generated tablebases still pass the test.

After that I'll be adding the option of doing pawns. I'm thinking of adding an arbitrary number of "fixed" pawns and one "mobile" pawn to start with, and assuming the game to be "won" when I mate the enemy king, or promote my pawn in a way that doesn't result in it being captured immediately, or succeed in capturing an enemy pawn without suffering a recapture immediately.