EGTB value

Discussion of chess software programming and technical issues.

Moderator: Ras

wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: EGTB value

Post by wgarvin »

I don't think it matters. Bitbases might be a different issue, since they are memory-resident and don't add the msecs-per-read I/O penalty. But for any other type of "tablebases" I think this test will be pretty definitive, whether we like the answer or not. I might then try bitbases to see if there is any gain from using them in the search, and perhaps EGTBs at the first ply or two only to make progress...
Engines can use endgame databases for two purposes:
(1) Cut off branches in the search with a known GTV (game-theoretic value: win, loss, or draw). This involves a LOT of probes, and all you care about is win, loss, or draw.
(2) Play out the rest of the game once a database position is reached over the board. This could use DTM/DTC/DTZ50 if you have them, but you only do a few dozen probes per board move, so it's okay for it to be pretty slow.

(2) is not really interesting because the game is already decided at that point. The interesting case is (1): steering the search towards known-won positions, away from known-lost ones, and either towards or away from known draws depending on how we think we're doing.

Obviously, (1) only requires bitbases, not EGTBs such as Nalimov. If you use some kind of database with more information in it than just the GTV, all you're doing is wasting RAM and slowing down the engine. The Nalimov storage format also does not seem optimized for high-speed probing: you have to read entire compressed blocks from disk (which takes time even if it's an SSD), decompress most or all of the block, etc.

Since (1) is the main thing you need them for, bitbases can be designed for fast probing, fast decompression, etc. WDL data probably has a lot less entropy than DTM/DTC/DTZ50 values, so clever compressed representations are probably possible.

Even (2) can probably be done quite well without the exact DTM, DTC, or DTZ50 values. Darkthought had heuristics for all 4-man endings and didn't have any trouble winning those. In principle those could be extended to 5- and 6-man. The bitbase prevents it from making mistakes that don't conserve GTV; all you need the heuristic for is to make enough progress to win before the 50-move rule hits you. Of course the GTV values in the bitbases should be based on DTZ50 databases, so if an optimal opponent could delay the outcome past 50 moves, the bitbase should score it as a draw.

I have wondered for a long time whether bitbases could be stored in RAM in a compressed format that still supports random access. Then they wouldn't take as much RAM to cache, and when reading pages of them from disk, you wouldn't have to decompress the pages before satisfying probes from them. If you were willing to be slightly lossy (i.e. return "don't know" for some small fraction of positions, rather than a correct GTV), that might be useful too.
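The compressed-but-random-access idea can be sketched roughly like this (my own illustration, not any existing bitbase format): compress fixed-size blocks independently and keep them indexed, so a probe decompresses only the one small block it needs rather than the whole file.

```python
import zlib

BLOCK = 4096  # positions per compressed block

class CompressedBitbase:
    """Sketch of a blockwise-compressed table with random access.

    Each fixed-size block of WDL bytes is deflated independently, so a
    probe decompresses only the one small block it needs, not the file.
    (A real bitbase would pack values much tighter; this is just the idea.)
    """
    def __init__(self, raw: bytes):
        self.blocks = [zlib.compress(raw[i:i + BLOCK])
                       for i in range(0, len(raw), BLOCK)]

    def probe(self, index: int) -> int:
        block = zlib.decompress(self.blocks[index // BLOCK])
        return block[index % BLOCK]

raw = bytes((i * 7) % 3 for i in range(20000))   # fake W/L/D values
bb = CompressedBitbase(raw)
assert bb.probe(12345) == raw[12345]             # random access works
```

A small per-block cache of recently decompressed blocks would cover the "satisfy probes without decompressing pages" part.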
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: EGTB value

Post by bob »

wgarvin wrote:
I don't think it matters. Bitbases might be a different issue, since they are memory-resident and don't add the msecs-per-read I/O penalty. But for any other type of "tablebases" I think this test will be pretty definitive, whether we like the answer or not. I might then try bitbases to see if there is any gain from using them in the search, and perhaps EGTBs at the first ply or two only to make progress...
Engines can use endgame databases for two purposes:
(1) Cut off branches in the search with a known GTV (game-theoretic value: win, loss, or draw). This involves a LOT of probes, and all you care about is win, loss, or draw.
(2) Play out the rest of the game once a database position is reached over the board. This could use DTM/DTC/DTZ50 if you have them, but you only do a few dozen probes per board move, so it's okay for it to be pretty slow.

(2) is not really interesting because the game is already decided at that point. The interesting case is (1): steering the search towards known-won positions, away from known-lost ones, and either towards or away from known draws depending on how we think we're doing.

Obviously, (1) only requires bitbases, not EGTBs such as Nalimov. If you use some kind of database with more information in it than just the GTV, all you're doing is wasting RAM and slowing down the engine. The Nalimov storage format also does not seem optimized for high-speed probing: you have to read entire compressed blocks from disk (which takes time even if it's an SSD), decompress most or all of the block, etc.

Since (1) is the main thing you need them for, bitbases can be designed for fast probing, fast decompression, etc. WDL data probably has a lot less entropy than DTM/DTC/DTZ50 values, so clever compressed representations are probably possible.

Even (2) can probably be done quite well without the exact DTM, DTC, or DTZ50 values. Darkthought had heuristics for all 4-man endings and didn't have any trouble winning those. In principle those could be extended to 5- and 6-man. The bitbase prevents it from making mistakes that don't conserve GTV; all you need the heuristic for is to make enough progress to win before the 50-move rule hits you. Of course the GTV values in the bitbases should be based on DTZ50 databases, so if an optimal opponent could delay the outcome past 50 moves, the bitbase should score it as a draw.

I have wondered for a long time whether bitbases could be stored in RAM in a compressed format that still supports random access. Then they wouldn't take as much RAM to cache, and when reading pages of them from disk, you wouldn't have to decompress the pages before satisfying probes from them. If you were willing to be slightly lossy (i.e. return "don't know" for some small fraction of positions, rather than a correct GTV), that might be useful too.
Since we probe Nalimov compressed files randomly, on disk, I don't see why the same idea would not work for bitbases, and I thought that was already done. The only issue is the computational cost of the decompression. If you take a typical EGTB that has only 1-byte values, you should be able to compress a 1.5-bit value (w/l/d) and get maybe 5 of those in a byte. So, doing some ugly rounding, our 7.5 GB of 3-4-5-piece Nalimov tables turns into maybe 1.5 GB of bitbase data? In theory this might compress better, I am not sure. 1 GB is certainly doable. And I was thinking that current bitbase implementations were already doing this, but I am not sure.

Certainly one can easily derive a full set of bitbase values from the Nalimov files, rather than having to generate them...
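The 5-values-per-byte arithmetic checks out, since 3^5 = 243 ≤ 256. A sketch of such base-3 packing (illustrative only, not how any particular bitbase is actually stored):

```python
def pack_wdl(values):
    """Pack W/L/D trits (0=loss, 1=draw, 2=win) five per byte:
    3**5 = 243 <= 256, so five values fit exactly in one byte."""
    out = bytearray()
    for i in range(0, len(values), 5):
        b = 0
        for v in reversed(values[i:i + 5]):
            b = b * 3 + v        # base-3 digits, first value least significant
        out.append(b)
    return bytes(out)

def unpack_wdl(packed, count):
    """Inverse of pack_wdl: recover `count` trits."""
    values = []
    for b in packed:
        for _ in range(5):
            values.append(b % 3)
            b //= 3
    return values[:count]

wdl = [2, 1, 0, 2, 2, 1, 0]
assert unpack_wdl(pack_wdl(wdl), len(wdl)) == wdl

# Bob's rounding: 7.5 GB of 1-byte-per-position data shrinks 5x,
# to roughly 1.5 GB, before any entropy coding at all.
print(7.5 / 5)  # → 1.5
```

Real compression on top of this would shrink it further, since the packed bytes are far from uniformly distributed.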
Dann Corbit
Posts: 12803
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: EGTB value

Post by Dann Corbit »

bob wrote:
wgarvin wrote:
I don't think it matters. Bitbases might be a different issue, since they are memory-resident and don't add the msecs-per-read I/O penalty. But for any other type of "tablebases" I think this test will be pretty definitive, whether we like the answer or not. I might then try bitbases to see if there is any gain from using them in the search, and perhaps EGTBs at the first ply or two only to make progress...
Engines can use endgame databases for two purposes:
(1) Cut off branches in the search with a known GTV (game-theoretic value: win, loss, or draw). This involves a LOT of probes, and all you care about is win, loss, or draw.
(2) Play out the rest of the game once a database position is reached over the board. This could use DTM/DTC/DTZ50 if you have them, but you only do a few dozen probes per board move, so it's okay for it to be pretty slow.

(2) is not really interesting because the game is already decided at that point. The interesting case is (1): steering the search towards known-won positions, away from known-lost ones, and either towards or away from known draws depending on how we think we're doing.

Obviously, (1) only requires bitbases, not EGTBs such as Nalimov. If you use some kind of database with more information in it than just the GTV, all you're doing is wasting RAM and slowing down the engine. The Nalimov storage format also does not seem optimized for high-speed probing: you have to read entire compressed blocks from disk (which takes time even if it's an SSD), decompress most or all of the block, etc.

Since (1) is the main thing you need them for, bitbases can be designed for fast probing, fast decompression, etc. WDL data probably has a lot less entropy than DTM/DTC/DTZ50 values, so clever compressed representations are probably possible.

Even (2) can probably be done quite well without the exact DTM, DTC, or DTZ50 values. Darkthought had heuristics for all 4-man endings and didn't have any trouble winning those. In principle those could be extended to 5- and 6-man. The bitbase prevents it from making mistakes that don't conserve GTV; all you need the heuristic for is to make enough progress to win before the 50-move rule hits you. Of course the GTV values in the bitbases should be based on DTZ50 databases, so if an optimal opponent could delay the outcome past 50 moves, the bitbase should score it as a draw.

I have wondered for a long time whether bitbases could be stored in RAM in a compressed format that still supports random access. Then they wouldn't take as much RAM to cache, and when reading pages of them from disk, you wouldn't have to decompress the pages before satisfying probes from them. If you were willing to be slightly lossy (i.e. return "don't know" for some small fraction of positions, rather than a correct GTV), that might be useful too.
Since we probe Nalimov compressed files randomly, on disk, I don't see why the same idea would not work for bitbases, and I thought that was already done. The only issue is the computational cost of the decompression. If you take a typical EGTB that has only 1-byte values, you should be able to compress a 1.5-bit value (w/l/d) and get maybe 5 of those in a byte. So, doing some ugly rounding, our 7.5 GB of 3-4-5-piece Nalimov tables turns into maybe 1.5 GB of bitbase data? In theory this might compress better, I am not sure. 1 GB is certainly doable. And I was thinking that current bitbase implementations were already doing this, but I am not sure.

Certainly one can easily derive a full set of bitbase values from the Nalimov files, rather than having to generate them...
Miguel's EGTB files have several modes of operation:

tb_probe_hard
tb_probe_soft
tb_probe_WDL_hard
tb_probe_WDL_soft

The bottom two modes are bitbase modes.
The ones with 'hard' in the name will touch the disk for the answer if it is not in memory. The ones with 'soft' in the name only read from the cache.

They offer a number of different compression algorithms, and the license is compatible with just about any sort of project.
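The hard/soft distinction might be sketched like this (hypothetical names, with a dict standing in for the tablebase files; this is not Miguel's actual API):

```python
class ProbeCache:
    """Sketch of 'hard' vs 'soft' probing semantics.

    soft: answer only from cache (never blocks on I/O, so it is safe
          to call frequently deep in the search);
    hard: fall through to the (slow) disk read on a cache miss.
    """
    def __init__(self, disk):
        self.disk = disk        # stand-in for the tablebase files on disk
        self.cache = {}

    def probe_soft(self, pos):
        return self.cache.get(pos)            # None means "don't know"

    def probe_hard(self, pos):
        if pos not in self.cache:
            self.cache[pos] = self.disk[pos]  # simulated disk read
        return self.cache[pos]

tb = ProbeCache({"KQK-w": "win"})
assert tb.probe_soft("KQK-w") is None    # not cached yet: soft gives up
assert tb.probe_hard("KQK-w") == "win"   # hard reads "disk", fills cache
assert tb.probe_soft("KQK-w") == "win"   # now the soft probe succeeds
```

An engine could probe hard near the root (where I/O is amortized) and soft elsewhere.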
Dann Corbit
Posts: 12803
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: EGTB value

Post by Dann Corbit »

wgarvin wrote:
I don't think it matters. Bitbases might be a different issue, since they are memory-resident and don't add the msecs-per-read I/O penalty. But for any other type of "tablebases" I think this test will be pretty definitive, whether we like the answer or not. I might then try bitbases to see if there is any gain from using them in the search, and perhaps EGTBs at the first ply or two only to make progress...
Engines can use endgame databases for two purposes:
(1) Cut off branches in the search with a known GTV (game-theoretic value: win, loss, or draw). This involves a LOT of probes, and all you care about is win, loss, or draw.
(2) Play out the rest of the game once a database position is reached over the board. This could use DTM/DTC/DTZ50 if you have them, but you only do a few dozen probes per board move, so it's okay for it to be pretty slow.
(3) Produce analysis for perfect play in analyze mode. For most end users, this is the most important usage.

The effects of probing can vary greatly as a function of how the files are used. For instance, the cost is negligible if the strategy is to probe only PV nodes. We could consider this "insurance" mode.
(2) is not really interesting because the game is already decided at that point. The interesting case is (1): steering the search towards known-won positions, away from known-lost ones, and either towards or away from known draws depending on how we think we're doing.

Obviously, (1) only requires bitbases, not EGTBs such as Nalimov. If you use some kind of database with more information in it than just the GTV, all you're doing is wasting RAM and slowing down the engine. The Nalimov storage format also does not seem optimized for high-speed probing: you have to read entire compressed blocks from disk (which takes time even if it's an SSD), decompress most or all of the block, etc.

Since (1) is the main thing you need them for, bitbases can be designed for fast probing, fast decompression, etc. WDL data probably has a lot less entropy than DTM/DTC/DTZ50 values, so clever compressed representations are probably possible.

Even (2) can probably be done quite well without the exact DTM, DTC, or DTZ50 values. Darkthought had heuristics for all 4-man endings and didn't have any trouble winning those. In principle those could be extended to 5- and 6-man. The bitbase prevents it from making mistakes that don't conserve GTV; all you need the heuristic for is to make enough progress to win before the 50-move rule hits you. Of course the GTV values in the bitbases should be based on DTZ50 databases, so if an optimal opponent could delay the outcome past 50 moves, the bitbase should score it as a draw.

I have wondered for a long time whether bitbases could be stored in RAM in a compressed format that still supports random access. Then they wouldn't take as much RAM to cache, and when reading pages of them from disk, you wouldn't have to decompress the pages before satisfying probes from them. If you were willing to be slightly lossy (i.e. return "don't know" for some small fraction of positions, rather than a correct GTV), that might be useful too.
bob
Posts: 20943
Joined: Mon Feb 27, 2006 7:30 pm
Location: Birmingham, AL

Re: EGTB value

Post by bob »

Dann Corbit wrote:
bob wrote:
wgarvin wrote:
I don't think it matters. Bitbases might be a different issue, since they are memory-resident and don't add the msecs-per-read I/O penalty. But for any other type of "tablebases" I think this test will be pretty definitive, whether we like the answer or not. I might then try bitbases to see if there is any gain from using them in the search, and perhaps EGTBs at the first ply or two only to make progress...
Engines can use endgame databases for two purposes:
(1) Cut off branches in the search with a known GTV (game-theoretic value: win, loss, or draw). This involves a LOT of probes, and all you care about is win, loss, or draw.
(2) Play out the rest of the game once a database position is reached over the board. This could use DTM/DTC/DTZ50 if you have them, but you only do a few dozen probes per board move, so it's okay for it to be pretty slow.

(2) is not really interesting because the game is already decided at that point. The interesting case is (1): steering the search towards known-won positions, away from known-lost ones, and either towards or away from known draws depending on how we think we're doing.

Obviously, (1) only requires bitbases, not EGTBs such as Nalimov. If you use some kind of database with more information in it than just the GTV, all you're doing is wasting RAM and slowing down the engine. The Nalimov storage format also does not seem optimized for high-speed probing: you have to read entire compressed blocks from disk (which takes time even if it's an SSD), decompress most or all of the block, etc.

Since (1) is the main thing you need them for, bitbases can be designed for fast probing, fast decompression, etc. WDL data probably has a lot less entropy than DTM/DTC/DTZ50 values, so clever compressed representations are probably possible.

Even (2) can probably be done quite well without the exact DTM, DTC, or DTZ50 values. Darkthought had heuristics for all 4-man endings and didn't have any trouble winning those. In principle those could be extended to 5- and 6-man. The bitbase prevents it from making mistakes that don't conserve GTV; all you need the heuristic for is to make enough progress to win before the 50-move rule hits you. Of course the GTV values in the bitbases should be based on DTZ50 databases, so if an optimal opponent could delay the outcome past 50 moves, the bitbase should score it as a draw.

I have wondered for a long time whether bitbases could be stored in RAM in a compressed format that still supports random access. Then they wouldn't take as much RAM to cache, and when reading pages of them from disk, you wouldn't have to decompress the pages before satisfying probes from them. If you were willing to be slightly lossy (i.e. return "don't know" for some small fraction of positions, rather than a correct GTV), that might be useful too.
Since we probe Nalimov compressed files randomly, on disk, I don't see why the same idea would not work for bitbases, and I thought that was already done. The only issue is the computational cost of the decompression. If you take a typical EGTB that has only 1-byte values, you should be able to compress a 1.5-bit value (w/l/d) and get maybe 5 of those in a byte. So, doing some ugly rounding, our 7.5 GB of 3-4-5-piece Nalimov tables turns into maybe 1.5 GB of bitbase data? In theory this might compress better, I am not sure. 1 GB is certainly doable. And I was thinking that current bitbase implementations were already doing this, but I am not sure.

Certainly one can easily derive a full set of bitbase values from the Nalimov files, rather than having to generate them...
Miguel's EGTB files have several modes of operation:

tb_probe_hard
tb_probe_soft
tb_probe_WDL_hard
tb_probe_WDL_soft

The bottom two modes are bitbase modes.
The ones with 'hard' in the name will touch the disk for the answer if it is not in memory. The ones with 'soft' in the name only read from the cache.

They offer a number of different compression algorithms, and the license is compatible with just about any sort of project.
The hard/soft idea is interesting. I am already seeing a difference in Elo depending on how aggressively I probe. I tuned this years ago for 15K SCSI drives, and eventually used RAID-0 for major bandwidth. It appears that reducing the probe depth by half is a positive step, but I am waiting on games. I am currently running 1m+1s games, using no EGTBs, EGTBs probed at iteration_depth / 2 or less, and EGTBs probed at iteration_depth or less (this is the default behaviour). Once those finish, I might try something less than 1/2, and maybe 3/4, to see where the optimal solution lies. Then I am going to run even longer games and see if that changes things or not.
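The depth gating described above reduces to a one-line predicate (a sketch; the names and the exact comparison are stand-ins, not Crafty's actual code):

```python
def should_probe(ply, iteration_depth, fraction=1.0):
    """Gate EGTB probes by depth: probe only at plies close to the root,
    where each probe's I/O cost is amortized over a large subtree.

    fraction=1.0 ~ "probed at iteration_depth or less" (the default);
    fraction=0.5 ~ "probed at iteration_depth / 2 or less".
    """
    return ply <= iteration_depth * fraction

assert should_probe(10, 20, 0.5)        # shallow node: probe
assert not should_probe(11, 20, 0.5)    # too deep: skip the probe
```

Sweeping `fraction` over e.g. 0.25, 0.5, 0.75, 1.0 in self-play is exactly the experiment being run here.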
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: EGTB value

Post by wgarvin »

bob wrote:Since we probe Nalimov compressed files randomly, on disks, I don't see why the same idea would not work for bitbases, and thought that was already done. Only issue is the computational cost of the decompression. If you take a typical EGTB that has only 1 byte values, you should be able to compress a 1.5 bit value (w/l/d) and get maybe 5 of those in a byte. So doing some ugly rounding, our 7.5 gigs of 3-4-5 piece Nalimov tables turns into maybe 1.5g of bitbase data? In theory this might compress better, I am not sure. 1gb is certainly doable. And I was thinking that current bitbase implementations were already doing this, but I am not sure.

Certainly one can easily derive a full set of bitbase values from the Nalimov files, rather than having to generate them...
You could start with the Nalimov 5-men and derive bitbases from them; I don't think there are any problems with the 50-move rule in those. If I were doing it myself I'd want to write my own DTZ50 generator first (which is probably why I've been thinking about it for at least 10 years and still haven't done it...). Once you have the bitbases, the interesting research project would be to figure out how to represent them more cheaply.

By "less entropy", I meant that the DTM / DTC / DTZ50 databases consist of entries like "Win in 21", "Win in 17", "Win in 41", "Win in 20", "Win in 19", "Draw", "Draw", "Lose in 4", "Win in 36", "Draw" ....

And the matching bitbase would contain: W,W,W,W,W,D,D,L,W,D, ...
So it's probably easier to compress.
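A toy demonstration of that entropy difference (synthetic data; real distance-to-mate distributions differ, but the effect is the same):

```python
import random
import zlib

random.seed(1)
n = 100_000

# DTM-style: one byte per position, with many distinct "win in k" values
# (0 = loss here, 255 = draw, 1..59 = win in k moves).
dtm = bytes(random.choice([0, 255] + list(range(1, 60))) for _ in range(n))

# WDL-style: the same positions reduced to just three symbols.
wdl = bytes(0 if b == 0 else (1 if b == 255 else 2) for b in dtm)

# The three-symbol stream deflates to a small fraction of the DTM stream.
print(len(zlib.compress(dtm, 9)), len(zlib.compress(wdl, 9)))
```

Since most positions in this toy data are wins, the WDL stream is nearly constant and compresses dramatically better than the many-valued DTM stream.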

Especially if you can get that 1.5 bits down to 1 bit for some of them: "Win" and "Everything else", then use some smaller secondary lookup to figure out the everything else. Secondary lookups on disk are too slow, but if the whole thing fits in RAM it would be fine.
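That two-level scheme might be sketched as follows (an illustration of the idea, not an existing format): one bit per position for "win vs. everything else", plus a sparse RAM-resident table that disambiguates the rest.

```python
class TwoLevelBitbase:
    """1 bit per position for 'win vs everything else', plus a sparse
    secondary table (kept in RAM) distinguishing draw from loss."""
    def __init__(self, gtv):                 # gtv[i] in {'W', 'D', 'L'}
        self.wins = bytearray((len(gtv) + 7) // 8)
        self.other = {}                      # only losses stored; absent = draw
        for i, v in enumerate(gtv):
            if v == 'W':
                self.wins[i // 8] |= 1 << (i % 8)
            elif v == 'L':
                self.other[i] = 'L'

    def probe(self, i):
        if self.wins[i // 8] >> (i % 8) & 1:
            return 'W'
        return self.other.get(i, 'D')

bb = TwoLevelBitbase(list("WWDLW"))
assert [bb.probe(i) for i in range(5)] == ['W', 'W', 'D', 'L', 'W']
```

If losses are rare for the side to move (common in many endings), the secondary table stays tiny and the primary bitvector dominates the footprint.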

Darkthought had 4-man bitbases in about 15 MB, *uncompressed*. Of course a full set of 5-man bitbases would be much bigger, but if you had a good compressed representation supporting random access, setting aside a big chunk of RAM to cache them (like 500+ MB) might be reasonable.
Dann Corbit
Posts: 12803
Joined: Wed Mar 08, 2006 8:57 pm
Location: Redmond, WA USA

Re: EGTB value

Post by Dann Corbit »

wgarvin wrote:
bob wrote: Since we probe Nalimov compressed files randomly, on disk, I don't see why the same idea would not work for bitbases, and I thought that was already done. The only issue is the computational cost of the decompression. If you take a typical EGTB that has only 1-byte values, you should be able to compress a 1.5-bit value (w/l/d) and get maybe 5 of those in a byte. So, doing some ugly rounding, our 7.5 GB of 3-4-5-piece Nalimov tables turns into maybe 1.5 GB of bitbase data? In theory this might compress better, I am not sure. 1 GB is certainly doable. And I was thinking that current bitbase implementations were already doing this, but I am not sure.

Certainly one can easily derive a full set of bitbase values from the Nalimov files, rather than having to generate them...
You could start with the Nalimov 5-men and derive bitbases from them; I don't think there are any problems with the 50-move rule in those. If I were doing it myself I'd want to write my own DTZ50 generator first (which is probably why I've been thinking about it for at least 10 years and still haven't done it...). Once you have the bitbases, the interesting research project would be to figure out how to represent them more cheaply.

By "less entropy", I meant that the DTM / DTC / DTZ50 databases consist of entries like "Win in 21", "Win in 17", "Win in 41", "Win in 20", "Win in 19", "Draw", "Draw", "Lose in 4", "Win in 36", "Draw" ....

And the matching bitbase would contain: W,W,W,W,W,D,D,L,W,D, ...
So it's probably easier to compress.

Especially if you can get that 1.5 bits down to 1 bit for some of them: "Win" and "Everything else", then use some smaller secondary lookup to figure out the everything else. Secondary lookups on disk are too slow, but if the whole thing fits in RAM it would be fine.

Darkthought had 4-man bitbases in about 15 MB, *uncompressed*. Of course a full set of 5-man bitbases would be much bigger, but if you had a good compressed representation supporting random access, setting aside a big chunk of RAM to cache them (like 500+ MB) might be reasonable.
Here is the size of the scorpio bitbase files:

Code:

 Directory of c:\chess\winboard\egbb2
02/22/2009  09:23 AM           384,933 kbbbk.w.cmp
02/21/2009  09:41 PM            35,729 kbbk.b.cmp
02/21/2009  09:41 PM            39,471 kbbk.w.cmp
02/22/2009  03:36 AM           520,656 kbbkb.b.cmp
02/22/2009  03:36 AM           725,329 kbbkn.w.cmp
02/22/2009  03:39 AM         8,189,490 kbbkp.w.cmp
02/22/2009  03:29 AM         1,101,230 kbbkq.b.cmp
02/22/2009  03:33 AM         1,051,765 kbbkr.b.cmp
02/22/2009  09:24 AM            32,027 kbbnk.w.cmp
02/22/2009  09:26 AM           801,348 kbbpk.w.cmp
01/22/2009  11:45 AM               391 kbk.b.cmp
01/22/2009  11:45 AM               391 kbk.w.cmp
02/21/2009  09:41 PM             1,254 kbkb.b.cmp
02/21/2009  09:41 PM             1,254 kbkb.w.cmp
02/21/2009  09:41 PM             1,141 kbkn.b.cmp
02/21/2009  09:41 PM             1,140 kbkn.w.cmp
02/21/2009  09:41 PM           180,964 kbkp.b.cmp
02/21/2009  09:41 PM            65,217 kbkp.w.cmp
02/21/2009  09:41 PM            37,149 kbnk.b.cmp
02/21/2009  09:41 PM             6,566 kbnk.w.cmp
02/22/2009  04:02 AM           718,773 kbnkb.b.cmp
02/22/2009  04:08 AM         1,695,359 kbnkn.b.cmp
02/22/2009  04:11 AM         5,699,046 kbnkp.w.cmp
02/22/2009  03:51 AM         1,238,469 kbnkq.b.cmp
02/22/2009  03:56 AM         2,416,607 kbnkr.b.cmp
02/22/2009  09:32 AM            26,770 kbnnk.w.cmp
02/22/2009  09:35 AM           150,330 kbnpk.w.cmp
02/21/2009  09:41 PM            95,928 kbpk.b.cmp
02/21/2009  09:41 PM            38,751 kbpk.w.cmp
02/22/2009  05:08 AM         8,137,342 kbpkb.b.cmp
02/22/2009  05:22 AM        12,537,057 kbpkn.b.cmp
02/22/2009  05:25 AM         7,929,525 kbpkp.w.cmp
02/22/2009  04:38 AM         3,521,817 kbpkq.b.cmp
02/22/2009  04:50 AM        15,785,475 kbpkr.w.cmp
02/22/2009  09:48 AM           150,539 kbppk.w.cmp
01/22/2009  11:45 AM               391 knk.b.cmp
01/22/2009  11:45 AM               391 knk.w.cmp
02/21/2009  09:41 PM             1,139 knkn.b.cmp
02/21/2009  09:41 PM             1,139 knkn.w.cmp
02/21/2009  09:42 PM           178,693 knkp.b.cmp
02/21/2009  09:42 PM            84,244 knkp.w.cmp
02/21/2009  09:41 PM               815 knnk.b.cmp
02/21/2009  09:41 PM               894 knnk.w.cmp
02/22/2009  05:43 AM            26,895 knnkb.b.cmp
02/22/2009  05:46 AM            28,929 knnkn.b.cmp
02/22/2009  05:48 AM         6,671,654 knnkp.w.cmp
02/22/2009  05:37 AM         1,426,066 knnkq.b.cmp
02/22/2009  05:38 AM           169,641 knnkr.w.cmp
02/22/2009  09:49 AM            77,344 knnnk.w.cmp
02/22/2009  09:51 AM           663,535 knnpk.w.cmp
02/21/2009  09:41 PM            99,578 knpk.b.cmp
02/21/2009  09:41 PM            39,746 knpk.w.cmp
02/22/2009  06:38 AM         6,339,448 knpkb.b.cmp
02/22/2009  06:53 AM         7,037,393 knpkn.b.cmp
02/22/2009  06:55 AM         9,107,016 knpkp.w.cmp
02/22/2009  06:08 AM         3,517,299 knpkq.b.cmp
02/22/2009  06:21 AM        15,551,780 knpkr.w.cmp
02/22/2009  09:58 AM            77,735 knppk.w.cmp
01/22/2009  11:45 AM             3,913 kpk.b.cmp
01/22/2009  11:45 AM             3,250 kpk.w.cmp
02/21/2009  09:42 PM           328,240 kpkp.b.cmp
02/21/2009  09:42 PM           315,654 kpkp.w.cmp
02/21/2009  09:41 PM            42,767 kppk.b.cmp
02/21/2009  09:41 PM            14,179 kppk.w.cmp
02/22/2009  07:22 AM         5,172,532 kppkb.b.cmp
02/22/2009  07:27 AM         5,249,446 kppkn.b.cmp
02/22/2009  07:28 AM         5,190,570 kppkp.w.cmp
02/22/2009  07:11 AM           573,398 kppkq.b.cmp
02/22/2009  07:17 AM         3,975,463 kppkr.b.cmp
02/22/2009  10:00 AM            37,031 kpppk.w.cmp
02/22/2009  08:04 AM            22,779 kqbbk.w.cmp
02/21/2009  09:40 PM            16,642 kqbk.b.cmp
01/22/2009  11:45 AM             1,132 kqbk.w.cmp
02/21/2009  10:38 PM            99,087 kqbkb.w.cmp
02/21/2009  10:43 PM           256,202 kqbkn.w.cmp
02/21/2009  10:49 PM           168,557 kqbkp.w.cmp
02/21/2009  10:32 PM         3,762,695 kqbkq.w.cmp
02/21/2009  10:33 PM           193,800 kqbkr.w.cmp
02/22/2009  08:07 AM            45,179 kqbnk.w.cmp
02/22/2009  08:11 AM           135,939 kqbpk.w.cmp
01/22/2009  11:45 AM               735 kqk.b.cmp
01/22/2009  11:45 AM               391 kqk.w.cmp
02/21/2009  09:41 PM           112,295 kqkb.b.cmp
02/21/2009  09:41 PM             7,192 kqkb.w.cmp
02/21/2009  09:41 PM            73,325 kqkn.b.cmp
02/21/2009  09:41 PM            10,989 kqkn.w.cmp
02/21/2009  09:41 PM           220,343 kqkp.b.cmp
02/21/2009  09:41 PM             6,629 kqkp.w.cmp
02/21/2009  09:41 PM           122,984 kqkq.b.cmp
02/21/2009  09:41 PM           122,984 kqkq.w.cmp
02/21/2009  09:41 PM           118,139 kqkr.b.cmp
02/21/2009  09:41 PM            12,975 kqkr.w.cmp
02/21/2009  09:40 PM            11,728 kqnk.b.cmp
02/21/2009  09:40 PM             1,132 kqnk.w.cmp
02/21/2009  11:11 PM            95,454 kqnkb.w.cmp
02/21/2009  11:16 PM           129,119 kqnkn.w.cmp
02/21/2009  11:22 PM           212,520 kqnkp.w.cmp
02/21/2009  11:05 PM         3,452,925 kqnkq.w.cmp
02/21/2009  11:06 PM           169,403 kqnkr.w.cmp
02/22/2009  08:21 AM            22,779 kqnnk.w.cmp
02/22/2009  08:24 AM           135,948 kqnpk.w.cmp
02/21/2009  09:40 PM            46,589 kqpk.b.cmp
02/21/2009  09:40 PM             2,642 kqpk.w.cmp
02/22/2009  12:04 AM           397,642 kqpkb.w.cmp
02/22/2009  12:19 AM           848,693 kqpkn.w.cmp
02/22/2009  12:32 AM           120,737 kqpkp.w.cmp
02/21/2009  11:36 PM        17,639,749 kqpkq.w.cmp
02/21/2009  11:50 PM           732,516 kqpkr.w.cmp
02/22/2009  08:34 AM            52,620 kqppk.w.cmp
02/22/2009  07:36 AM            22,778 kqqbk.w.cmp
01/22/2009  11:45 AM             9,488 kqqk.b.cmp
01/22/2009  11:45 AM               758 kqqk.w.cmp
02/21/2009  09:47 PM            22,778 kqqkb.w.cmp
02/21/2009  09:49 PM            22,806 kqqkn.w.cmp
02/21/2009  09:52 PM            76,013 kqqkp.w.cmp
02/21/2009  09:42 PM           267,920 kqqkq.w.cmp
02/21/2009  09:45 PM            30,151 kqqkr.w.cmp
02/22/2009  07:37 AM            22,778 kqqnk.w.cmp
02/22/2009  07:40 AM            68,163 kqqpk.w.cmp
02/22/2009  07:33 AM             7,852 kqqqk.w.cmp
02/22/2009  07:34 AM            22,779 kqqrk.w.cmp
02/22/2009  07:47 AM            45,175 kqrbk.w.cmp
01/22/2009  11:45 AM             7,652 kqrk.b.cmp
01/22/2009  11:45 AM             1,132 kqrk.w.cmp
02/21/2009  10:07 PM            68,768 kqrkb.w.cmp
02/21/2009  10:12 PM           102,077 kqrkn.w.cmp
02/21/2009  10:18 PM           148,326 kqrkp.w.cmp
02/21/2009  09:58 PM         1,190,048 kqrkq.w.cmp
02/21/2009  10:02 PM            99,982 kqrkr.w.cmp
02/22/2009  07:50 AM            45,175 kqrnk.w.cmp
02/22/2009  07:55 AM           135,943 kqrpk.w.cmp
02/22/2009  07:45 AM            22,780 kqrrk.w.cmp
02/22/2009  08:48 AM            24,746 krbbk.w.cmp
02/21/2009  09:40 PM            13,116 krbk.b.cmp
02/21/2009  09:40 PM             1,132 krbk.w.cmp
02/22/2009  01:10 AM           571,970 krbkb.w.cmp
02/22/2009  01:16 AM           602,656 krbkn.w.cmp
02/22/2009  01:22 AM           892,217 krbkp.w.cmp
02/22/2009  01:03 AM         7,480,706 krbkq.w.cmp
02/22/2009  01:09 AM         2,159,834 krbkr.b.cmp
02/22/2009  08:50 AM            45,192 krbnk.w.cmp
02/22/2009  08:55 AM           146,327 krbpk.w.cmp
01/22/2009  11:45 AM               688 krk.b.cmp
01/22/2009  11:45 AM               391 krk.w.cmp
02/21/2009  09:41 PM            32,659 krkb.b.cmp
02/21/2009  09:41 PM           112,569 krkb.w.cmp
02/21/2009  09:41 PM            85,189 krkn.b.cmp
02/21/2009  09:41 PM           126,853 krkn.w.cmp
02/21/2009  09:41 PM           261,988 krkp.b.cmp
02/21/2009  09:41 PM            52,710 krkp.w.cmp
02/21/2009  09:41 PM            82,599 krkr.b.cmp
02/21/2009  09:41 PM            82,597 krkr.w.cmp
02/21/2009  09:41 PM             9,394 krnk.b.cmp
02/21/2009  09:40 PM             1,132 krnk.w.cmp
02/22/2009  01:46 AM           911,758 krnkb.w.cmp
02/22/2009  01:52 AM           312,715 krnkn.w.cmp
02/22/2009  01:58 AM         1,251,317 krnkp.w.cmp
02/22/2009  01:40 AM         6,039,740 krnkq.b.cmp
02/22/2009  01:45 AM         2,382,893 krnkr.b.cmp
02/22/2009  09:05 AM            24,484 krnnk.w.cmp
02/22/2009  09:09 AM           145,897 krnpk.w.cmp
02/21/2009  09:41 PM            41,821 krpk.b.cmp
02/21/2009  09:41 PM             2,642 krpk.w.cmp
02/22/2009  02:44 AM         4,951,103 krpkb.w.cmp
02/22/2009  03:01 AM         3,619,870 krpkn.w.cmp
02/22/2009  03:15 AM           904,264 krpkp.w.cmp
02/22/2009  02:25 AM         8,078,279 krpkq.b.cmp
02/22/2009  02:28 AM        11,752,631 krpkr.w.cmp
02/22/2009  09:19 AM            52,630 krppk.w.cmp
02/22/2009  08:39 AM            22,783 krrbk.w.cmp
02/21/2009  09:40 PM             2,008 krrk.b.cmp
02/21/2009  09:40 PM               758 krrk.w.cmp
02/22/2009  12:47 AM           259,961 krrkb.w.cmp
02/22/2009  12:50 AM           120,956 krrkn.w.cmp
02/22/2009  12:53 AM            76,760 krrkp.w.cmp
02/22/2009  12:42 AM         3,123,419 krrkq.w.cmp
02/22/2009  12:44 AM           158,063 krrkr.w.cmp
02/22/2009  08:41 AM            22,778 krrnk.w.cmp
02/22/2009  08:43 AM            73,031 krrpk.w.cmp
02/22/2009  08:38 AM             7,852 krrrk.w.cmp
             180 File(s)    234,479,030 bytes
[/code]
User avatar
michiguel
Posts: 6401
Joined: Thu Mar 09, 2006 8:30 pm
Location: Chicago, Illinois, USA

Re: EGTB value

Post by michiguel »

wgarvin wrote:
bob wrote:Since we probe Nalimov compressed files randomly, on disks, I don't see why the same idea would not work for bitbases, and thought that was already done. Only issue is the computational cost of the decompression. If you take a typical EGTB that has only 1 byte values, you should be able to compress a 1.5 bit value (w/l/d) and get maybe 5 of those in a byte. So doing some ugly rounding, our 7.5 gigs of 3-4-5 piece Nalimov tables turns into maybe 1.5g of bitbase data? In theory this might compress better, I am not sure. 1gb is certainly doable. And I was thinking that current bitbase implementations were already doing this, but I am not sure.

Certainly one can easily derive a full set of bitbase values from the Nalimov files, rather than having to generate them...
You could start with the Nalimov 5-men and derive bitbases from them; I don't think there are any problems with the 50-move rule in those. If I were doing it myself I'd want to write my own DTZ50 generator first (which is probably why I've been thinking about it for at least 10 years and still haven't done it...). Once you have the bitbases, the interesting research project would be to figure out how to represent them more cheaply.

By "less entropy", I meant that the DTM / DTC / DTZ50 databases consist of entries like "Win in 21", "Win in 17", "Win in 41", "Win in 20", "Win in 19", "Draw", "Draw", "Lose in 4", "Win in 36", "Draw" ....

And the matching bitbase would contain: W,W,W,W,W,D,D,L,W,D, ...
So it's probably easier to compress.
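The entropy argument can be made concrete with a toy calculation (a sketch on made-up values, not measured on real tablebases): the DTM stream has many distinct "Win in N" symbols, while the matching W/D/L stream has at most three, so an entropy coder needs fewer bits per entry for the latter.

```python
import math
from collections import Counter

def entropy_bits(symbols):
    """Shannon entropy, in bits per symbol, of a sequence."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# The example entries from above, first as DTM-style symbols,
# then reduced to their W/D/L prefix.
dtm = ["W21", "W17", "W41", "W20", "W19", "D", "D", "L4", "W36", "D"]
wdl = [s[0] for s in dtm]
# entropy_bits(dtm) > entropy_bits(wdl): the W/D/L stream is
# bounded by log2(3) ~ 1.58 bits per entry no matter what.
```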

Especially if you can get that 1.5 bits down to 1 bit for some of them: "Win" versus "everything else", then use some smaller secondary lookup to resolve the everything-else cases. Secondary lookups on disk are too slow, but if the whole thing fits in RAM it would be fine.
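One way to realize that two-level idea: a primary bit per position ("win" or not), plus a secondary bit array holding draw/loss only for the non-win positions, indexed by a rank query over the primary. A sketch under those assumptions (the linear rank scan would be replaced by precomputed rank blocks in a real implementation):

```python
def build_two_level(wdl):
    """wdl: list of 'W'/'D'/'L' values, one per position."""
    # Primary: 1 bit per position, set iff the position is a win.
    primary = [1 if v == "W" else 0 for v in wdl]
    # Secondary: 1 bit per NON-win position (draw=1, loss=0).
    secondary = [1 if v == "D" else 0 for v in wdl if v != "W"]
    return primary, secondary

def probe(primary, secondary, i):
    if primary[i]:
        return "W"
    # Rank query: how many non-win positions precede index i.
    # (Linear here for clarity; real succinct bitmaps make this O(1).)
    r = i - sum(primary[:i])
    return "D" if secondary[r] else "L"
```

If wins dominate a table, the secondary array is small, which is exactly the win the post is after.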

Darkthought had 4-man bitbases in about 15 MB, *uncompressed*. Of course a full set of 5-man bitbases would be much bigger, but if you had a good compressed representation supporting random access, setting aside a big chunk of RAM to cache them (500+ MB, say) might be reasonable.
You do not want to compress bitbases in memory, because it takes time to decompress a block just to pick one entry. And you do not want to have them all in memory, because that memory is more useful as hash table. IMHO, the best solution is to have them uncompressed in an efficient cache. It is also a waste to have both bitbases and tablebases; but if you cache the EGTB info as bitbase information, you accomplish the goals of both formats. In addition, having a common interface allows you to solve some positions with just heuristics, saving a huge amount of HD access time. This is the philosophy of the Gaviota format.
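A minimal sketch of such a cache — uncompressed W/D/L values kept in a fixed-size, direct-mapped table, with the on-disk format probed only on a miss (names and sizes are illustrative, not the Gaviota design):

```python
class ProbeCache:
    """Direct-mapped cache of W/D/L probe results: whatever the disk
    format stores (DTM, DTC, ...), the cache holds only the value the
    search actually needs."""

    def __init__(self, size=1 << 16):
        self.size = size
        self.keys = [None] * size
        self.vals = [None] * size

    def get(self, pos_index):
        slot = pos_index % self.size
        if self.keys[slot] == pos_index:
            return self.vals[slot]
        return None  # miss: caller probes the tablebase, then store()s

    def store(self, pos_index, wdl):
        # Direct-mapped: a colliding entry is simply overwritten.
        slot = pos_index % self.size
        self.keys[slot] = pos_index
        self.vals[slot] = wdl
```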

Miguel
wgarvin
Posts: 838
Joined: Thu Jul 05, 2007 5:03 pm
Location: British Columbia, Canada

Re: EGTB value

Post by wgarvin »

michiguel wrote:You do not want to compress bitbases in memory, because it takes time to decompress a block just to pick one entry. And you do not want to have them all in memory, because that memory is more useful as hash table. IMHO, the best solution is to have them uncompressed in an efficient cache. It is also a waste to have both bitbases and tablebases; but if you cache the EGTB info as bitbase information, you accomplish the goals of both formats. In addition, having a common interface allows you to solve some positions with just heuristics, saving a huge amount of HD access time. This is the philosophy of the Gaviota format.
I suppose. By "compressed representation supporting random access", I meant to exclude block-based compression schemes. There are ways of packing sparse arrays of values so that they still support efficient random access; maybe something like that could work with bitbases, if you can make a cheap enough predictor that turns most of the bits to zero. Bloom filters might also be interesting (but I guess large ones would be expensive to probe). I don't think anyone has really tried hard to compress bitbases using anything but block based LZ schemes. It might turn out that compressing whole blocks is the only way to make them decently small. But even then, if you could get away with much smaller block sizes without ruining the compression ratio, then you could load the bitbases in memory in their compressed form, use a smaller cache for uncompressed blocks, and still afford to access them even out near the tips of the search. There's probably some big reason why it wouldn't work, but I can't think of one at the moment.
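As an illustration of the Bloom-filter idea — e.g. a filter over the set of drawn positions answers "definitely not in the set" or "possibly in the set", with no false negatives — here is a minimal sketch (parameters are arbitrary; a real probe path would use a cheaper hash than SHA-256):

```python
import hashlib

class BloomFilter:
    def __init__(self, bits=1 << 20, hashes=4):
        self.bits = bits
        self.hashes = hashes
        self.bitmap = bytearray(bits // 8)

    def _positions(self, key):
        # Derive k bit positions via double hashing of one digest.
        h = hashlib.sha256(str(key).encode()).digest()
        a = int.from_bytes(h[:8], "little")
        b = int.from_bytes(h[8:16], "little")
        return [(a + i * b) % self.bits for i in range(self.hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bitmap[p // 8] |= 1 << (p % 8)

    def contains(self, key):
        # Never a false negative; false-positive rate is set by the
        # bits-per-key and hash-count parameters.
        return all(self.bitmap[p // 8] >> (p % 8) & 1
                   for p in self._positions(key))
```

The probing cost mentioned above is visible here: each query touches `hashes` scattered bits, so a large filter means several cache misses per probe.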
Dirt
Posts: 2851
Joined: Wed Mar 08, 2006 10:01 pm
Location: Irvine, CA, USA

Re: EGTB value

Post by Dirt »

bob wrote:...my intent is going to be to remove all the EGTB code since it is not providing any Elo boost, unless I happen to find some setting/limit that does improve results.
I just don't believe that only probing at root can do anything but help. Maybe you should start at that end.