I wonder if it is still slower when you decide to probe tablebases only when the remaining depth is at least 20.

bob wrote:
The cache is much less significant when you factor in the 6-piece files that many are using. Suddenly the cache becomes almost useless due to the 2gb file sizes, and many files have to be broken into many 2gb chunks to work on 32-bit systems.

michiguel wrote:
Your 5ms figure assumes no cache. But if you have a cache that makes you read from disk only 5% of the time, plus a fast decompressing scheme that is effectively faster than reading uncompressed data, you will end up with an average figure of 0.3 ms (and we are not even talking about SSD). That means that if you probe at a node where the remaining search is deep enough to cost only 10k nodes, the effect on speed will be negligible. In other words, you can search "relatively" close to the leaves with no time-to-depth cost (in fact it should improve, because of the pruning performed at nodes closer to the root). What is the depth you can reach with 10k nodes? That is "approximately" the distance to the leaves you can _safely_ afford.

bob wrote:
Simple math. On my 8-core box, Crafty searches about 20M nodes per second, up to 30M in the endgame. On a good disk, you can do a read every 5ms, or about 200 reads per second. Compare the speeds. During the time I can do a single read, I can search 150K nodes. That is huge. For every 200 I/O accesses, I could search another 30M nodes. The cost to check for probes is roughly zero, since in the opening and middlegame that branch gets predicted 100% correctly. But when you get down to 12-16 pieces total, you begin to see EGTB probes. Each successful probe costs about 5ms if a disk access is required, actually quite a bit more, since you read in fairly large compressed blocks and then have to spend time uncompressing a block before the probe can be completed...

Uri Blass wrote:
I do not understand why probing in the search has to slow things down (assuming that you do not probe at every node in the search, but only when the remaining depth is big enough that the time of searching the remaining depth is bigger than the time of probing the tablebases).

bob wrote:
There are several issues involved.

rbarreira wrote:
I haven't looked at this in detail, but it looks very weird to me that we have databases of perfect moves and we can't make use of them to improve play. I realize that if the tablebases are used during the search that would slow down the search, but as someone else said earlier, it should be good to at least use them at the root (i.e. if the current game position is in the tablebase, play the move suggested there).
(1) probing in the search slows things down. 10 years ago, this was not so noticeable. Today, with an effective branching factor of way less than 2.0 in endgames, it can be a huge loss.
I think that the first test, before playing games, should be whether you get the same depth faster in endgame analysis.
If you get the same depth faster but still do not earn rating points based on games, then it is interesting (and maybe the reason is that you are slightly slower in opening and middlegame positions when you do not probe tablebases, because checking that you do not need to probe tablebases after a capture costs time).
Either you rarely probe, which doesn't help much, or you probe a lot, which starts to cost multiple plies.
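The gating rule being debated in this thread can be sketched in a few lines. This is a hypothetical illustration, not Crafty's actual code; the names, the 5-piece limit, and the depth-20 threshold are just the numbers floated above.

```python
# Hypothetical sketch of depth-gated EGTB probing: probe only when the
# remaining search depth is large enough that searching the subtree would
# cost more than the probe itself. Names and thresholds are illustrative.

PIECE_LIMIT = 5            # only 3-4-5 piece tables are assumed loaded
MIN_REMAINING_DEPTH = 20   # the "at least 20" threshold suggested above

def should_probe(piece_count: int, remaining_depth: int) -> bool:
    """Return True if a tablebase probe is likely to pay for itself."""
    if piece_count > PIECE_LIMIT:
        return False       # no tablebase exists for this position
    return remaining_depth >= MIN_REMAINING_DEPTH
```

The tension bob describes falls out of the threshold choice: set MIN_REMAINING_DEPTH high and you rarely probe; set it low and probes land near the leaves, where they cannot pay for themselves.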
Of course, these numbers may not apply to Nalimov, but they do apply to the Gaviota TBs.
Miguel
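Miguel's 0.3 ms figure and the 10k-node break-even point follow from simple weighted-average arithmetic. The cached-probe cost below is an assumed value (the thread does not give one), chosen only to reproduce the quoted average.

```python
# Back-of-the-envelope check of the cache argument: a 5 ms disk read,
# a 95% cache hit rate, and an ASSUMED ~0.05 ms cost for a cached probe.
# Search speed is bob's 30M nodes/sec endgame figure.

disk_ms = 5.0
hit_rate = 0.95
cached_ms = 0.05           # assumption: cost of a cache hit, in ms
nps = 30_000_000           # nodes per second in the endgame

avg_probe_ms = (1 - hit_rate) * disk_ms + hit_rate * cached_ms
# roughly 0.3 ms, matching Miguel's figure

nodes_per_probe = avg_probe_ms / 1000 * nps
# roughly 9,000 nodes: about the "10k nodes" break-even point
```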
My observations are exactly that, observations. Here is one example, using just one CPU to avoid any SMP issues. This is Fine #70, not a particularly good EGTB position, since it takes a while to reach a position with just 5 pieces left (I am only using 3/4/5-piece files for this test). Both versions find the correct move, Kb1, at depth 24 in the same amount of time. But as the search progresses... First, normal Crafty with no EGTBs, then the same depth using EGTBs. 20x _slower_ with EGTBs.

Code: Select all
37-> 0.16 4.27 1. Kb1 Kb7 2. Kc1 Kc7 3. Kd1 Kd7 4. Kc2 Kc7 5. Kd3 Kb7 6. Ke3 Kc7 7. Kf3 Kd7 8. Kg3 Ke7 9. Kh4 Kf7 10. Kg5 Kg7 11. Kxf5 Kf7 12. Kg5 Kg7 13. f5 Kf7 14. f6 Kf8 15. Kg4 Kg8 16. Kf4 Kf8 17. Kg5 Ke8 18. Kg4 Kf8 19. Kf5
37-> 3.20 4.27 1. Kb1 Kb7 2. Kc1 Kc7 3. Kd1 Kd7 4. Kc2 Kc7 5. Kd3 Kb7 6. Ke3 Kc7 7. Kf3 Kd7 8. Kg3 Ke7 9. Kh4 Kf7 10. Kg5 Kg7 11. Kxf5 Kf7 12. Kg5 Kg7 13. f5 Kf7 14. f6 Kf8 15. Kg4 Kg8 16. Kf4 Kf8 17. Kg5 Ke8 18. Kg4 Kf8 19. Kf5
30x slower there. Current Crafty is probing only up to 1/2 the normal iteration depth. For the 46-ply search, it only probes at plies <= 23. This is quite a ways from EGTB probes at the root, and in fact:

Code: Select all
46-> 2.06 7.11 1. Kb1 Kb7 2. Kc1 Kc7 3. Kd1 Kd7 4. Kc2 Kc7 5. Kd3 Kb7 6. Ke3 Kc7 7. Kf3 Kd7 8. Kg3 Ke7 9. Kh4 Kf7 10. Kg5 Kg7 11. Kxf5 Kf7 12. Kg5 Kg7 13. f5 Kf7 14. f6 Kf8 15. Kg4 Kg8 16. Kf4 Kf8 17. Kg5 Kf7 18. Kf5 Kf8 19. Ke6 Ke8 20. Kxd6 Kf7 21. Ke5 Kf8 22. Ke6 Kg8 23. d6 Kf8
46-> 1:00 16.73 1. Kb1 Kb7 2. Kc1 Kc7 3. Kd1 Kd7 4. Kc2 Kc7 5. Kd3 Kb7 6. Ke3 Kc7 7. Kf3 Kd7 8. Kg3 Ke7 9. Kh4 Kf7 10. Kg5 Kg7 11. Kxf5 Kf7 12. Ke4 Kf6 13. f5 Kg5 14. Kd3 Kh5 15. Kc4 Kg5 16. f6 Kg6 17. Kb5 Kf7 18. Kxa5 Ke8 19. f7+ Kf8 20. Kb6 Kg7 21. f8=Q+ Kg6 22. Qxd6+ Kf7 23. Qc7+ Ke8 24. Qc8+ Ke7 25. Qc5+ Kd7 26. Qc7+ Ke8
predicted=0 evals=1.6M 50move=0 EGTBprobes=53K hits=53K
Those are the search statistics. 53K probes. 30x slower. NPS? NPS dropped by 10x. That is why this produces no better results in cluster testing, whether I use very fast games or very slow games. The overall loss in NPS _really_ affects search depth when the middlegame branching factor is 2.0 or less, enough so that the loss in depth offsets the gain in perfect knowledge...

Code: Select all
time=2.06 mat=1 n=8711502 fh=94% nps=4.2M
time=1:00 mat=1 n=21506138 fh=92% nps=356K
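The figures in that statistics line are internally consistent, which is easy to verify: NPS is just nodes divided by time, and the time-to-depth ratio gives the "30x" quoted above.

```python
# Sanity check on the statistics line: nps = n / time.
n_no_tb, t_no_tb = 8_711_502, 2.06    # without EGTBs
n_tb, t_tb = 21_506_138, 60.0         # with EGTBs (1:00)

nps_no_tb = n_no_tb / t_no_tb         # about 4.2M, as printed
nps_tb = n_tb / t_tb                  # about 358K, close to the printed 356K
time_ratio = t_tb / t_no_tb           # about 29x: the "30x slower" time-to-depth
nps_ratio = nps_no_tb / nps_tb        # about 12x drop in raw NPS
```

Note that the raw NPS drop (about 12x) is smaller than the time-to-depth ratio (about 29x) because the EGTB run also searched more nodes to reach the same depth.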
Even fast disks don't save the day. And when you add the 6-piece files, boom.
For fun, on the fastest 64gb SSD I have: 6x slower there. If the time limit was exactly 1 second, then without EGTBs Crafty hits 44 plies, and with them, 38, before time runs out. 6 plies lost. This is with all 3-4-5 piece files, again on Fine #70 for comparison. I can test any position you want, but EGTBs really have a negative impact, even on a laptop with 4 gigs of RAM for buffering and a 64 gig SSD for storage.

Code: Select all
time=1.64 mat=1 n=8711502 fh=94% nps=5.3M
time=12.99 mat=1 n=21506138 fh=92% nps=1.7M
....
I think that it is a bad decision to probe at iteration 46 based on ply <= 23,
because many of these positions are close to the leaves due to pruning, so it makes the search slower.
Uri