hgm wrote:Edmund wrote:For the 500+ cpus it would probably be best if they worked on different independent endgames in parallel. This way you sould be getting the best scaling.
The GPUs are more challenging, because they have to work on the same endgame. What do the others think of my proposal to use the additional processors to do more than just one leap per itereration? What benefit could we expect?
I don't understand what algorithm you would use. The bottle-neck in the algorithm is probing daughter positions to see if they are already decided. If you have to probe in every leave node, doing two levels gives you N*N probes where a single level gives you only N probes. And it is the probes that consume all the time.
Note: in the following by iteration I mean one cycle of the egtb generation process ie going through each permutation of the pieces on the board once
I think it is well applicable to the algorithm you describe on your website. The bottle-neck is the harddrive-I/O .. I suggest to use the data in the ram to solve as much as possible per iteration.
For example take the extreme example KQQNKBB with both bishops on the opposite color-square of the white king. If we now consider white K and N as fixed and white QQ and black KBB as dynamic (5 pieces .. 64^5=1G positions)
Most positions can be solved within one iteration, as the white king and knight don't need to be touched to reach the mate or the conversion to a won-subendgame.
So if the old algorithm would be:
1) get one slice from the HD to the RAM
2) for each btm white-won position make one white retro move
2.1) mark position as wtm white-won position
2.2) for each black retro move
2.2.1) if all black forward moves result in a position marked as wtm white-won, mark position as btm white-won
3) put the slice back to the HD
4) continue with the next slice
Now there would be two possible additions:
either you perform point 2 several times
or you could extend the 2.2.1 point:
2.2.1) for each black forward move
2.2.1.1) if position not marked as wtm white-won
2.2.1.1.1) for each white forward move
2.2.1.1.1.1) if position is btm white-won mark 2.2.1.1 as wtm white-won
2.2.1.1.1.2) else for each black forward move
2.2.1.1.1.2.1) for each position not wtm white-won
2.2.1.1.1.2.1.1) continue with 2.2.1.1
The problems would be a) getting this task done efficiently with multiple processors and b) I am not sure whehter it would be possible to get exact dtc scores or just win/not win information