So for many years now it has been possible to access a memory-mapped file from the GPU (MMU requests go directly to host memory).
This means it is not a problem to map all of the 6-man or 7-man tables into the CUDA address space.
My question is whether someone has already done the work of translating the code below:
I take this as a no?
One last thing: there is no performance difference between CPU and GPU access.
When reading from an 8 GB memory-mapped file:
CPU: 45GB/s
GPU: 45GB/s
When reading from a 700 GB memory-mapped file:
CPU: 5.5GB/s
GPU: 5.5GB/s
Latency and write speeds are on par too; both get full read/write speed for random IO.
That means mapping a tablebase into the CUDA address space is not only possible, but a good idea.
With an NVMe RAID array this already surpasses DDR2 bandwidth, granted with a high latency of a few µs per IO request.
dangi12012 wrote: ↑Sun May 15, 2022 8:12 pm