So for many years now it has been possible to access a memory-mapped file from the GPU (MMU requests go directly to host memory).
This means it is not a problem to map all of the 6-man or 7-man tables into the CUDA address space.
My question is whether someone has already done the work of translating the code below:
I take this as a no?
One last thing: there is no performance difference between CPU and GPU access.
When reading from an 8 GB memory-mapped file:
CPU: 45GB/s
GPU: 45GB/s
When reading from a 700 GB memory-mapped file:
CPU: 5.5GB/s
GPU: 5.5GB/s
Latency and write speeds are on par too; both get full read/write speed for random IO.
That means mapping a tablebase into the CUDA address space is not only possible, but a good idea.
With an NVMe RAID array this already surpasses DDR2 bandwidth, granted with a high latency of a few µs per IO request.
dangi12012 wrote: ↑Sun May 15, 2022 8:12 pm