Fast and Parallel Computation of the Discrete Periodic Radon Transform on GPUs, multi-core CPUs and FPGAs
- Submitted by: Cesar Carranza
- Last updated: 4 October 2018 - 9:39am
- Document Type: Poster
- Document Year: 2018
- Presenters: Cesar Carranza
- Paper Code: 2560
The Discrete Periodic Radon Transform (DPRT) has many important applications in reconstructing images from their projections and has recently been used in fast and scalable architectures for computing 2D convolutions. Unfortunately, for an N x N image, the direct computation of the DPRT involves O(N^3) additions and memory accesses, which can be very costly on single-core architectures.
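To make the O(N^3) cost concrete, the following is a minimal single-core sketch of a direct DPRT. It assumes one commonly used definition for a prime image size N, namely R(m, d) = sum over i of f(i, (d + m*i) mod N) for m = 0..N-1, plus one extra projection R(N, d) made of row sums. The function name `dprt_direct` and the indexing convention are illustrative assumptions, not the FastDirDPRT or FastRayDPRT algorithms of the poster.

```python
import numpy as np


def dprt_direct(f):
    """Direct DPRT of a square integer image (sketch; assumes prime N).

    Returns an (N + 1) x N array of projections. The indexing convention
    used here is an assumption and may differ from the poster's.
    """
    f = np.asarray(f)
    n = f.shape[0]
    if f.ndim != 2 or f.shape != (n, n):
        raise ValueError("expected a square 2D image")

    out = np.zeros((n + 1, n), dtype=np.int64)

    # First N projections: three nested loops of length N, hence O(N^3)
    # additions and memory accesses.
    for m in range(n):
        for d in range(n):
            acc = 0
            for i in range(n):
                acc += int(f[i, (d + m * i) % n])
            out[m, d] = acc

    # Last projection: sum along each row.
    for d in range(n):
        out[n, d] = int(f[d, :].sum())

    return out
```

Under this definition, every projection partitions the image into N rays, so each row of the output sums to the total of all pixel values. That gives a quick sanity check for any faster implementation.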
This paper presents new and efficient algorithms for computing the DPRT and its inverse on multi-core CPUs and GPUs, and compares them against specialized hardware implementations (FPGAs/ASICs). The results provide significant evidence of the success of the new algorithms. On an 8-core CPU (Intel Xeon) with support for two threads per core, FastDirDPRT and FastDirInvDPRT achieve a speedup of approximately 10x (up to 12.83x) over the single-core CPU implementation. On a 2048-core GPU (GTX980), FastRayDPRT and FastRayInvDPRT achieve speedups ranging from 526x (for 127x127 images) to 873x (for 1021x1021 images), which approach the ideal achievable speedups. The DPRT can be computed exactly and in real time (30 frames per second) for 1471x1471 images using FastRayDPRT on the GPU. Furthermore, the performance of the GPU algorithms approaches that of an efficient FPGA implementation using 2N parallel cores at 100 MHz.
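As a rough back-of-the-envelope check on the real-time figure, the sketch below estimates how many additions per second a direct DPRT would need at 30 frames per second. It assumes the direct definition, with N + 1 projections of N rays each and N - 1 additions per ray, and it ignores memory traffic. The function names are hypothetical, and the count is not a measurement of the poster's algorithms.

```python
def direct_dprt_additions(n):
    """Additions in a direct DPRT of an n x n image (assumed count):
    (n + 1) projections, n rays per projection, n - 1 additions per ray."""
    return (n + 1) * n * (n - 1)


def additions_per_second(n, fps=30):
    """Additions per second needed to sustain `fps` frames per second."""
    return direct_dprt_additions(n) * fps
```

For N = 1471 this comes to about 3.2 billion additions per frame, or roughly 9.5e10 additions per second at 30 frames per second, which illustrates why massive parallelism is needed for real-time operation.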