Difference between revisions of "SRAD"

From Rodinia
(New page: SRAD (Speckle Reducing Anisotropic Diffusion) is a diffusion method for ultrasonic and radar imaging applications based on partial differential equations (PDEs). It is used to remove local...)
 
 
(17 intermediate revisions by 5 users not shown)
SRAD (Speckle Reducing Anisotropic Diffusion) is a diffusion method for ultrasonic and radar imaging applications based on partial differential equations (PDEs). It is used to remove locally correlated noise, known as speckles, without destroying important image features.

Our CUDA implementation of SRAD is composed of three kernels. In each grid update step, the first kernel performs a reduction by calculating a reference value using the mean and variance of a user-specified image region which defines the speckle. Using the reference value from the first kernel, the second kernel updates each data element using the values of its cardinal neighbors. The last kernel updates each data element of the result grid of the second kernel using the element’s north and west neighbors. The application iterates over these three kernels, with more iterations producing an increasingly smooth image.

<!--
Input: ([http://www.cs.virginia.edu/~lgs9a/rodinia/heartwall/srad/hw_srad_input.tar.gz tar.gz])<br>
OpenMP Version: ([http://www.cs.virginia.edu/~lgs9a/rodinia/heartwall/srad/hw_srad_openmp_code.tar.gz tar.gz])<br>
CUDA Version: ([http://www.cs.virginia.edu/~lgs9a/rodinia/heartwall/srad/hw_srad_cuda_code.tar.gz tar.gz])<br>
OpenCL Version: ([http://www.cs.virginia.edu/~lgs9a/rodinia/heartwall/srad/hw_srad_opencl_code.tar.gz tar.gz])<br>
-->

Latest revision as of 18:40, 25 June 2015

SRAD (Speckle Reducing Anisotropic Diffusion) [2] is a diffusion method for ultrasonic and radar imaging applications, based on partial differential equations (PDEs). It removes locally correlated noise, known as speckle, without destroying important image features. SRAD consists of several stages: image extraction, repeated iterations over the image (preparation, reduction, statistics, computation 1, and computation 2), and image compression. Because each stage operates on the entire image, the stages depend on one another sequentially and require synchronization after each one. SRAD is also used as one of the initial stages in the Heart Wall application [1].
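The stages listed above can be illustrated with a minimal, sequential Python sketch. It is not the Rodinia code: the function names, the `roi` tuple, the `lam` step size, the clamped border handling, and the exponential/logarithmic scaling used for extraction and compression are all assumptions made for illustration. The coefficient formula follows a commonly used discretisation of the Yu and Acton update.

```python
import math


def srad_iteration(J, roi, lam=0.5):
    """One SRAD update on a 2-D list J of positive floats (hypothetical helper).

    roi = (r0, r1, c0, c1) is an assumed half-open window that contains only
    speckle; it must not be perfectly uniform.
    """
    rows, cols = len(J), len(J[0])
    r0, r1, c0, c1 = roi

    # Reduction + statistics: mean and variance over the speckle region.
    vals = [J[i][j] for i in range(r0, r1) for j in range(c0, c1)]
    mean = sum(vals) / len(vals)
    var = sum(v * v for v in vals) / len(vals) - mean * mean
    q0sqr = var / (mean * mean)
    if q0sqr <= 0.0:
        raise ValueError("speckle region must have non-zero variance")

    # Computation 1: directional differences and diffusion coefficient.
    dN = [[0.0] * cols for _ in range(rows)]
    dS = [[0.0] * cols for _ in range(rows)]
    dW = [[0.0] * cols for _ in range(rows)]
    dE = [[0.0] * cols for _ in range(rows)]
    c = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            Jc = J[i][j]
            # Neighbour indices are clamped at the image border.
            n = dN[i][j] = J[max(i - 1, 0)][j] - Jc
            s = dS[i][j] = J[min(i + 1, rows - 1)][j] - Jc
            w = dW[i][j] = J[i][max(j - 1, 0)] - Jc
            e = dE[i][j] = J[i][min(j + 1, cols - 1)] - Jc
            G2 = (n * n + s * s + w * w + e * e) / (Jc * Jc)
            L = (n + s + w + e) / Jc
            den = 1.0 + 0.25 * L
            qsqr = (0.5 * G2 - L * L / 16.0) / (den * den)
            t = 1.0 + (qsqr - q0sqr) / (q0sqr * (1.0 + q0sqr))
            # Clamp the coefficient to [0, 1]; t <= 0 is treated as 0 here.
            c[i][j] = min(1.0 / t, 1.0) if t > 0.0 else 0.0

    # Computation 2: divergence of the flux, then the explicit update.
    out = [row[:] for row in J]
    for i in range(rows):
        for j in range(cols):
            cN = cW = c[i][j]
            cS = c[min(i + 1, rows - 1)][j]
            cE = c[i][min(j + 1, cols - 1)]
            D = cN * dN[i][j] + cS * dS[i][j] + cW * dW[i][j] + cE * dE[i][j]
            out[i][j] = J[i][j] + 0.25 * lam * D
    return out


def srad(image, roi, n_iter=10, lam=0.5):
    """Extraction, n_iter diffusion steps, then compression (assumed scaling)."""
    J = [[math.exp(p / 255.0) for p in row] for row in image]
    for _ in range(n_iter):
        J = srad_iteration(J, roi, lam)
    return [[math.log(p) * 255.0 for p in row] for row in J]
```

Each loop nest reads the complete output of the previous one, which is why a parallel version needs a synchronization point between stages. With `lam` at most 1, every updated pixel is a weighted average of itself and its neighbors, so the result stays within the intensity range of the input.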

Partitioning the working set between caches and avoiding cache thrashing both contribute to performance. In the CUDA version, each stage is a separate kernel (because of the synchronization requirements) that operates on data already residing in GPU memory. The code features an efficient GPU reduction of sums. To improve GPU performance, data is transferred to the GPU once at the beginning of the run and transferred back to the CPU only after all computation stages have completed on the GPU. Some of the kernels use GPU shared memory for a further performance gain. The speedup achievable with the CUDA version depends on the image size, up to the point where the GPU saturates.
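The reduction of sums mentioned above can be pictured with a small CPU emulation in Python. This is only a sketch of the general pairwise (tree) pattern that GPU reductions typically follow, not the kernel from the benchmark. `tree_reduce` and `speckle_statistics` are hypothetical names, and the ratio `var / mean**2` is an assumption about what the statistics stage derives from the two sums.

```python
def tree_reduce(values):
    """Sum values with a pairwise (tree) pattern.

    Each pass of the while loop stands in for one synchronized step of a
    GPU reduction, in which worker k adds element k + half into element k.
    """
    buf = list(values)
    n = len(buf)
    if n == 0:
        return 0.0
    while n > 1:
        half = (n + 1) // 2
        for k in range(n // 2):  # these additions are independent of each other
            buf[k] += buf[k + half]
        n = half
    return buf[0]


def speckle_statistics(region):
    """Mean, variance and var / mean^2 of a flat, non-empty list of samples."""
    n = len(region)
    total = tree_reduce(region)
    total_sq = tree_reduce([v * v for v in region])
    mean = total / n
    var = total_sq / n - mean * mean
    return mean, var, var / (mean * mean)
```

The number of passes grows with the logarithm of the input length, and the additions within a pass do not depend on one another, which is what makes the pattern suitable for many threads working in parallel.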

Papers:
[1] L. G. Szafaryn, K. Skadron, and J. J. Saucerman. "Experiences Accelerating MATLAB Systems Biology Applications." In Proceedings of the Workshop on Biomedicine in Computing: Systems, Architectures, and Circuits (BiC) 2009, in conjunction with the 36th IEEE/ACM International Symposium on Computer Architecture (ISCA), June 2009. ([http://www.cs.virginia.edu/~lgs9a/publications/09_isca_bic_paper.pdf pdf])
[2] Y. Yu and S. Acton. "Speckle Reducing Anisotropic Diffusion." IEEE Transactions on Image Processing, 11(11):1260-1270, 2002. ([http://www.cs.virginia.edu/~lgs9a/rodinia/heartwall/srad/paper_2.pdf pdf])