Modeling and Optimization of Parallel Matrix-based Computations on GPU

White, Andrew

Metadata Field	Value	Language
dc.contributor.advisor	Lee, Soo-Young
dc.contributor.author	White, Andrew
dc.date.accessioned	2013-04-22T15:24:19Z
dc.date.available	2013-04-22T15:24:19Z
dc.date.issued	2013-04-22
dc.identifier.uri	http://hdl.handle.net/10415/3568
dc.description.abstract	As graphics processing units (GPUs) are continually being utilized as coprocessors, the demand for optimally utilizing them for various applications continues to grow. This work narrows the gap between programmers and minimum execution time for matrix-based computations on a GPU. To minimize execution time, computation and communication time must be considered. For computation, the placement of data in GPU memory significantly affects computation time and therefore is considered. Various matrix-based computation patterns are examined with respect to the layout of GPU memory. A computation pattern refers to the order in which each GPU thread computes a result. From examination of computation patterns, an optimized computation pattern, a pattern which reduces computation time, is derived. After the optimized computation pattern is derived, the access pattern to GPU memory, or order in which data is accessed, is considered. From the optimized access pattern, fine-tuning is performed to the GPU code such as minimization of index calculations and loop unrolling to further reduce computation time and resource allocation. After fine-tuning, input parameters which yield the minimum computation time for a matrix-based computation on the GPU are derived. Input parameters are defined as the dimensions of the grid and blocks assigned for execution on the GPU. Execution metrics are formulated, as functions of the input parameters, to represent the executional behavior of the GPU. The execution metrics are utilized to derive the optimal input parameters which are input parameters that yield the minimum computation time. The matrix-based computations considered are matrix-vector multiplication (Mv), matrix-matrix multiplication (MM), convolution, and the conjugate gradient method. A parallelization procedure is developed to minimize execution time by deriving the optimal placement of data, optimized computation pattern, optimized access pattern, and optimal input parameters. Each step of the procedure is developed through analysis of Mv and MM. The procedure includes an accurate communication model and estimates for CPU and GPU data transfers. With accurate estimates for communication and computation times, partitioning computation for CPU and GPU execution is performed to minimize execution time. Convolution and conjugate gradient are utilized to verify the validity of the procedure. Therefore, the overall goal of this work is to develop a parallelization procedure which minimizes execution time of matrix-based computations executing on the GPU.	en_US
dc.rights	EMBARGO_NOT_AUBURN	en_US
dc.subject	Electrical Engineering	en_US
dc.title	Modeling and Optimization of Parallel Matrix-based Computations on GPU	en_US
dc.type	dissertation	en_US
dc.embargo.length	NO_RESTRICTION	en_US
dc.embargo.status	NOT_EMBARGOED	en_US

Files in this item

Name:: Dissertation.pdf.txt
Size:: 248.4Kb

Name:: Dissertation.pdf
Size:: 1.097Mb

Show simple item record