Use this URL to cite or link to this record in EThOS: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.739683
Title: Concurrency and data locality for sparse linear algebra on modern processors
Author: Picciau, Andrea
ISNI: 0000 0004 7229 3519
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2018
Abstract:
Graphics processing units (GPUs) are used as accelerators for algorithms in which the same instructions are carried out on different data. Algorithms for sparse linear algebra can achieve good performance on GPUs, although they tend to have an irregular pattern of memory accesses. The performance of these algorithms is highly dependent on the input data: the parallelism they can achieve is limited by the opportunities for concurrency that the data provide. Focusing on the solution of sparse triangular linear systems of equations, this thesis shows that a good partitioning of the data and a good scheduling of the computation can greatly improve performance on GPUs. For this class of algorithms, a partition of the data that maximises concurrency in the execution does not necessarily achieve the best performance. Instead, improving data locality by reducing concurrency reduces the latency of memory access and consequently the execution time. First, this work characterises the problem formally using graph theory and performance models. Then, algorithms that can be used effectively to partition the data are described. These algorithms aim to balance concurrency and data locality automatically. This approach is evaluated experimentally on the solution of linear equations with the preconditioned conjugate gradient method. The thesis also shows that the proposed approach can be used when a matrix changes from one iteration of an algorithm to the next, as in the simplex method. In this case, the approach allows the partition of the matrix to be updated from one iteration to the next. Finally, the algorithms and performance models developed in the thesis are used to discuss the limitations of accelerating the simplex method with GPUs.
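As an illustration of how the data limit concurrency in a sparse triangular solve, the sketch below uses level-set scheduling, a standard textbook technique and not necessarily the partitioning developed in the thesis. It assumes a lower triangular matrix stored as a list of rows, each a dictionary mapping column index to value with the diagonal included. The names `level_sets` and `solve_by_levels` are hypothetical. Rows in the same level do not depend on one another, so they could be processed concurrently; the number of levels bounds the parallelism available.

```python
def level_sets(L):
    """Group the rows of a sparse lower triangular matrix into dependency levels.

    L[i] is a dict {column: value} for row i (assumed storage format).
    Row i depends on row j whenever L[i] has a nonzero in column j < i.
    """
    level = [0] * len(L)
    for i, row in enumerate(L):
        deps = [j for j in row if j < i]
        # A row with no dependencies lands in level 0.
        level[i] = 1 + max((level[j] for j in deps), default=-1)
    groups = {}
    for i, lv in enumerate(level):
        groups.setdefault(lv, []).append(i)
    return [groups[lv] for lv in sorted(groups)]


def solve_by_levels(L, b):
    """Solve L x = b by forward substitution, one level at a time."""
    x = [0.0] * len(b)
    for rows in level_sets(L):
        # Rows within one level are independent; this loop is sequential
        # here, but each iteration could run on a separate thread.
        for i in rows:
            s = sum(v * x[j] for j, v in L[i].items() if j < i)
            x[i] = (b[i] - s) / L[i][i]
    return x


L = [
    {0: 2.0},
    {0: 1.0, 1: 1.0},
    {2: 4.0},
    {1: 1.0, 2: 1.0, 3: 2.0},
]
b = [2.0, 3.0, 8.0, 8.0]
levels = level_sets(L)
x = solve_by_levels(L, b)
```

Maximising the number of rows per level maximises concurrency, but rows in the same level may be far apart in memory. That is the kind of trade-off between concurrency and data locality the abstract describes.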
Supervisor: Constantinides, George A. ; Kerrigan, Eric C.
Sponsor: Siemens AG
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.739683