Title: Instance directed tuning for sparse matrix kernels on reconfigurable accelerators
Author: Grigoras, Paul
ISNI: 0000 0004 7427 782X
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2018
We present a novel method to optimise sparse matrix kernels for reconfigurable accelerators through instance directed tuning: the tuning of reconfigurable architectures based on a sparse matrix instance. First, we present two novel reconfigurable architectures for the Conjugate Gradient Method that are optimised based on the problem dimension and sparsity pattern. These architectures provide the context for illustrating the opportunities and challenges for tuning sparse matrix kernels, which guide the design of the proposed method. Second, we introduce CASK, a novel framework for sparse matrix kernels on reconfigurable accelerators. CASK is: (1) instance directed, since it can account for differences in the matrix instances to generate and select adequate architectures; (2) unified, as it can be applied to a broad range of kernels and optimisations; (3) systematic, since it can support optimisations at multiple levels of encompassed reconfigurable architectures; and (4) automated, since it can operate with minimal user input, encapsulating and simplifying the tuning process. Third, we demonstrate the benefits of the proposed approach by applying it to the Sparse Matrix Vector Multiplication kernel: (1) to tune a novel parametric reconfigurable architecture, resulting in energy efficiency gains of up to 2 times compared to optimised GPU and Xeon Phi implementations; (2) to include a novel compression method for nonzero values, resulting in up to 2.5 times compression ratio compared to the Compressed Sparse Row format; and (3) to tune a novel architecture for the block diagonal sparsity pattern arising in the Finite Element Method, enabling larger problems to be supported with up to 3 times speedup compared to an optimised CPU implementation.
Supervisor: Luk, Wayne
Sponsor: Engineering and Physical Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral