Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.650669
Title: Source-to-source compilation of loop programs for manycore processors
Author: Konstantinidis, Athanasios
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2013
Availability of Full Text:
Access through EThOS:
Full text unavailable from EThOS. Please try the link below.
Access through Institution:
Abstract:
It is widely accepted today that the end of microprocessor performance growth based on increasing clock speeds and instruction-level parallelism (ILP) demands new ways of exploiting transistor densities. Manycore processors (most commonly known as GPGPUs or simply GPUs) provide a viable solution to this performance scaling bottleneck through large numbers of lightweight compute cores and memory hierarchies that rely primarily on software for their efficient utilization. The widespread proliferation of this class of architectures today is a clear indication that exposing and managing parallelism on a large scale, as well as efficiently orchestrating on-chip data movement, is becoming an increasingly critical concern for high-performance software development. In such a computing landscape, performance portability -- the ability to exploit the power of a variety of manycore chips while minimizing the impact on software development and productivity -- is perhaps one of the most important and challenging objectives for our research community. This thesis is about performance portability for manycore processors and how source-to-source compilation can help us achieve it. In particular, we show that for an important set of loop programs, performance portability is attainable at low cost through compile-time polyhedral analysis and optimization, together with parametric tiling for run-time performance tuning. In other words, we propose and evaluate a source-to-source compilation path that takes affine loop programs as input and produces parametrically tiled parallel code amenable to run-time tuning across different manycore platforms and devices -- a very useful and powerful property if we seek performance portability because it decouples the compiler from the performance tuning process. The produced code relies on a platform-independent run-time environment, called Avelas, that allows us to formulate a robust and portable code generation algorithm.
Our experimental evaluation shows that Avelas induces low run-time overhead and even substantial speed-ups for wavefront-parallel programs compared to a state-of-the-art compile-time scheme with no run-time support. We also claim that the low overhead of Avelas strongly indicates that it can be effective as a general-purpose programming model for manycore processors, as we demonstrate for a set of ParBoil benchmarks.
Supervisor: Kelly, Paul
Sponsor: Engineering and Physical Sciences Research Council; Codeplay Software Ltd
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.650669
DOI: Not available