Title: Data Reuse and Parallelism in Hardware Compilation
Author: Liu, Qiang
ISNI: 0000 0004 2681 4483
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2009
This thesis presents a methodology to automatically determine a data memory organisation at compile time, suitable to exploit data reuse and loop-level parallelization, in order to achieve high-performance and low-power designs for data-dominated applications. Moore's Law has enabled more and more heterogeneous components to be integrated on a single chip. However, extracting maximum performance from these hardware resources efficiently remains a challenge.

Unlike previous approaches, which mainly focus on making efficient use of computational resources, our focus is on data memory organisation and input-output bandwidth considerations, which are the typical stumbling block of existing hardware compilation schemes.

To optimize accesses to large off-chip memories, an approach is adopted and formalized to identify data reuse opportunities in local scratch-pad memory. An approach is presented for evaluating different data reuse options in terms of the memory space required for buffering reused data and the execution time for loading the data into the local memories. Determining the data reuse design option that consumes the least power or performs operations quickest with respect to a memory constraint is an NP-hard problem. In this work, the problem of data reuse exploration for low-power designs is formulated as a Multiple-Choice Knapsack problem. Together with a proposed power model, the problem is solved efficiently. An integer geometric programming framework is presented for exploring data reuse and loop-level parallelization within a single step. The objective is to find the design that achieves the shortest execution time for an application.

We describe our approaches based on formal optimization techniques, and present some results from applying these approaches to several benchmarks that show the advantages of optimizing data memory organisation and of exposing the interaction between data memory system design and parallelism extraction to the compiler.
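As an illustration of the Multiple-Choice Knapsack formulation mentioned in the abstract, the sketch below picks exactly one data reuse option per array reference so that total estimated power is minimised under an on-chip memory budget. It is a textbook dynamic program, not the thesis's solver or power model; the function name `choose_reuse_options`, the `(memory, power)` option encoding, and all numbers are assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: one option = (on-chip memory words, estimated power).
Option = Tuple[int, float]


def choose_reuse_options(
    groups: List[List[Option]], capacity: int
) -> Optional[Tuple[float, List[int]]]:
    """Pick exactly one option per group, minimising total power
    subject to total memory <= capacity (Multiple-Choice Knapsack).

    Returns (total power, chosen option index per group),
    or None when no combination fits in the budget.
    """
    inf = float("inf")
    # best[m] = cheapest (power, picks) that uses exactly m memory words.
    best: List[Tuple[float, List[int]]] = [(inf, [])] * (capacity + 1)
    best[0] = (0.0, [])

    for options in groups:
        nxt: List[Tuple[float, List[int]]] = [(inf, [])] * (capacity + 1)
        for used, (power, picks) in enumerate(best):
            if power == inf:
                continue  # this memory level is unreachable so far
            for idx, (mem, cost) in enumerate(options):
                total = used + mem
                if total <= capacity and power + cost < nxt[total][0]:
                    nxt[total] = (power + cost, picks + [idx])
        best = nxt

    result = min(best, key=lambda entry: entry[0])
    return None if result[0] == inf else result


# Two array references, three made-up reuse options each.
example_groups = [
    [(0, 10.0), (4, 5.0), (8, 3.0)],
    [(0, 8.0), (2, 5.0), (6, 2.0)],
]
print(choose_reuse_options(example_groups, capacity=8))
```

Option 0 in each group stands for "no buffering" (no on-chip memory, highest power). The table has one entry per memory word, so run time grows with the capacity; this pseudo-polynomial behaviour is consistent with the problem being NP-hard in general.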
Supervisor: Cheung, Peter ; Constantinides, George Anthony ; Masselos, Konstantinos
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral