Title: Automated methodologies for mapping convolutional neural networks on reconfigurable hardware
Author: Venieris, Stylianos
ISNI: 0000 0004 7659 0158
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2019
Convolutional neural networks (ConvNets) are a family of machine learning models which have demonstrated state-of-the-art performance in a wide range of Artificial Intelligence (AI) tasks. To obtain accuracy gains, ConvNets have typically been enhanced either by designing deeper and wider models with a larger number of trainable parameters, or by designing novel components that introduce irregular dataflow. Both approaches are computationally expensive and pose challenges for the deployment of ConvNets in real-life applications. ConvNet-enabled applications are also characterised by variability in performance requirements, spanning from throughput-driven to latency-critical systems. This property calls for a model- and performance-aware design of computing systems in order to meet the diverse application-level specifications. Furthermore, in emerging complex AI systems, such as autonomous vehicles, ConvNets constitute mere building blocks of the overall system, leading to multi-ConvNet settings. Upon deployment, the different models have to run concurrently, meet their respective performance constraints and share the underlying resources. This thesis proposes design methodologies and hardware architectures targeting field-programmable gate arrays (FPGAs) that address the aforementioned challenges, aiming for the high-performance deployment of ConvNets.
The contributions of this work include: an analytical model for representing both ConvNet workloads and hardware mappings, together with a ConvNet-to-FPGA toolflow for the automated generation of ConvNet accelerators; a latency-driven methodology for the generation of latency-optimised hardware mappings which meet the stringent response-time constraints of modern ConvNet applications; novel architectural optimisations for state-of-the-art ConvNets with irregular connectivity, together with the corresponding mapping methodology; and a toolflow for the parallel deployment of multiple ConvNets on a single FPGA, enabling emerging multi-ConvNet applications. By applying the above methodologies to real-life workloads, it is shown that significant performance gains are achieved over existing state-of-the-art implementations on FPGAs and GPUs, thereby enabling the automated generation of ConvNet accelerators that are tailored to both the ConvNet-FPGA pair and the target performance requirements in single- and multi-ConvNet settings.
Supervisor: Bouganis, Christos-Savvas
Sponsor: Engineering and Physical Sciences Research Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral