Title: Single and multi-modal image super-resolution : from sparse modelling to deep neural network
Author: Deng, Xin
ISNI: 0000 0004 9357 0861
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2019
Single image super-resolution is a long-studied inverse problem, which aims to infer a high-resolution (HR) image from a single low-resolution (LR) one. Sometimes, we can infer an HR image not only from its corresponding LR image, but also with the guidance of an image from a different modality, e.g., RGB guided depth image super-resolution. This is called multi-modal image super-resolution. In this thesis, we develop methods based on sparse modelling and deep neural networks to tackle both the single and multi-modal image super-resolution problems. We also extend these methods to general multi-modal image restoration and image fusion tasks.
Firstly, we present a method for single image super-resolution, which aims to achieve the best trade-off between objective and perceptual image quality. We show that objective and perceptual quality are influenced by different elements of an image, and we use the stationary wavelet transform to separate these elements. A novel wavelet domain style transfer algorithm is proposed to achieve the best trade-off between image distortion and perception.
Next, we develop a robust algorithm for RGB guided depth image super-resolution by combining the finite rate of innovation (FRI) theory with a multi-modal dictionary learning algorithm. In addition, to speed up the super-resolution process, we introduce a projection-based rapid upscaling algorithm that pre-calculates the projections from the joint LR depth and HR intensity pairs to the HR depth. We demonstrate that this method needs only a fraction of the training data yet achieves state-of-the-art performance and is resilient to noisy conditions and unknown blurring kernels.
For the general multi-modal image super-resolution problems, we propose a deep coupled ISTA network based on joint sparse modelling across modalities. The network architecture is derived by unfolding the iterative shrinkage and thresholding algorithm (ISTA) used in the model. Moreover, since network initialization plays an important role in deep network training, we further propose a layer-wise optimization algorithm to initialize the network parameters before running the back-propagation algorithm. We demonstrate that the new initialization algorithm is effective, and that the coupled ISTA network consistently outperforms other methods both quantitatively and qualitatively in various multi-modal scenarios, including RGB/depth, RGB/multi-spectral and RGB/near-infrared images.
Finally, we extend the scope of our research from multi-modal image super-resolution to the more general case of multi-modal image restoration (MIR) and multi-modal image fusion (MIF). We develop a novel deep Common and Unique information splitting network (CU-Net), whose architecture is designed by drawing inspiration from a newly proposed multi-modal convolutional sparse coding model. The CU-Net has good interpretability, and can automatically separate the common information shared among different modalities from the unique information owned by each modality. We demonstrate the effectiveness of our method on a variety of MIR and MIF tasks, including RGB guided depth image super-resolution, flash guided non-flash image denoising, and multi-focus and multi-exposure image fusion.
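The abstract refers to unfolding ISTA but does not state the underlying sparse model. As a minimal sketch of the plain, single-modality ISTA iteration only, assuming the standard objective 0.5*||y - D z||^2 + lam*||z||_1, the following NumPy code may help. The names D, y, lam and the helper functions are illustrative assumptions, not taken from the thesis, and this is not the coupled multi-modal network itself.

```python
import numpy as np

def soft_threshold(v, theta):
    # Element-wise shrinkage: the proximal operator of theta * ||.||_1.
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def objective(D, y, z, lam):
    # Assumed sparse coding cost: data fidelity plus l1 penalty.
    return 0.5 * np.sum((y - D @ z) ** 2) + lam * np.sum(np.abs(z))

def ista(D, y, lam, n_iter=200):
    # L is the Lipschitz constant of the gradient of the data term,
    # i.e. the squared spectral norm of D.
    L = np.linalg.norm(D, 2) ** 2
    z = np.zeros(D.shape[1])
    for _ in range(n_iter):
        # Gradient step on the data term, then shrinkage.
        z = soft_threshold(z + D.T @ (y - D @ z) / L, lam / L)
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((20, 50))      # hypothetical dictionary
    z_true = np.zeros(50)
    z_true[[3, 17, 41]] = [1.5, -2.0, 1.0]  # hypothetical sparse code
    y = D @ z_true
    z_hat = ista(D, y, lam=0.1)
    print("cost at zero:", objective(D, y, np.zeros(50), 0.1))
    print("cost after ISTA:", objective(D, y, z_hat, 0.1))
```

In an unfolded network, each loop iteration typically becomes one layer, with the fixed matrices and the threshold replaced by learned parameters; how the thesis couples the modalities is not specified in this record.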
Supervisor: Dragotti, Pier Luigi
Sponsor: Imperial College London
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral