Title: Describing obstetric ultrasound video content using deep learning
Author: Gao, Yuan
Awarding Body: University of Oxford
Current Institution: University of Oxford
Date of Award: 2019
Availability of Full Text:
Access from EThOS: Full text unavailable from EThOS; please use the institutional link below.
Access from Institution:
In recent years, advances in ultrasound technology have made devices cheaper and more portable, making the technology more accessible in both High Income Country (HIC) and Low and Middle Income Country (LMIC) settings. As a result, a growing number of ultrasound scans may not be performed by experienced sonographers. Automatic recognition of patterns in such scans is difficult for traditional machine learning models trained with hand-crafted features, owing to the high variability in image quality and anatomical appearance. This doctoral thesis presents deep learning based methods for the automation of fetal structure recognition in free-hand obstetric ultrasound video.

First, we demonstrate the feasibility of training deep convolutional neural networks (CNNs) for ultrasound image classification. The principal challenge in this setting is overfitting caused by limited training data. We show that overfitting of deep CNNs can be prevented by: (i) tailoring the architecture, for example by removing fully-connected layers; (ii) introducing data augmentation during training; and (iii) careful regularization. We also visualize the high-level CNN features to understand the classification results; this visualization suggests that standard CNN architectures are insufficient for learning discriminative representations of complicated anatomy, for example the fetal heart, which varies greatly in anatomical appearance and scale.

Next, we address the challenge of fetal heart recognition by learning deep representations of ultrasound video that take temporal information into account. We extend the standard CNN by adding a motion detection stream alongside the spatial stream. This novel two-stream CNN model demonstrates: (i) the capability to detect and localize the fetal heart; (ii) significantly better fetal heart recognition than standard CNNs; and (iii) the capability to describe fetal cardiac motion.
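As a rough illustration of the two-stream idea, the motion stream can be fed a simple temporal-difference representation of consecutive frames. This is a minimal pure-Python sketch under that assumption; the thesis's actual motion descriptor may be more sophisticated, and the function name is illustrative:

```python
def temporal_difference(frames):
    """Compute absolute frame-to-frame differences as a crude motion map.

    frames: list of 2-D grids (lists of lists of floats), all the same size.
    Returns one motion map per consecutive pair of frames; large values mark
    pixels that changed between frames.
    """
    diffs = []
    for prev, curr in zip(frames, frames[1:]):
        diffs.append([
            [abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)
        ])
    return diffs

# Example: two 2x2 frames produce one 2x2 motion map.
frames = [
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
]
print(temporal_difference(frames))  # [[[0.0, 1.0], [1.0, 0.0]]]
```

In a two-stream design, maps like these would be stacked and passed to the motion stream, while raw frames feed the spatial stream, with the two streams' scores fused for the final prediction.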
Finally, we extend object detection to other important fetal structures of interest, such as the fetal head and abdomen. We present a hybrid model, combining CNNs and recurrent neural networks (RNNs), that can localize the target structures in short video sequences. Because no object-level annotations (e.g. bounding boxes) are available, localization is achieved by class activation mapping (CAM). Additionally, a soft-attention mechanism is introduced into the representation learning to produce a spatio-temporal saliency map, which is shown to highlight the object of interest, suggesting its potential as a video navigation cue. The methods described in this thesis contribute to the ultrasound video image analysis literature, and to the understanding of how to design image analysis algorithms for potential use by minimally trained users of ultrasound devices in HIC and LMIC settings.
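Class activation mapping, used above for weakly-supervised localization, can be sketched as a weighted sum of the final convolutional feature maps, with weights taken from the classification layer for the target class. The following is a simplified pure-Python illustration of the standard CAM formulation, not the thesis's implementation; all names are illustrative:

```python
def class_activation_map(feature_maps, class_weights):
    """CAM(x, y) = sum over channels c of w_c * f_c(x, y).

    feature_maps: list of C feature maps from the last convolutional layer,
        each an H x W grid (list of lists of floats).
    class_weights: list of C weights connecting each globally average-pooled
        feature to the target class in the final linear layer.
    Returns an H x W saliency map; high values indicate regions that
    contributed most to the class score.
    """
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for fmap, weight in zip(feature_maps, class_weights):
        for y in range(h):
            for x in range(w):
                cam[y][x] += weight * fmap[y][x]
    return cam

# Two 2x2 feature maps combined with class weights [1.0, 0.5].
maps = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 2.0], [0.0, 0.0]],
]
print(class_activation_map(maps, [1.0, 0.5]))  # [[1.0, 1.0], [0.0, 0.0]]
```

In practice the resulting low-resolution map is upsampled to the input image size and thresholded to obtain an approximate localization of the structure.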
Supervisor: Noble, Alison
Sponsor: China Scholarship Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available