Title:
|
Machine learning for childhood pneumonia diagnosis
|
Pneumonia is the number one killer of children under the age of 5, causing more deaths than malaria, tuberculosis and HIV/AIDS combined. In 2015, over 920,000 children died of pneumonia, and more than 95% of the incidence and 99% of subsequent mortality occurred in developing countries. The current gold-standard diagnostic assessment of childhood pneumonia relies on the use of advanced tools (such as X-rays and blood culture) by a clinical expert who assesses and interprets a combination of clinical measurements. However, the shortage of clinical expertise and equipment in low-resource settings delays diagnosis, increasing the risk of mortality. This thesis investigates the role that machine learning could play in improving the diagnosis of childhood pneumonia in low-resource settings. Point-of-care devices such as digital stethoscopes and pulse oximeters have become more affordable and hence more widely available across the globe. The adoption of such devices, particularly in community settings, presents an opportunity to use the information captured in their signals to provide early diagnosis of conditions such as childhood pneumonia. Hence, this thesis proposes a machine learning framework for the design of diagnostic models, built on a parsimonious set of symptoms that can be captured in a point-of-care setting, to deliver accurate and reproducible diagnosis of pneumonia. The approach is evaluated on three different datasets (with accuracy ranging from 81.8% to 97.9%), consistently outperforming diagnosis with the World Health Organisation's current gold-standard guidelines for community diagnosis. A clinical study in Mumbai, India, is designed to capture a realistic dataset in a poor urban community and a resource-constrained hospital, where a mobile health toolkit including three point-of-care devices (a digital stethoscope, a pulse oximeter and a thermometer) is built to enable minimally trained users to capture essential health information.
Diagnostic models developed on this rich dataset are seen to perform well under internal validation and to generalise to datasets from other geographies. Finally, a set of tools for automated analysis of lung signals is proposed and validated on two large datasets, one from Peru and one from India. Combined with vital signs, the information derived through this automated lung signal analysis was seen to outperform diagnosis via expert clinical annotation, delivering 81.3% sensitivity and 84.5% specificity.
|