Title: A data-driven text mining and semantic network analysis for design information retrieval
Author: Shi, Feng
ISNI: 0000 0004 7658 7012
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2018
Data-driven design has emerged as a research area with the advent of big-data tools. The massive amount of information stored in electronic and digital form on the internet offers opportunities for knowledge discovery in design and engineering. The aim of the research reported in this thesis is to facilitate design information retrieval from large-scale electronic data through the use of text mining and semantic network techniques. We propose a data-driven pipeline for design information retrieval comprising four elements: data acquisition, text mining, semantic network analysis, and data visualisation and user interaction. Web crawling techniques are applied to fetch massive online textual data in the data acquisition process. Text mining then transforms the data from unstructured raw text into a structured semantic network. A retrieval analysis framework built on the constructed semantic network is proposed to retrieve relevant design information and provoke design innovation. Finally, a web-based platform, B-Link, has been developed to enable users to visualise the semantic network and interact with it through the proposed retrieval analysis framework. Seven case studies were conducted throughout the thesis to investigate the effectiveness of, and gain insights into, each element of the pipeline. Thousands of design news posts and millions of peer-reviewed engineering and design papers can be efficiently captured by web crawling techniques. Through the use of itemset mining and noun phrase chunking, a semantic network constructed from these textual data is shown to capture more inherent design- and engineering-oriented concepts and relations than the benchmark approaches: WordNet, ConceptNet, NELL and Wikipedia.
The retrieval analysis framework offers different retrieval behaviours, retrieving either common general or domain-specific concepts and either explicit or implicit knowledge relations; these are found to satisfy various knowledge demands in our real design projects at the conceptual stage. Finally, the result of a user test is shown to be consistent with these findings.
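As a rough illustration of how itemset mining over extracted noun phrases might yield a semantic network, the following minimal Python sketch counts concept pairs that co-occur across documents and keeps those meeting a minimum support. It is not the thesis's implementation: the function name `build_cooccurrence_network`, the `min_support` parameter and the toy data are all hypothetical, and the noun phrases are assumed to have been extracted already.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_network(transactions, min_support=2):
    """Return {(concept_a, concept_b): count} for concept pairs that
    co-occur in at least `min_support` transactions (frequent 2-itemsets).

    Hypothetical helper for illustration only; each transaction is the
    list of noun-phrase concepts assumed to come from one document.
    """
    pair_counts = Counter()
    for concepts in transactions:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(concepts))
        for pair in combinations(unique, 2):
            pair_counts[pair] += 1
    # Keep only the pairs that reach the support threshold; these become
    # the weighted edges of the network.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


# Toy data (invented): concepts assumed to be extracted from three documents.
docs = [
    ["wind turbine", "blade", "carbon fibre"],
    ["blade", "carbon fibre", "bicycle frame"],
    ["wind turbine", "blade"],
]

network = build_cooccurrence_network(docs, min_support=2)
for (a, b), weight in sorted(network.items()):
    print(f"{a} -- {b} (weight {weight})")
```

Each retained pair can be read as a weighted edge between two concept nodes. Raising `min_support` would favour common, general associations, while lowering it would admit rarer ones.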
Supervisor: Childs, Peter; Aurisicchio, Marco
Sponsor: China Scholarship Council
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral