Title: Comparing and compressing fuzzy concepts : methods and application
Author: Abd Rahim, Noor Hafhizah
ISNI: 0000 0004 5924 2100
Awarding Body: University of Bristol
Current Institution: University of Bristol
Date of Award: 2015
In recent years, the volume of data has risen rapidly with the development of the Internet and the World Wide Web. This phenomenon, called information overload or digital obesity, has caused a data explosion and may lead to storage problems in the future. Many forms of data, including textual data, are stored and transmitted via the Internet. Textual data, which is usually unstructured, can be processed or mined to yield useful information. To represent that information, we need to know the underlying concepts. The most suitable approach to modelling these concepts is to design an ontology. Formal Concept Analysis (FCA) is complementary to the ontology approach, and provides a hierarchical structure of the concepts. However, an ontology is a fixed structure which does not change; in contrast, data is typically updated from day to day. The focus of this research is therefore on quantifying the changes in the content and structure of these concept hierarchies. There are two types of measurement. The first is Edit Distance measurement, which measures the changes between two lattices that have identical sets of objects but disjoint sets of attributes. We pair the overlapping concepts and compute the cost of transforming each concept into its counterpart, adapting the Levenshtein distance to measure the changes. The second is Support-based Distance measurement, which quantifies the change between two lattices that have different sets of objects but the same set of attributes. We compute the support (or relative cardinality) of each concept's extension. Online shopping has become more common, and many customers, retailers, and manufacturers pay attention to product reviews. We therefore apply both measurements to an illustrative application using product review datasets. We monitor the differences between positive and negative sentiment orientations for a product over a fixed period of time using Edit Distance measurement.
Additionally, we track the changes between lattices which represent the sentiment orientation on a product in two different time periods using Support-based Distance measurement. Information overload also causes problems for FCA: lattices built from large datasets can be difficult to read and very costly to compute. Such datasets are often high-dimensional. We enhance an approach that selects the important dimensions using Principal Component Analysis (PCA) through the Singular Value Decomposition (SVD) method, so that FCA computation becomes more tractable.
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available