Title: Detection of microcosms on Twitter
Author: Inuwa-Dutse, Isa
ISNI: 0000 0004 8502 1889
Awarding Body: Edge Hill University
Current Institution: Edge Hill University
Date of Award: 2020
A network is composed of many sub-networks, or communities, with distinct and overlapping properties. Because similarity breeds attraction and interaction, a community consists of sets of nodes and edges with a stronger relationship, expressed as a function of relatedness. Network communities provide a crucial organising principle that enables a better understanding of the structure and function of complex networks. Depending on the network type, communities come in various forms, from biologically induced to technologically induced. Among technologically induced communities, social media platforms such as Twitter and Facebook enable a myriad of diverse users to remain connected, leading to a highly connected and dynamic social media ecosystem. Within this complex ecosystem, multiple types of communication happen at various levels of granularity and intensity, leading to the formation of communities. The task of identifying embedded communities within a network has been of great interest because a community is a functional unit of a network that captures local relationships among the network objects. The community detection paradigm involves prediction and quantification processes to identify and explain community structures in a network. The equivalence of network entities is established in one of two ways: (1) equivalent units have the same connection pattern to the same neighbours, or (2) equivalent units have the same or similar connection pattern to different neighbours. Accordingly, communities are formed around two primary modalities or sources of information: network structure and the features or attributes of nodes. However, existing studies mostly focus on one of these aspects, and the few studies based on a bi-modal source are limited to a shallow set of features.
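The first notion of equivalence mentioned above (units with the same connection pattern to the same neighbours) can be illustrated with a minimal sketch. It is not taken from the thesis: the function names, the toy edge list, and the choice of Jaccard overlap as a graded measure of equivalence are all assumptions made for illustration.

```python
def neighbours(edges):
    """Build an undirected adjacency map from a list of (u, v) edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def structural_similarity(adj, a, b):
    """Jaccard overlap of the neighbour sets of a and b.

    The two nodes themselves are excluded so that a direct edge
    between them does not affect the score. A score of 1.0 means
    both nodes connect to exactly the same neighbours.
    """
    na = adj.get(a, set()) - {a, b}
    nb = adj.get(b, set()) - {a, b}
    union = na | nb
    if not union:
        return 0.0
    return len(na & nb) / len(union)


# Hypothetical toy network: "a" and "b" share both neighbours,
# while "e" shares only one of them.
edges = [("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("e", "c")]
adj = neighbours(edges)

print(structural_similarity(adj, "a", "b"))  # 1.0
print(structural_similarity(adj, "a", "e"))  # 0.5
```

Under this assumed measure, nodes "a" and "b" are fully equivalent because their neighbour sets coincide, whereas "a" and "e" are only partially so.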
In the context of Twitter, although many community detection algorithms have been proposed, detecting socially cohesive communities still poses challenges for mining-related tasks. These challenges are due to (1) the flexibility of interaction in social media, which leads to a vast amount of content, both relevant and irrelevant; (2) a form of logical social dichotomy that favours content from popular users; (3) the ability to automate user accounts and remain anonymous; and (4) the eccentricity of connection on Twitter, which contributes to identifying many socially unrelated users and encourages the propagation of spurious content. To address these challenges, the thesis presents an effective detection method. The central themes of the research are the identification of genuine content and the detection of socially cohesive groups. The problem of identifying genuine content is tackled using a novel approach (the SPD strategy) designed to filter out irrelevant content, while the problem of community detection is formulated to focus on smaller groups that are homogeneous with respect to many sociodemographic, behavioural, and intrapersonal characteristics. Essentially, the research proposes a multilevel clustering technique (MCT) that leverages both structural and textual aspects to identify local communities, termed microcosms. Recognising that social media spam and fake content undermine credible research based on social media data, the thesis contributes a useful content filtering system. As a precautionary measure against compromising research outcomes with irrelevant or unrepresentative data, the SPD strategy offers crucial insights into the increasingly sophisticated techniques of spamming on Twitter. As a result, the detection of socially cohesive communities will be enhanced, thus providing a useful analysis tool and strengthening the validity of online content.
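The general idea of combining structural and textual evidence when grouping users can be sketched as follows. This is not the MCT described in the thesis, whose details are not given in this record. It is a simplified stand-in that blends Jaccard overlap of followed accounts with bag-of-words cosine similarity and merges users whose blended score passes a threshold. The names `group_users`, `alpha`, and `threshold`, and the sample data, are hypothetical.

```python
from collections import Counter
from math import sqrt


def jaccard(a, b):
    """Overlap of two sets of followed accounts (structural signal)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(text_a, text_b):
    """Cosine similarity of word-count vectors (textual signal)."""
    ca = Counter(text_a.lower().split())
    cb = Counter(text_b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def group_users(follows, posts, alpha=0.5, threshold=0.5):
    """Merge users whose blended similarity is at least `threshold`.

    `alpha` weights the structural score against the textual one.
    Both parameters are illustrative assumptions, not thesis values.
    """
    users = sorted(posts)
    parent = {u: u for u in users}

    def find(u):
        # Union-find root lookup with path halving.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for i, u in enumerate(users):
        for v in users[i + 1:]:
            score = (alpha * jaccard(follows.get(u, set()), follows.get(v, set()))
                     + (1 - alpha) * cosine(posts[u], posts[v]))
            if score >= threshold:
                parent[find(u)] = find(v)

    groups = {}
    for u in users:
        groups.setdefault(find(u), []).append(u)
    return sorted(groups.values())


# Hypothetical users, followed accounts, and post text.
follows = {"ann": {"x", "y"}, "bob": {"x", "y"}, "cat": {"z"}, "dan": {"z", "w"}}
posts = {
    "ann": "football match tonight",
    "bob": "great football match",
    "cat": "baking bread recipe",
    "dan": "sourdough bread recipe",
}

print(group_users(follows, posts))  # [['ann', 'bob'], ['cat', 'dan']]
```

A pairwise comparison like this is quadratic in the number of users, so it only serves to show how the two sources of information can be blended, not how a scalable method would be built.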
The proposed MCT provides a useful, scalable framework to identify sub-groups in a network. The experimental results from the MCT, together with evaluation on benchmark models and datasets, demonstrate the efficacy of the approach. This research contributes a new dimension to the detection of cohesive communities on Twitter. The thesis contributes to the literature by offering a clearer understanding of how low-level communities of users evolve and behave on Twitter. Moreover, identifying communities of users with strong cohesion enables well-informed recommendations that recognise structural and content similarities.
Supervisor: Korkontzelos, Ioannis; Rizzuto, Franco; Liptrott, Mark
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available