Title: Energy efficient big data networks
Author: Al-Salim, Ali Mahdi Ali
ISNI: 0000 0004 7225 7462
Awarding Body: University of Leeds
Current Institution: University of Leeds
Date of Award: 2018
The continuous growth in the number and types of big data applications creates new challenges for the green ICT community. Data scientists characterise big data along four main dimensions (the 4Vs): Volume (with direct implications for power needs), Velocity (with an impact on delay requirements), Variety (with varying CPU requirements and reduction ratios after processing) and Veracity (with cleansing and backup constraints). Each V poses challenges to the energy efficiency of the underlying networks that carry big data traffic. In this work, we investigated the impact of the big data 4Vs on energy efficient bypass IP over WDM networks. The investigation was carried out by developing Mixed Integer Linear Programming (MILP) models that capture the distinctive features of each V. In our analyses, the big data network is greened by progressively processing raw big data traffic at strategic locations, dubbed processing nodes (PNs), built into the network along the path from the big data sources to the data centres. At each PN, raw data is processed and lower-rate useful information is extracted, so the traffic volume shrinks as it moves towards the data centres and the network power consumption is reduced. For each V, we conducted an in-depth analysis and evaluated the network power saving achieved by the energy efficient big data network compared with the classical approach.
Along the volume dimension, the work dealt with optimally handling and processing an enormous number of big data chunks and extracting the knowledge they carry, so that knowledge rather than raw data is transmitted, reducing the data volume and saving power. Variety means that there are different types of big data application: CPU intensive, memory intensive, input/output (IO) intensive, CPU-memory intensive, CPU-IO intensive, and memory-IO intensive. Each type requires a different amount of processing, memory, storage, and networking resources. The processing of different varieties of big data was optimised with the goal of minimising power consumption. In the velocity dimension, we classified big data processing into two modes: expedited-data processing and relaxed-data processing. Expedited data demands a larger amount of computational resources than relaxed data in order to reduce execution time. Big data processing and transmission were optimised under these velocity requirements to reduce power consumption. Veracity specifies trustworthiness, data protection, data backup, and data cleansing constraints. We considered performing data cleansing and backup operations before big data processing, so that the data is cleansed and ready to enter the big data analytics stage. The analysis was carried out through dedicated scenarios that consider the influence of each V's characteristic parameters. For the set of network parameters we considered, our results showed that the energy efficient big data network approach can achieve network power savings of up to 52%, 47%, 60% and 58% under the volume, variety, velocity and veracity scenarios, respectively, compared with the classical approach.
Supervisor: Elmirghani, Jaafar; Taisir, El-Gorashi
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available