Title: Towards efficient big data processing in data centres
Author: Mai, Luo
ISNI: 0000 0004 7655 3808
Awarding Body: University of London
Current Institution: Imperial College London
Date of Award: 2018
Large data processing systems require a high degree of coordination and suffer from network bottlenecks caused by the large volumes of data they communicate. This motivates my PhD study, which proposes system control mechanisms that improve monitoring and coordination, and efficient communication methods that bridge applications and networks.

The first result is Chi, a new control plane for stateful streaming systems. Chi has a control loop that embeds control messages in data channels to seamlessly monitor and coordinate a streaming pipeline. This design makes it possible to monitor system and application-specific metrics in a scalable manner, and to perform complex modifications with data in flight. The behaviour of control messages is customisable, which enables a variety of control algorithms. Chi has been deployed in production systems, and exhibits high performance and scalability in test-bed experiments.

Even with effective coordination, data-intensive systems still need to remove network bottlenecks. This is important in data centres because their networks are usually over-subscribed. Hence, my study explores the idea of bridging applications and networks to accelerate communication. This idea can be realised (i) in the network core, through a middlebox platform called NetAgg that efficiently executes application-specific aggregation functions along busy network paths, and (ii) at the network edge, through a server network stack that provides powerful communication primitives and traffic management services. Test-bed experiments show that these methods can improve the communication performance of important analytics systems.

A tight integration of applications and networks, however, requires an intuitive network programming model. My study thus proposes a network programming framework named Flick. Flick provides a high-level programming language for application-specific network services. The services are compiled to dataflows and executed by a high-performance runtime. To be production-friendly, this runtime can run on commodity network elements and guarantees fair resource sharing among services. Flick has been used to develop popular network services, and its performance is demonstrated with real-world benchmarks.
Supervisor: Costa, Paolo
Sponsor: Google
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral