Title: Enhancing response time and reliability via speculative replication and redundancy
Author: Qiu, Zhan
ISNI: 0000 0004 6348 3572
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2017
Modern computer systems, such as clouds and server farms, are widely deployed to provide cost-effective, high-performance services. However, as their scale and complexity increase, achieving high reliability and consistently low response times is not a trivial task. Concurrent request replication has emerged as an effective mechanism for achieving both goals. With replication, r ≥ 1 replicas of each request are spawned simultaneously, and the results of the first k (1 ≤ k ≤ r) replicas to complete are used. Replication can thus handle unpredictable failures and delays due to exceptional conditions, unless they affect all replicas simultaneously. The main risk of replication is that it may negatively impact latency, since it introduces extra load into the system. In this thesis we aim to capture the trade-off between these two conflicting effects of replication. We first focus on the k = 1 case, i.e., the system replies with the result from whichever replica completes successfully first. We investigate how replication can be used to exploit processing-time variability to reduce response times. Next, we investigate the k = r configuration, a case known as fork-join queues, where each arriving job is split into r tasks, each of which is assigned to one of r parallel processors. The key difference from the k = 1 case is that the final result can only be delivered once all the tasks have been completed. In the last part, we generalize the analysis to systems that implement erasure encoding, where an object is split into k fragments and stored as r ≥ k encoded fragments, so that the original object can be retrieved from any k of the r fragments. The proposed models can compute the response-time distribution, which matters because mean response-time guarantees are not sufficient in deadline-driven applications. In addition, these models are able to handle fairly general inter-arrival and service times. Extensive experimental results show that replication can be very effective in keeping the response-time tail short, but these benefits depend strongly on the number of replicas and the processing-time distribution, as well as on the system load, the number of servers, and the statistical characteristics of the arrival process.
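The "first k of r replicas" rule described in the abstract can be illustrated with a small simulation. This is only a sketch under strong assumptions that are not taken from the thesis: service times are independent and exponentially distributed, and queueing is ignored entirely, so the extra load that replication introduces (the cost side of the trade-off) is not modelled. The function names `kth_of_r_latency`, `expected_kth_of_r`, and `simulate_mean` are hypothetical and chosen for illustration.

```python
import random


def kth_of_r_latency(r, k, rng, mean_service=1.0):
    """Time until k of r replicas finish (assumption: i.i.d. exponential service, no queueing)."""
    if not 1 <= k <= r:
        raise ValueError("need 1 <= k <= r")
    # Draw one service time per replica and take the k-th smallest.
    times = sorted(rng.expovariate(1.0 / mean_service) for _ in range(r))
    return times[k - 1]


def expected_kth_of_r(r, k, mean_service=1.0):
    """Mean of the k-th order statistic of r i.i.d. exponentials (standard closed form)."""
    if not 1 <= k <= r:
        raise ValueError("need 1 <= k <= r")
    return mean_service * sum(1.0 / (r - i) for i in range(k))


def simulate_mean(r, k, trials=100_000, seed=0):
    """Monte Carlo estimate of the mean latency for the k-of-r rule."""
    rng = random.Random(seed)
    return sum(kth_of_r_latency(r, k, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    r = 3
    for k in (1, 2, 3):  # k = 1: first replica wins; k = r: fork-join
        print(k, round(simulate_mean(r, k), 3), round(expected_kth_of_r(r, k), 3))
```

Under these assumptions, k = 1 shortens the mean below that of a single request, while k = r lengthens it, which mirrors the contrast the abstract draws between the replication and fork-join cases. Conclusions about loaded systems require the queueing models developed in the thesis.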
Supervisor: Harrison, Peter
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral