Title: Multi-version execution for increasing the reliability and availability of updated software
Author: Hosek, Petr
ISNI: 0000 0004 7233 0789
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2015
Software updates are an integral part of the software development and maintenance process, but they carry a high risk, as new releases often introduce new bugs and security vulnerabilities. As a consequence, many users refuse to upgrade their software and rely instead on outdated versions, which often leave them exposed to known bugs and security vulnerabilities. In this thesis we propose a novel multi-version execution approach, a variant of N-version execution, for improving the software update process. Whenever a new program update becomes available, instead of upgrading the software to the newest version, we run the new version in parallel with the old one and carefully synchronise their execution to create a more reliable multi-version application. We propose two different schemes for implementing the multi-version execution technique, via failure recovery and via transparent failover, and we describe two possible designs for implementing these schemes: Mx, focused on recovering from crashes caused by faulty software updates; and Varan, focused on running a large number of versions in parallel with minimal performance overhead. Mx uses static binary analysis, system call interposition, lightweight checkpointing and runtime state manipulation to implement a novel fault recovery mechanism, which enables the recovery of the crashing version using the code of the other, non-crashing version. We have shown how Mx can be applied successfully to recover from real crashes in several real applications. Varan combines selective binary rewriting with high-performance event streaming to significantly reduce performance overhead without compromising the size of the trusted computing base, flexibility or ease of debugging. Our experimental evaluation has demonstrated that Varan can run C10k network servers with low performance overhead, and can be used in various scenarios such as transparent failover and live sanitisation.
Supervisor: Cadar, Cristian
Sponsor: Google
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral