Title: Black-box security : measuring black-box information leakage via machine learning
Author: Cherubin, Giovanni
ISNI: 0000 0004 8500 6953
Awarding Body: Royal Holloway, University of London
Current Institution: Royal Holloway, University of London
Date of Award: 2019
Determining how much information about a secret is leaked by a system is one of the most fundamental questions in security and privacy. It underlies several fields, such as Cryptography and side-channel studies, and it has countless applications, ranging from network traffic analysis attacks to program analysis. In this manuscript, we measure the leakage (or security) of a system considered as a black box: we assume no knowledge of its internals, and we base our estimates on examples of secret inputs and their respective outputs. We refer to this practice as Black-box security; it can be used whenever the system cannot be modelled formally (e.g., because its internals are too complex). Black-box security methods have historically been based on ideas from classical Statistics, which imposed strong limitations: they required observing at least one example for each input-output combination, which scales neither to large real-world systems (e.g., they need several million examples for a 10-bit input and a 10-bit output) nor to systems with continuous output. We here introduce new principles for Black-box security estimation, which originate in Machine Learning (ML) theory. They are based on the following observation: measuring the leakage of a system is equivalent to estimating the error of an ML rule from a particular class, the universally consistent rules. This gives access to several new Black-box security estimators, which scale to large real-world systems and require fewer examples than previous methods. It also lets us import results from the ML literature: impossibility results, and the idea of using features to improve an estimator's convergence. We apply these techniques to real-world problems, such as i) measuring the privacy of user location data obfuscated with location-privacy mechanisms, and ii) measuring the security of defences against a major traffic analysis attack, Webpage Fingerprinting (WF).
Notably, the latter constitutes, to the best of our knowledge, the first security estimation method for generic WF defences, roughly 15 years after this attack's introduction. We also suggest several extensions of the framework (e.g., a continuous secret input space, and more general classes of adversaries), some of which are inspired by recent advances in ML theory (Conformal Prediction), and we envision future applications for our methods (e.g., Membership Inference attacks, and generic ML-based attacks).
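As a rough illustration of the core idea described above (not the thesis's actual estimators), the sketch below treats a toy noisy system as a black box and estimates its leakage from input-output examples alone, using a k-nearest-neighbour rule as the universally consistent learner. The system, noise level, sample sizes, and choice of k are all invented for this example.

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical black-box system: a 1-bit secret, observed as the secret
# plus Gaussian noise (system and parameters are illustrative only).
def system(secret):
    return secret + random.gauss(0, 0.5)

# Draw labelled examples (secret, output) by querying the black box.
examples = [(s, system(s)) for s in (random.randint(0, 1) for _ in range(2000))]
train, test = examples[:1000], examples[1000:]

# k-NN with k growing with n (here k ~ sqrt(n)) is a universally consistent
# rule: its error converges to the Bayes risk, i.e. the smallest error any
# adversary observing only the output can achieve.
K = 31

def knn_predict(x):
    nearest = sorted(train, key=lambda ex: abs(ex[1] - x))[:K]
    return Counter(s for s, _ in nearest).most_common(1)[0][0]

bayes_risk_est = sum(knn_predict(y) != s for s, y in test) / len(test)

# A uniform 1-bit secret is guessed wrong half the time without the output;
# the reduction in error is a black-box measure of the system's leakage.
prior_error = 0.5
leakage_est = prior_error - bayes_risk_est
print(f"estimated Bayes risk: {bayes_risk_est:.3f}, leakage: {leakage_est:.3f}")
```

Because the rule is universally consistent, the estimate needs no model of the system's internals and no example for every input-output pair, which is what lets this style of estimator scale where frequency-counting methods cannot.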
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: Black-box security ; Machine learning ; Leakage ; Side channels ; Privacy ; Traffic analysis ; Universal consistency