Title: Performance and cost optimization of multi-tenant in-memory database clusters
Author: Molka, Karsten
ISNI:       0000 0004 7229 4482
Awarding Body: Imperial College London
Current Institution: Imperial College London
Date of Award: 2017
In this thesis, we focus on in-memory database systems and combine queueing network modeling with nonlinear optimization to capture their performance characteristics and to optimize their provisioning cost. Our work is motivated by the advances in big data processing and in-memory technologies, which have shifted resource usage patterns in data centers, making both resource and workload management more challenging. One reason for this lies in the complexity of in-memory applications, whose performance is difficult to capture with existing methods. These challenges are further exacerbated by on-demand database offerings and multi-tenant configurations, both of which can lead to increased workload dynamics. New accurate and efficient performance management methods are therefore key to handling workload interference effects and improving suboptimal resource configurations in data centers. The first part of this thesis proposes a methodology that tackles the above challenges by solving the problem of routing analytical requests to a set of in-memory databases, minimizing memory exhaustion. This is particularly important because it helps avoid memory swapping under workloads with large memory footprints. As part of our methodology, we also introduce a novel in-memory database performance model based on fork-join queues, which, compared with existing approximations, is both more accurate and suitable for large-scale optimization. In the second part, we focus on the performance analysis and resource allocation challenges that in-memory database providers face when optimizing their data center environments for large multi-tenant workloads. We set out by analyzing performance interference between multiple co-located databases and propose efficient models for capturing power consumption and probabilistic measures of memory occupancy. We then combine these models with a novel optimization strategy that tackles database consolidation problems with a new hybrid genetic algorithm, and we demonstrate its effectiveness in helping cloud providers increase the energy efficiency of in-memory database clusters.
Supervisor: Casale, Giuliano ; Heinis, Thomas
Sponsor: SAP (Firm)
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral