Use this URL to cite or link to this record in EThOS:
Title: Fair, responsive scheduling of engineering workflows on computing grids
Author: Burkimsher, Andrew Marc
ISNI:       0000 0004 5356 4774
Awarding Body: University of York
Current Institution: University of York
Date of Award: 2014
Availability of Full Text:
Access from EThOS:
Access from Institution:
This thesis considers scheduling in the context of a grid computing system used in engineering design. Users desire responsiveness and fairness in the treatment of the workflows they submit. Submissions outstrip the available computing capacity during the work day, and the queue is only caught up on overnight and at weekends. The execution times observed span a wide range of 10^0 to 10^7 core-minutes. The Projected Schedule Length Ratio (P-SLR) list scheduling policy is designed to use execution time estimates and the structure of the dependency graph to improve on the existing industrial FairShare policy. P-SLR aims to minimise the worst-case SLR of jobs and keep SLR fair across the space of job execution times. P-SLR is shown to equal or surpass all other evaluated policies in responsiveness and fairness across the spectra of load and networking delays. P-SLR is also dominant where execution time estimates are within an order of magnitude of the real value. Such estimates are considered achievable using user knowledge or automated profiling. Outside this range, the Shortest Remaining Time First (SRTF) policy achieved better responsiveness and fairness. The Projected Value Remaining (PVR) policy considers the case where a curve specifying the value of a job over time is given. PVR aims to maximise total workload value, even under overload, by maximising the worst-case job value in a workload. PVR is shown to be dominant across the load and networking spectra. Where execution time estimates are coarser than the nearest power of 2, SRTF delivers higher value than PVR. SRTF is also shown to have responsiveness, fairness and value close behind P-SLR and PVR throughout the range of load and network delays considered. However, the kinds of starvation under overload incurred by SRTF would almost certainly be undesirable if implemented in a production system.
Supervisor: Bate, Iain ; Indrusiak, Leandro Soares Sponsor: Not available
Qualification Name: Thesis (D.Eng.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available