Title: A distributed stream library for Java 8
Author: Chan, Yu
ISNI: 0000 0004 5990 5111
Awarding Body: University of York
Current Institution: University of York
Date of Award: 2016
An increasingly popular application of parallel computing is Big Data, which concerns the storage and analysis of very large datasets. Many of the prominent Big Data frameworks are written in Java or other JVM-based languages. However, as a base language for Big Data systems, Java still lacks a number of important capabilities, such as processing very large datasets and distributing a computation over multiple machines. The introduction of Streams in Java 8 provided a useful programming model for data-parallel computing, but it is limited to a single JVM and does not address Big Data issues. This thesis contends that, although the Java 8 Stream framework is inadequate to support the development of Big Data applications, it is possible to extend the framework to achieve performance comparable to or exceeding that of popular Big Data frameworks. It first reviews a number of Big Data programming models and gives an overview of the Java 8 Stream API. It then proposes a set of extensions that allow Java 8 Streams to be used in Big Data systems. It also shows how the extended API can be used to implement a range of standard Big Data paradigms. Finally, it compares the performance of such programs with that of Hadoop and Spark. Although the implementation is a proof of concept, experimental results indicate that it is a lightweight and efficient framework, comparable in performance to Hadoop and Spark.
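For readers unfamiliar with the programming model the abstract refers to, the following is a minimal sketch of a data-parallel word count written against the standard Java 8 Stream API (`java.util.stream`). It is not taken from the thesis and does not show the proposed distributed extensions; the class name `WordCountSketch`, the method `countWords`, and the sample input are hypothetical and chosen only for illustration. The computation runs on the threads of a single JVM, which is the limitation the abstract describes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Illustrative only: standard Java 8 Streams, confined to one JVM.
// The names below are hypothetical and are not part of the thesis's API.
public class WordCountSketch {

    // Counts occurrences of each lower-cased word across all input lines.
    static Map<String, Long> countWords(List<String> lines) {
        return lines.parallelStream()
                // "Map" step: split every line into words.
                .flatMap(line -> Arrays.stream(line.toLowerCase().split("\\s+")))
                // Drop empty tokens produced by leading whitespace.
                .filter(word -> !word.isEmpty())
                // "Reduce" step: group identical words and count them.
                .collect(Collectors.groupingBy(Function.identity(),
                                               Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "big data on the JVM",
                "streams on the JVM");
        System.out.println(countWords(lines));
    }
}
```

The pipeline has the same shape as a MapReduce word count, but `parallelStream()` only spreads work across local threads, and the input list must fit in the memory of one process.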
Supervisor: Wellings, Andy ; Gray, Ian
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available