Use this URL to cite or link to this record in EThOS: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.667677
Title: On the performance of markup language compression
Author: Kheirkhahzadeh, Antonio
ISNI: 0000 0004 5362 1206
Awarding Body: University of West London
Current Institution: University of West London
Date of Award: 2015
Abstract:
Data compression is used in everyday life to improve computer interaction or simply to save storage. Lossless data compression refers to techniques that compress a file in such a way that the decompressed output is an exact replica of the original. These techniques, which differ from lossy data compression, are necessary and heavily used to reduce resource usage and to improve storage efficiency and transmission speed. Prior research has led to large improvements in compression performance and efficiency for general-purpose tools, which are mainly based on statistical and dictionary encoding techniques. Extensible Markup Language (XML) documents contain highly redundant data, which general-purpose compressors parse as ordinary text. Several tools for compressing XML data have been developed, resulting in improvements in compressed size and speed using different compression techniques. These tools are mostly based on algorithms that rely on variable-length encoding. XML Schema is a language used to define the structure and data types of an XML document. As a result, it provides XML compression tools with additional information that can be used to improve compression efficiency. In addition, XML Schema is used for validating XML data. For document compression, the schema needs to be generated dynamically for each XML file. This solution can be applied to improve the efficiency of XML compressors. This research investigates a dynamic approach to compressing XML data using a hybrid compression tool. This model allows XML data to be compressed using variable-length and fixed-length encoding techniques, each applied when its best use case is triggered. The aim of this research is to investigate the use of fixed-length encoding techniques to support general-purpose XML compressors. The results demonstrate that compressed size can be improved when a fixed-length encoder is used to compress most XML data types.
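The hybrid idea described in the abstract, choosing between fixed-length and variable-length encoding per case, can be illustrated with a minimal Python sketch. This is not the thesis's tool: the function names, the `<v>` element name, the 4-byte unsigned integer layout, and the "pick whichever output is smaller" rule are all assumptions made for illustration, and DEFLATE (via `zlib`) simply stands in for a general-purpose variable-length compressor.

```python
import struct
import zlib


def encode_fixed(values):
    """Pack each integer as a 4-byte big-endian unsigned value (fixed length).

    Assumes a schema has told us every value is an integer in 0..2**32-1.
    """
    return struct.pack(f">{len(values)}I", *values)


def decode_fixed(blob):
    """Inverse of encode_fixed: unpack consecutive 4-byte unsigned integers."""
    count = len(blob) // 4
    return list(struct.unpack(f">{count}I", blob))


def encode_variable(values):
    """Serialise the values as XML text and compress it with DEFLATE.

    DEFLATE uses Huffman (variable-length) codes; the <v> tag is hypothetical.
    """
    xml = "".join(f"<v>{v}</v>" for v in values)
    return zlib.compress(xml.encode("utf-8"))


def choose_encoding(values):
    """Hypothetical selection rule: keep whichever encoding is smaller."""
    fixed = encode_fixed(values)
    variable = encode_variable(values)
    if len(fixed) <= len(variable):
        return "fixed", fixed
    return "variable", variable
```

Calling `choose_encoding` on a list of integers returns a tag (`"fixed"` or `"variable"`) together with the encoded bytes. Which encoder wins depends on the data, so the sketch makes no claim about the results reported in the thesis.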
Supervisor: Not available
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID: uk.bl.ethos.667677
DOI: Not available
Keywords: Computer science, knowledge and information systems