Title: Retrieving information from compressed XML documents according to vague queries
Author: AlHamadani, Baydaa
ISNI: 0000 0004 2704 4302
Awarding Body: University of Huddersfield
Current Institution: University of Huddersfield
Date of Award: 2011
XML has become the standard way of representing and transforming data over the World Wide Web. The problem with XML documents is that they have a very high degree of redundancy, so they demand large storage capacity and high network bandwidth for transmission. Because XML is so widely used, its documents may also be queried by inexperienced users who have little background in writing XPath queries and who therefore submit vague queries. The aim of this thesis is to present the design of a system named “XML Compressing and Vague Querying (XCVQ)”, which can compress an XML document and retrieve the required information from the compressed version according to vague queries, with less decompression required. XCVQ first compresses the XML document by separating its data into containers and then compressing these containers with the GZip compressor. When a vague query is submitted, information can be retrieved from the compressed file without decompressing the whole file. To process a vague query, XCVQ decomposes it according to the relevant documents, and then performs a second decomposition stage according to the relevant containers. Only the required information is decompressed and returned to the user. To the best of our knowledge, XCVQ is the first XML compressor able to process vague queries. The average compression ratio of the designed compressor is around 78%, which may be considered competitive with other queriable XML compressors. Based on several experiments, the query processor was able to answer different kinds of vague queries, ranging from simple exact-match queries to complex ones that require retrieving information from several compressed XML documents.
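The container idea in the abstract can be illustrated with a minimal Python sketch. It is not the XCVQ implementation, whose details this record does not give. It assumes that a container holds all text values sharing one element path, that each container is gzip-compressed independently, and that a "vague" query is a possibly misspelled element name matched approximately with `difflib`. The function names, the `cutoff` parameter, the JSON serialisation, and the sample document are all hypothetical.

```python
import difflib
import gzip
import json
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample document, used only for illustration.
SAMPLE = (
    "<library>"
    "<book><title>Dune</title><author>Herbert</author></book>"
    "<book><title>Emma</title><author>Austen</author></book>"
    "</library>"
)


def build_containers(xml_text):
    """Group text values by element path, then gzip each group separately."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)

    def walk(node, path):
        here = f"{path}/{node.tag}"
        text = (node.text or "").strip()
        if text:
            groups[here].append(text)
        for child in node:
            walk(child, here)

    walk(root, "")
    # Assumption: one independently compressed container per distinct path.
    return {
        path: gzip.compress(json.dumps(values).encode("utf-8"))
        for path, values in groups.items()
    }


def vague_lookup(containers, term, cutoff=0.6):
    """Decompress only the containers whose element name resembles `term`."""
    by_tag = defaultdict(list)
    for path in containers:
        by_tag[path.rsplit("/", 1)[-1].lower()].append(path)
    # Approximate name matching stands in for vague-query processing here.
    hits = difflib.get_close_matches(term.lower(), list(by_tag), n=3, cutoff=cutoff)
    results = {}
    for tag in hits:
        for path in by_tag[tag]:
            raw = gzip.decompress(containers[path])  # other containers stay compressed
            results[path] = json.loads(raw.decode("utf-8"))
    return results
```

Calling `vague_lookup(build_containers(SAMPLE), "autor")` decompresses only the author container and leaves the title container untouched, which is the selective-decompression behaviour the abstract describes. The sketch covers a single document. The two-stage decomposition over several documents mentioned in the abstract is not modelled.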
Supervisor: Lu, Joan; Yip, Jim
Sponsor: Not available
Qualification Name: Thesis (Ph.D.)
Qualification Level: Doctoral
EThOS ID:
DOI: Not available
Keywords: ZA4050 Electronic information resources