Efficient processing of XML messages

Choi, Ryan Hyun

doi:10.26190/unsworks/14896

Access & Terms of Use

open access
Copyright: Choi, Ryan Hyun

CC BY-NC-ND 3.0

Abstract

Over the past few years, there has been an increasing number of online applications, such as news and stock-monitoring systems, that exchange and store various types of data on the Internet. However, problems arise when these applications integrate semi-structured and heterogeneous data generated independently by Web service applications. To address this issue, XML was proposed, and it has now become the standard data exchange and storage format on the Internet. In this thesis, we analyse several problems that arise when building an XML publish/subscribe system. First, we look at the problem of expressing complex user queries for the system. While XQuery has become the standard for querying XML data, its complexity has kept it from being as successful as expected. To address this problem, we propose a visual XQuery specification language. Through intuitive abstractions of XML and XQuery, our technique can generate XQuery queries for users who have little knowledge of the language. Second, we look at the problem of processing streaming XML data efficiently against a large number of branch XPath queries. To address query performance issues, we propose a technique that evaluates groups of similar branch queries simultaneously. Moreover, while join operations are being performed, our technique shares intermediate join results as much as possible amongst the queries in the same group. Furthermore, we propose a technique to evaluate queries that contain multiple inter-document, value-based join operations. Experiments show that reducing the overall number of join operations improves query performance significantly. Third, we propose an XML keyword search framework and algorithm that enable users to store and search useful messages received from multiple data sources. Our framework is small in size, and runs existing keyword search algorithms faster.
In addition, we propose a labelling scheme that compactly represents XML data and efficiently supports all operations required by keyword search algorithms. Lastly, we present compressed inverted lists, based on our labelling scheme, that run search operations even faster and support updates. Extensive experiments show the effectiveness of our techniques.

Persistent link to this record

http://hdl.handle.net/1959.4/44937

DOI

https://doi.org/10.26190/unsworks/14896

Author(s)

Choi, Ryan Hyun

Publication Year

2010

Resource Type

Thesis

Degree Type

PhD Doctorate

UNSW Faculty

Files

Choi-014954990.pdf

65.12 MB

Adobe Portable Document Format
