Publication:
Efficient processing of XML messages

dc.contributor.author Choi, Ryan Hyun en_US
dc.date.accessioned 2022-03-20T14:20:55Z
dc.date.available 2022-03-20T14:20:55Z
dc.date.issued 2010 en_US
dc.description.abstract Over the past few years, there has been an increasing number of online applications, such as news and stock-monitoring systems, that exchange and store various types of data on the Internet. However, problems arise when these applications integrate semi-structured and heterogeneous data generated independently by Web service applications. To address this issue, XML was proposed, and it has now become the standard data exchange and storage format on the Internet. In this thesis, we analyse several problems that arise when building an XML publish/subscribe system. First, we look at the problem of expressing complex user queries for the system. While XQuery has become the standard for querying XML data, its complexity has made it less successful than expected. To address this problem, we propose a visual XQuery specification language. Through intuitive abstractions of XML and XQuery, our technique can generate XQuery queries for users who have little knowledge of the language. Second, we look at the problem of processing streaming XML data efficiently against a large number of branch XPath queries. To address query performance issues, we propose a technique that evaluates groups of similar branch queries simultaneously. Moreover, while join operations are being performed, our technique shares intermediate join results as much as possible amongst the queries in the same group. Furthermore, we also propose a technique to evaluate queries that contain multiple inter-document, value-based join operations. Experiments show that reducing the overall number of join operations improves query performance significantly. Third, we propose an XML keyword search framework and algorithm that enable users to store and search useful messages received from multiple data sources. Our framework is small in size, and runs existing keyword search algorithms faster. 
In addition, we propose a labelling scheme that compactly represents XML data and efficiently supports all operations required by keyword search algorithms. Lastly, we present compressed inverted lists, based on our labelling scheme, that run search operations even faster and support updates. Extensive experiments show the effectiveness of our technique. en_US
dc.identifier.uri http://hdl.handle.net/1959.4/44937
dc.language English
dc.language.iso EN en_US
dc.publisher UNSW, Sydney en_US
dc.rights CC BY-NC-ND 3.0 en_US
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/3.0/au/ en_US
dc.source Thesis Digitisation Program en_US
dc.subject.other XML query interface en_US
dc.subject.other XML keyword search en_US
dc.title Efficient processing of XML messages en_US
dc.type Thesis en_US
dcterms.accessRights open access
dcterms.rightsHolder Choi, Ryan Hyun
dspace.entity.type Publication en_US
unsw.accessRights.uri https://purl.org/coar/access_right/c_abf2
unsw.identifier.doi https://doi.org/10.26190/unsworks/14896
unsw.relation.faculty Engineering
unsw.relation.originalPublicationAffiliation Choi, Ryan Hyun, Computer Science & Engineering, Faculty of Engineering, UNSW en_US
unsw.relation.school School of Computer Science and Engineering
unsw.thesis.degreetype PhD Doctorate en_US
Files
Original bundle
Name:
Choi-014954990.pdf
Size:
65.12 MB
Format:
application/pdf