Efficient algorithms for graph all-matching and containment search

Download files
Access & Terms of Use
open access
Copyright: Zhu, Gaoping
Altmetric
Abstract
Recent advances in database research have shown the potential of graph in modelling complicated data. Graph data have pervaded many modern applications including bio-informatics, chem-informatics, semantic webs, software engineering, etc. Efficient algorithms are strongly demanded in these applications to manage and analyze graph data. Thesis studies three fundamental problems in managing and analyzing graph data by proposing efficient query processing algorithms. These problems include (1) similarity all-matching (2) supergraph containment search and (3) subgraph containment search. We first study the problem of similarity all-matching to retrieve all similarity matches of a given query graph in a data graph with the number of missing edges bounded by a given similarity threshold. The only existing work proposes an enumerate-search paradigm. Our first approach to similarity all-matching proposes a novel query decomposition based hierarchical framework to effectively reduce the number of intermediate matches and share the intermediate matches produced in the enumerate-search paradigm. Our second approach to similarity all-matching proposes a novel tree based spanning search paradigm to effectively reduce the number of enumerated seeds and the corresponding searching processes. Our second problem, supergraph containment search, is to retrieve all the data graphs contained by a given query graph from a graph database. Previous works follow the filtering-verification framework to first filter most negative answers and then conduct verification on a small number of surviving candidates. Observing that the framework is significantly challenged by the redundant computation between the filtering phase and the verification phase, this thesis proposes a novel computation-sharing framework to maximize the computation sharing between the two phases. Our last problem, subgraph containment search, is to retrieve all data graphs containing a given query graph from a graph database. The existing indexing techniques not only fail to address the redundant computation in the filtering phase but also are inefficient to be directly applied on some important variant problems of subgraph containment search. Motivated by these, this thesis proposes an efficient query processing algorithms to reduce the filtering cost in subgraph containment search and then develops efficient query processing algorithms for two important variations of subgraph containment search.
Persistent link to this record
Link to Publisher Version
Link to Open Access Version
Additional Link
Author(s)
Zhu, Gaoping
Supervisor(s)
Lin, Xuemin
Creator(s)
Editor(s)
Translator(s)
Curator(s)
Designer(s)
Arranger(s)
Composer(s)
Recordist(s)
Conference Proceedings Editor(s)
Other Contributor(s)
Corporate/Industry Contributor(s)
Publication Year
2012
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty
Files
download whole.pdf 1.14 MB Adobe Portable Document Format
Related dataset(s)