Efficient processing of advanced structural queries in graph databases

Zhao, Xiang

doi:10.26190/unsworks/16609

Efficient processing of advanced structural queries in graph databases

Download files

Access & Terms of Use

open access
Copyright: Zhao, Xiang

CC BY-NC-ND 3.0

Abstract

Recent decades witnessed a rapid proliferation of graph data, such as chemical structures and business processes. Structural queries are frequently issued in these domains, and hence, attract extensive attention. Due to data inconsistency and noise, a recent trend is to study similarity queries. Moreover, graphs may possess multiple attributes on vertices, imposing additional challenges. Hence, indexing and processing methods for answering advanced structural queries are strongly in demand. The thesis studies three types of emerging structural queries in large-scale graph databases, including graph similarity join, graph similarity search and graph substructure selection. Graph similarity join retrieves all similarity pairs from two graph collections such that their similarity satisfies a given threshold. Particularly, we incorporate graph edit distance as similarity measure. Inspired by the q-gram idea for string similarity queries, we introduce path-based q-grams on graphs, and establish a count filtering lower bound for generating candidates. An efficient algorithm is proposed to exploit both matching and mismatching features. We also present label and degree-based matching conditions to strengthen the filtering techniques. Leveraging offline index, we investigate efficient graph similarity search. Existing solutions employ fixed-size overlapping substructures, and thus, are susceptible to large vertex degrees and thresholds. Disparately, we present a partition-based approach by dividing graphs into variable-size non-overlapping partitions. The edit distance constraint is converted to a containment constraint for candidate generation. A dynamic pruning technique and an improved verification algorithm are also developed. In addition, a cost-aware graph partitioning technique is conceived to optimize the index. Equipping structural queries with rich semantics, substructure selection over multi-attributed graphs finds all subgraph isomorphic mappings such that each pair of matching vertices satisfy a set of selection conditions, each against an equality, range, or set containment operator on a vertex attribute. Existing techniques for single-labeled graphs are developed under the assumption of identical label matching, and thus, cannot handle the general cases. To this end, we propose a two-tier index via judiciously materializing feature mappings. We also devise a scalable indexing and query processing method exploiting execution plans with minimum cost.

Persistent link to this record

http://hdl.handle.net/1959.4/53213

DOI

https://doi.org/10.26190/unsworks/16609

Author(s)

Zhao, Xiang

Publication Year

2014

Resource Type

Thesis

Degree Type

PhD Doctorate

UNSW Faculty

Files

public copy.pdf

1.19 MB

Adobe Portable Document Format

View full record Show statistics

Library

Efficient processing of advanced structural queries in graph databases

Access & Terms of Use

Altmetric

Abstract

Persistent link to this record

DOI

Link to Publisher Version

Link to Open Access Version

Additional Link

Author(s)

Supervisor(s)

Creator(s)

Editor(s)

Translator(s)

Curator(s)

Designer(s)

Arranger(s)

Composer(s)

Recordist(s)

Conference Proceedings Editor(s)

Other Contributor(s)

Corporate/Industry Contributor(s)

Publication Year

Resource Type

Degree Type

UNSW Faculty

Files

Related dataset(s)