Efficient Data Modelling, Indexing and Processing in Large Datasets

Download files
Access & Terms of Use
open access
Copyright: Ghafouri, Maryam
Altmetric
Abstract
Many devices and applications in social networks and on-line services are producing, storing, and using description, location, and occurrence time of objects. There are various systems to study, model, index, and process a huge amount of data. In this thesis, we study graphs and publish/subscribe systems. Firstly, we study the problem of continuously updating top-k messages with the highest ranks, each of which contains all the requested keywords when the rank of a message calculates based on freshness and distance to query’s location. Since new incoming messages are arriving all the time and the score of existing top-k results are decreasing over time, providing the most recent information needs continuously computing and maintaining the best results. We propose an efficient indexing and matching method using keywords, location, and the most recent top-k results of queries. Secondly, we study the problem of the decomposition of (k,s)-core. As both the user engagement of nodes and the strength of relationships are important, the (k, s)-core model is proposed in the literature to discover strong communities. Nevertheless, the decomposition algorithm regarding (k,s)-core is not yet investigated. We propose (k,s)-core algorithms to decompose a graph into its hierarchical structures considering both user engagement and tie strength. We first present the basic (k,s)-core decomposition methods. Then, we propose the advanced algorithms DES and DEK which index the support of edges to enable higher-level cost-sharing in the peeling process. In addition, effective pruning strategies are applied to DES/DEK to further enhance performance. Moreover, we build a novel index based on the decomposition result and investigate efficient (k,s)-core query algorithm based on our index. Finally, we develop efficient algorithm for maintaining the (k, s)-core index of the dynamic graph where vertices and edges are inserted and deleted. The algorithm, uses pruning strategies by exploiting the lower and upper bounds of the core number. We define a new Smax core and develop an efficient method for updating (k,s) numbers of nodes.
Persistent link to this record
Link to Publisher Version
Link to Open Access Version
Additional Link
Author(s)
Ghafouri, Maryam
Supervisor(s)
Lin, Xuemin
Creator(s)
Editor(s)
Translator(s)
Curator(s)
Designer(s)
Arranger(s)
Composer(s)
Recordist(s)
Conference Proceedings Editor(s)
Other Contributor(s)
Corporate/Industry Contributor(s)
Publication Year
2021
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty
Files
download public version.pdf 5.81 MB Adobe Portable Document Format
Related dataset(s)