Learning node embedding from graph structure and node attributes

Download files
Access & Terms of Use
open access
Embargoed until 2023-10-28
Copyright: Hao, Yu
Learning node embedding for graphs has been proved essential for a wide range of applications, from recommendation to community search. However, most existing approaches mainly focus on either the graph structure information or the attribute (feature) information, and thus cannot make full use of the information of the graph data. In this thesis, we study four typical problems on node embedding in graphs, and utilize both structural and attribute information of the original graph to learn representative node embeddings to solve various tasks. Firstly, we investigate the problem of inductive link prediction on attributed graphs. Many real-world applications require inductive prediction for new nodes having only attribute information. It is more challenging since the new nodes do not have structure information and cannot be seen during the model training. To solve this problem, we propose a model called DEAL, which is versatile in the sense that it works for both inductive and transductive link prediction. Extensive experiments on several benchmark datasets show that our proposed model significantly outperforms existing inductive link prediction methods, and also outperforms the state-of-the-art methods on transductive link prediction. Secondly, we focus on the data imbalance problem of link prediction. Positive Unlabeled (PU) learning is a good choice to address this issue since only positive links are available. Unfortunately, the unknown class prior and data imbalance of graphs impede the use of PU learning. To deal with these issues, we propose a novel model-agnostic PU learning algorithm for Graph Neural Network (GNN)-based link prediction via PU-AUC optimization. The proposed method is free of class prior estimation and able to handle the data imbalance. Moreover, we propose an accelerated method to reduce the operational complexity of PU-AUC optimization from quadratic to linear. Extensive experiments back up our theoretical analysis and validate that the proposed method is capable of boosting the performance of the state-of-the-art GNN-based link prediction models. Thirdly, we focus on generating representative node embedding on dynamic graphs. Most of the existing graph embedding methods assume that the graph is static, where nodes and edges are fixed. However, in practice, real-world graphs (e.g., social networks, citation networks, and transportation networks) are often dynamic, meaning that both the topology and node attributes of the graph evolve over time. As a result, these methods ignore the evolution history and cannot perform well in dynamic graphs. Last but not least, we study the graph keyword search problem which aims to find subtrees or subgraphs containing all query keywords ranked according to some criteria. Existing studies all assume that the graphs have complete information. However, real-world graphs may usually contain some missing information (such as edges or keywords), thus making the problem much more challenging. To solve this, we propose a novel model named KS-GNN based on the graph neural network and the auto-encoder. By considering the latent relationships and the frequency of different keywords, KS-GNN alleviates the effect of missing information and is able to learn low-dimensional representative node embeddings that preserve both graph structure and keyword features. The experiments on four real-world datasets show that our model consistently achieves better performance than state-of-the-art baseline methods in graphs having missing information.
Persistent link to this record
Link to Publisher Version
Link to Open Access Version
Additional Link
Hao, Yu
Cao, Xin
Lin, Xuemin
Conference Proceedings Editor(s)
Other Contributor(s)
Corporate/Industry Contributor(s)
Publication Year
Resource Type
Degree Type
PhD Doctorate
UNSW Faculty
download public version.pdf 3.35 MB Adobe Portable Document Format
Related dataset(s)