Engineering

Publication Search Results

Now showing 1 - 1 of 1
  • (2022) Zhao, Gengda
    Thesis
    Bipartite graphs are extensively used to model relationships between two different types of entities. In many real-world bipartite graphs, relationships are naturally uncertain due to various reasons such as data noise, measurement error and imprecision of data, leading to uncertain bipartite graphs. In this thesis, we propose the (\alpha,\beta,\eta)-core model, which is the first cohesive subgraph model on uncertain bipartite graphs. To capture the uncertainty of relationships/edges, \eta-degree is adopted to measure the vertex engagement level, which is the largest integer k such that the probability of a vertex having at least k neighbors is not less than \eta. Given degree constraints \alpha and \beta, and a probability threshold \eta, the (\alpha,\beta,\eta)-core requires that each vertex on the upper or lower level have \eta-degree no less than \alpha or \beta, respectively. An (\alpha,\beta,\eta)-core can be derived by iteratively removing a vertex with \eta-degree below the degree constraint and updating the \eta-degrees of its neighbors. This incurs prohibitively high cost due to the \eta-degree computation and updating, and it is not scalable to large bipartite graphs. This motivates us to develop index-based approaches. We propose a basic full index that stores (\alpha,\beta,\eta)-core for all possible \alpha, \beta, and \eta combinations, thus supporting optimal retrieval of the vertices in any (\alpha,\beta,\eta)-core. Due to its long construction time and high space complexity, we further propose a probability-aware index to achieve a balance between time and space costs. To efficiently build the probability-aware index, we design a bottom-up index construction algorithm and a top-down index construction algorithm. Extensive experiments are conducted on real-world datasets with generated edge probabilities under different distributions, which show that (1) (\alpha,\beta,\eta)-core is an effective model; (2) index construction and query processing are significantly sped up by the proposed techniques.