Publication Search Results

Now showing 1 - 6 of 6
  • (2021) He, Yizhang
    A bipartite graph is a graph with two layers in which vertices in the same layer are not connected; it is widely used to model relationships between two types of entities. Examples of bipartite graphs include author-paper networks, customer-product networks, and ecological networks (e.g., the predator-prey network and the plant-animal network). In bipartite graphs, cohesive subgraph computation is a fundamental problem that aims to find closely connected subgraphs, with applications in group recommendation, network visualization, and fraud detection. In this thesis, we propose a novel cohesive subgraph model called the τ-strengthened (α, β)-core (denoted (α, β)τ-core), which is the first to consider both tie strength and vertex engagement on bipartite graphs. An edge is a strong tie if it is contained in at least τ butterflies (2×2 bicliques). Given a strength level τ, the (α, β)τ-core requires each vertex on the upper or lower level to have at least α or β strong ties, respectively. To retrieve the vertices of an (α, β)τ-core optimally, we construct an index Iα,β,τ that stores all (α, β)τ-cores. Effective optimization techniques are proposed to improve index construction. To make our idea practical on large graphs, we propose the 2D-indexes Iα,β, Iβ,τ, and Iα,τ, which selectively store the vertices of the (α, β)τ-core for some α, β, and τ. The 2D-indexes are more space-efficient and require less construction time, and each of them can support (α, β)τ-core queries. As query efficiency depends on the input parameters and the choice of 2D-index, we propose a learning-based hybrid computation paradigm that trains a feed-forward neural network to predict the choice of 2D-index that minimizes the query time. Extensive experiments show that (1) the (α, β)τ-core is an effective model capturing unique and important cohesive subgraphs; and (2) the proposed techniques significantly improve the efficiency of index construction and query processing.

  • (2021) Zhang, Han
    Relation prediction is a fundamental task in network analysis that aims to predict the relationship between two nodes. It thus differs from the traditional link prediction problem, which predicts whether a link exists between a pair of nodes and can be viewed as a binary classification task. In a heterogeneous information network (HIN), which contains multiple types of nodes and multiple relations between nodes, the relation prediction task is more challenging. In addition, an HIN might have missing relation types on some edges and missing node types on some nodes, which makes the problem even harder. In this work, we propose RPGNN, a novel relation prediction model based on graph neural networks (GNNs) and multi-task learning, to solve this problem. Existing GNN models for HIN representation learning usually focus on node classification or clustering tasks. They require the type information of all edges and nodes and always learn a weight matrix for each type, thus requiring a large number of learnable parameters on HINs with a rich schema. In contrast, our model directly encodes and learns relations in HINs and avoids the need for type information during message passing in the GNN. Hence, our model is more robust to missing types in the relation prediction task on HINs. Experiments on real HINs show that our model consistently achieves better performance than several state-of-the-art HIN representation learning methods.

  • (2022) Chen, Yuhui
    Ride-sourcing services are rapidly spreading around the world. A ride-sourcing service is a point-to-point, on-demand ride service operated by a company that organizes and coordinates drivers who use their vehicles to provide passengers with rides. How ride-sourcing services and public transport interact with each other, and the system-wide impacts that result, has not received sufficient attention. This thesis extends the literature by proposing multi-class, multi-modal traffic assignment models to optimize the transport system in the presence of ride-sourcing and public transport services. The first part of the thesis develops a stylized model on a simple network with a single origin-destination (OD) pair in order to analytically examine the mode choice behavior of travelers and the operation strategies of a public transport operator and a ride-sourcing operator. In such a multi-modal system, users may travel by bus, train, or ride-sourcing service. In particular, we develop a tractable bi-level model. The lower level quantifies the user equilibrium travel choices, where the travel choice equilibrium can be formulated as a variational inequality problem. The upper level optimizes the operation strategies of the public transport operator, which aims to minimize total system cost, and of the ride-sourcing operator, which aims to maximize its profit. The existence and uniqueness of the multi-modal travel choice equilibrium are also analyzed. How the operation decision variables might affect users' mode choices and system performance is investigated both analytically and numerically. The second part of the thesis extends the stylized model to a general network model, which also includes solo driving and multiple OD pairs, to depict a more realistic problem setting. The general network model is applied to a case study in the context of Sydney. The existence and uniqueness of the equilibrium are also investigated for the general network model.
    The Frank-Wolfe method combined with diagonalization is applied to generate numerical solutions, which illustrate the analytical observations and provide further understanding. The results show that the total system cost can be reduced while the profit of the ride-sourcing company can be increased under appropriate operating strategies of the public transport operator and the ride-sourcing operator.

  • (2022) Al-Farsi, Mohammed Said Saleh
    Multijunction solar cells based on silicon are predicted to achieve an efficiency of 40-45% for a top cell with a band gap of 1.6-1.9 eV. However, there are currently no known materials with suitable band gaps that are able to deliver high efficiencies. Two classes of materials that have been proposed for top cells are alloys of CuGaSe2 and alloyed oxide perovskites. CuGaSe2 has a suitable band gap (1.68 eV) for a top cell on silicon, but the maximum efficiency achieved is only 11%, while that of the closely related CuInGaSe2 (band gap 1.14 eV) is 23.35%. The low efficiency of CuGaSe2 has been attributed to anti-site defects. Therefore, suppressing the formation of these defects is critical to achieving higher efficiencies. On the other hand, most oxide perovskites have band gaps that are too high (>2 eV) to be used as top cells on silicon, hence strategies such as alloying are required to lower their band gaps. In this work, the effects of alloying CuGaSe2 with Ag, Na, K, Al, In, La and S were investigated using Density Functional Theory (DFT) calculations. The band gaps of the alloyed compounds and the formation energies of anti-site defects were calculated to find alloying elements that can increase the defect formation energy while maintaining the band gap. CuGaSe2 alloyed with Al at 50 at% showed the highest increase (compared to unalloyed CuGaSe2) in the defect formation energy (by ~0.20 eV), followed by Na (~0.15 eV) and S (~0.10 eV), both at 50 at%. However, the band gap of the Al alloy (~2.15 eV) is too high for a top cell, while those of the Na (~1.95 eV) and S (~1.91 eV) alloys are slightly above the upper limit. Thus, alloying with these elements is not an ideal route towards significantly increasing the formation energy of anti-site defects while maintaining the band gap of CuGaSe2. However, some of the factors that influence the defect formation energy are identified, potentially leading to design rules for future work.
    Defect formation energies were found to be higher in structures with more positively charged Ga and more negatively charged Se atoms. Analysis of bond lengths revealed a positive correlation between shorter Ga and Se bonds and higher defect formation energies. Band gaps of various alloyed oxide perovskites were also calculated using DFT. BiFeO3 was alloyed with Y and Sb, LaFeO3 with Cr and Sb, and YFeO3 with Bi and Sb. YFeO3 alloyed with Sb at 50 at% was found to have a band gap of 1.4-2.1 eV (depending on the basis set used), which is in the range for a top cell.

  • (2022) Wang, Zishan
    Eye movement detection, which separates eye positions into distinct oculomotor events such as saccades and fixations, has been associated with cognitive load classification, the process of estimating the mental effort involved in a certain task. However, three questions remain to be answered for wearable applications: (i) will algorithms originally developed for fixation and saccade detection from gaze positions give similar accuracy from pupil center positions, particularly when the head is not fixed?; (ii) how much improvement in cognitive load classification performance can be achieved by separating fixations and saccades?; and (iii) will fixation- and saccade-related measures be affected by differing cognitive load processes from diverse task designs? Regarding the first research question, three representative saccade detection algorithms are applied to both pupil center positions and gaze positions collected with and without head movement, and their performance is evaluated against a stimulus-based ground truth under different measures. Results from a novel dataset recorded using wearable infrared cameras indicate that saccade/fixation detection using pupil center positions generally provides better performance than using gaze positions, with an 8.6% improvement in Cohen’s Kappa. Regarding the second and third research questions, statistical tests of several pupil-related measures extracted from all samples, fixation-only samples and saccade-only samples are evaluated for varied cognitive load levels. They indicate that pupil-related measures from fixation-only samples can be used as a substitute for those from all samples in distinguishing different levels of cognitive load.
    Statistical test results for several fixation- and saccade-related measures across two task types show that the ability of such measures to distinguish cognitive load levels, together with their trends across those levels, differs under different cognitive load processes. Furthermore, when cognitive load classification systems trained with and without fixation- and saccade-related features are compared, including these features improves the accuracy of a random forest classifier by 14.0%-23.4% across the two task types. In general, this thesis contributes to fixation- and saccade-based cognitive load classification research by demonstrating that pupil center positions can be used as an alternative to gaze positions for fixation and saccade detection in a wearable context and, moreover, that separating fixations and saccades can improve cognitive load classification performance.

  • (2022) Zhao, Gengda
    Bipartite graphs are extensively used to model relationships between two different types of entities. In many real-world bipartite graphs, relationships are naturally uncertain due to various reasons such as data noise, measurement error and imprecision of data, leading to uncertain bipartite graphs. In this thesis, we propose the (α, β, η)-core model, which is the first cohesive subgraph model on uncertain bipartite graphs. To capture the uncertainty of relationships/edges, η-degree is adopted to measure the vertex engagement level, which is the largest integer k such that the probability of a vertex having at least k neighbors is not less than η. Given degree constraints α and β, and a probability threshold η, the (α, β, η)-core requires that each vertex on the upper or lower level have η-degree no less than α or β, respectively. An (α, β, η)-core can be derived by iteratively removing a vertex with η-degree below the degree constraint and updating the η-degrees of its neighbors. This incurs prohibitively high cost due to the η-degree computation and updating, and it is not scalable to large bipartite graphs. This motivates us to develop index-based approaches. We propose a basic full index that stores (α, β, η)-core for all possible α, β, and η combinations, thus supporting optimal retrieval of the vertices in any (α, β, η)-core. Due to its long construction time and high space complexity, we further propose a probability-aware index to achieve a balance between time and space costs. To efficiently build the probability-aware index, we design a bottom-up index construction algorithm and a top-down index construction algorithm.
    Extensive experiments are conducted on real-world datasets with generated edge probabilities under different distributions, which show that (1) (α, β, η)-core is an effective model; (2) index construction and query processing are significantly sped up by the proposed techniques.
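
The η-degree definition and the iterative removal procedure described in the last abstract above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: it assumes that edges exist independently with the given probabilities (the abstract does not state this), and the names eta_degree, uncertain_core and example_edges are hypothetical. It shows the plain peeling baseline only, not the index-based approaches.

```python
from typing import Dict, List, Set, Tuple

# ("U", name) marks an upper-layer vertex, ("L", name) a lower-layer vertex.
Vertex = Tuple[str, str]


def eta_degree(probs: List[float], eta: float) -> int:
    """Largest k such that P(at least k incident edges exist) >= eta.

    Assumption: edges exist independently with the given probabilities.
    """
    # dist[j] = probability that exactly j of the edges seen so far exist.
    dist = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # this edge is absent
            nxt[j + 1] += q * p       # this edge is present
        dist = nxt
    # Scan k from the largest value down, accumulating P(degree >= k).
    tail = 0.0
    for k in range(len(dist) - 1, -1, -1):
        tail += dist[k]
        if tail + 1e-12 >= eta:       # small tolerance for rounding error
            return k
    return 0


def uncertain_core(
    edges: Dict[Tuple[str, str], float], alpha: int, beta: int, eta: float
) -> Tuple[Set[str], Set[str]]:
    """Peel vertices whose eta-degree is below alpha (upper) or beta (lower)."""
    adj: Dict[Vertex, Dict[Vertex, float]] = {}
    for (u, v), p in edges.items():
        adj.setdefault(("U", u), {})[("L", v)] = p
        adj.setdefault(("L", v), {})[("U", u)] = p
    need = {"U": alpha, "L": beta}

    def violates(x: Vertex) -> bool:
        return eta_degree(list(adj[x].values()), eta) < need[x[0]]

    stack = [x for x in adj if violates(x)]
    removed: Set[Vertex] = set()
    while stack:
        x = stack.pop()
        if x in removed:
            continue
        removed.add(x)
        # Detach x and re-check each neighbor, whose eta-degree may have dropped.
        for y in list(adj[x]):
            del adj[y][x]
            if y not in removed and violates(y):
                stack.append(y)
        adj[x].clear()
    upper = {name for (side, name) in adj if side == "U" and (side, name) not in removed}
    lower = {name for (side, name) in adj if side == "L" and (side, name) not in removed}
    return upper, lower


# Hypothetical toy graph: a 2x2 biclique with reliable edges plus one weak pendant edge.
example_edges = {
    ("u1", "v1"): 0.9, ("u1", "v2"): 0.9,
    ("u2", "v1"): 0.9, ("u2", "v2"): 0.9,
    ("u3", "v1"): 0.5,
}
```

Calling uncertain_core(example_edges, 2, 2, 0.8) peels u3 and keeps the biclique, since two edges of probability 0.9 are both present with probability 0.81. Each re-check recomputes the full degree distribution, which hints at why the abstract describes this procedure as costly on large graphs.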