Knowledge-Guided Deep Reinforcement Learning for Interactive Recommendation

Copyright: Chen, Xiaocong
Interactive recommendation aims to accommodate and learn from dynamic interactions between users and items to achieve responsiveness and accuracy in recommendation systems. Reinforcement learning is inherently well suited to dynamic, interactive environments and has therefore attracted increasing attention in interactive recommendation research. However, most existing works learn stationary user interests, neglecting that these interests are dynamic in nature.

The dissertation begins with an introduction to recommendation systems and their applications. This is followed by a detailed literature review covering three related areas: sequence-aware recommendation, interactive recommendation, and knowledge-aware recommendation systems. The dissertation also reviews reinforcement-learning-based applications in recommendation systems and discusses their advantages and shortcomings. It then presents a general problem statement for interactive recommendation systems and identifies the challenges to be tackled: modeling users' dynamic interests, the computational cost of reinforcement learning optimization, and performance degradation in reinforcement-learning-based recommendation systems.

To address these challenges, we propose a set of techniques and models for improved interactive recommendation via reinforcement learning. First, we propose a new model for learning a distributed interaction embedding, which captures users' dynamic interests in a compact and expressive manner. Second, inspired by recent advances in Graph Convolutional Networks and knowledge-aware recommendation, we design a Knowledge-Guided deep Reinforcement learning (KGRL) model that harnesses the advantages of both reinforcement learning and knowledge graphs for interactive recommendation. The model is implemented within the actor-critic network framework: it maintains a local knowledge network to guide the decision-making process during training and employs an attention mechanism to discover long-term semantics between items. Third, to reduce the computational cost of reinforcement learning, we design an enhanced optimization strategy that narrows the space of update steps and tunes the reward function. We have conducted comprehensive experiments in a simulated online environment for the three proposed methods, which show consistently improved performance of our models against baselines and state-of-the-art methods in the literature. Finally, the dissertation discusses future work and potential further improvements for interactive recommendation systems.
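To make the idea of attention-guided action selection concrete, below is a minimal, hypothetical sketch of how an actor might score candidate items by attending over their knowledge embeddings. All names (`attention_weights`, `actor_select`) and the greedy selection rule are illustrative assumptions for exposition; the thesis's actual KGRL model uses learned actor and critic networks over a knowledge graph, which this toy example does not implement.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def attention_weights(state, item_embs):
    """Attend over candidate items' (hypothetical) knowledge embeddings,
    using the current user-state embedding as the query."""
    return softmax([dot(state, e) for e in item_embs])

def actor_select(state, item_embs):
    """Toy greedy actor: recommend the item with the highest attention
    weight. A real actor-critic policy would instead sample from a
    learned distribution and be trained against a critic's value estimate."""
    weights = attention_weights(state, item_embs)
    best = max(range(len(weights)), key=lambda i: weights[i])
    return best, weights
```

For example, with a user-state embedding `[1.0, 0.0]` and candidate item embeddings `[[0.9, 0.1], [0.1, 0.9], [-0.5, 0.2]]`, the first item receives the largest attention weight and is recommended. The point of the sketch is only the shape of the computation: knowledge-derived item representations feed the policy's scoring, rather than the policy seeing raw item IDs.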
Chen, Xiaocong
Liu, Wei
Zhang, Wenjie
Degree Type: Masters Thesis
UNSW Faculty