Sentence level relation extraction via relation embedding

Download files
Access & Terms of Use
open access
Copyright: Huang, Haojie
Altmetric
Abstract
Relation extraction is a task of information extraction that extracts semantic relations from text, which usually occur between two named entities. It is a crucial step for converting unstructured text into structured data that forms a knowledge base, so that it may be used to build systems with special purposes such as business decision making and legal case-based reasoning. Relation extraction in sentence-level is the most common type, because relationships can be usually discovered within single sentences. One obvious example is the relationship between the subject and the object. As it has been studied for years, there are various methods for relation extraction such as feature based methods, distant supervision and recurrent neural networks. However, the following problems have been found in these approaches. (i) These methods require large amounts of human labelled data to train the model in order to get high accuracy. (ii) These methods are hard to be applied in real applications, especially in specialised domains where experts are required for both labelling and validating the data. In this thesis, we address these problems in two aspects: academic research and application development. In terms of academic research, we propose models that can be trained with less amount of labelled training data. The first approach trains the relation feature embedding, then it uses the feature embeddings for obtaining relation embeddings. To minimise the effect of designing handcraft features, the second approach adopts RNNs to automatically learn features from the text. In these methods, relation embeddings are reduced to a smaller vector space, and the relations with similar meanings form clusters. Therefore, the model can be trained with a smaller number of labelled data. The last approach adopts seq2seq regularisation, which can improve the accuracy of the relation extraction models. In terms of application development, we construct a prototype web service for searching semantic triples using relations extracted by third-party extraction tools. In the last chapter, we run all our proposed models on real-world legal documents. We also build a web application for extracting relations in legal text based on the trained models, which can help lawyers investigate the key information in legal cases more quickly. We believe that the idea of relation embeddings can be applied in domains that require relation extraction but with limited labelled data.
Persistent link to this record
Link to Publisher Version
Link to Open Access Version
Additional Link
Author(s)
Huang, Haojie
Supervisor(s)
Creator(s)
Editor(s)
Translator(s)
Curator(s)
Designer(s)
Arranger(s)
Composer(s)
Recordist(s)
Conference Proceedings Editor(s)
Other Contributor(s)
Corporate/Industry Contributor(s)
Publication Year
2021
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty
Files
download public version.pdf 5.81 MB Adobe Portable Document Format
Related dataset(s)