Publication:
Sentence level relation extraction via relation embedding

dc.contributor.author Huang, Haojie en_US
dc.date.accessioned 2022-03-23T15:45:24Z
dc.date.available 2022-03-23T15:45:24Z
dc.date.issued 2021 en_US
dc.description.abstract Relation extraction is an information extraction task that extracts semantic relations from text, usually between two named entities. It is a crucial step in converting unstructured text into the structured data that forms a knowledge base, which can then be used to build special-purpose systems such as those for business decision making and legal case-based reasoning. Sentence-level relation extraction is the most common type, because relationships can usually be discovered within single sentences. One obvious example is the relationship between the subject and the object. Because the task has been studied for years, there are various methods for relation extraction, such as feature-based methods, distant supervision and recurrent neural networks. However, these approaches have the following problems. (i) They require large amounts of human-labelled data to train a model to high accuracy. (ii) They are hard to apply in real applications, especially in specialised domains where experts are required for both labelling and validating the data. In this thesis, we address these problems from two aspects: academic research and application development. In terms of academic research, we propose models that can be trained with less labelled training data. The first approach trains relation feature embeddings, then uses them to obtain relation embeddings. To minimise the effect of handcrafted feature design, the second approach adopts RNNs to learn features from the text automatically. In these methods, relation embeddings are reduced to a smaller vector space, and relations with similar meanings form clusters. Therefore, the model can be trained with less labelled data. The last approach adopts seq2seq regularisation, which can improve the accuracy of the relation extraction models.
In terms of application development, we construct a prototype web service for searching semantic triples using relations extracted by third-party extraction tools. In the last chapter, we run all our proposed models on real-world legal documents. We also build a web application for extracting relations from legal text based on the trained models, which can help lawyers find the key information in legal cases more quickly. We believe that the idea of relation embeddings can be applied in domains that require relation extraction but have limited labelled data. en_US
dc.identifier.uri http://hdl.handle.net/1959.4/71079
dc.language English
dc.language.iso EN en_US
dc.publisher UNSW, Sydney en_US
dc.rights CC BY-NC-ND 3.0 en_US
dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/3.0/au/ en_US
dc.subject.other information retrieval en_US
dc.subject.other relation extraction en_US
dc.subject.other information extraction en_US
dc.title Sentence level relation extraction via relation embedding en_US
dc.type Thesis en_US
dcterms.accessRights open access
dcterms.rightsHolder Huang, Haojie
dspace.entity.type Publication en_US
unsw.accessRights.uri https://purl.org/coar/access_right/c_abf2
unsw.identifier.doi https://doi.org/10.26190/unsworks/22705
unsw.relation.faculty Engineering
unsw.relation.originalPublicationAffiliation Huang, Haojie, School of Computer Science and Engineering, Engineering, UNSW en_US
unsw.relation.school School of Computer Science and Engineering *
unsw.thesis.degreetype PhD Doctorate en_US
Files
Original bundle
Name: public version.pdf
Size: 5.81 MB
Format: application/pdf
Resource type