Sentence level relation extraction via relation embedding

Huang, Haojie

doi:10.26190/unsworks/22705

dc.contributor.author	Huang, Haojie	en_US
dc.date.accessioned	2022-03-23T15:45:24Z
dc.date.available	2022-03-23T15:45:24Z
dc.date.issued	2021	en_US
dc.description.abstract	Relation extraction is an information extraction task that identifies semantic relations in text, usually between two named entities. It is a crucial step in converting unstructured text into the structured data that forms a knowledge base, which can then be used to build special-purpose systems for tasks such as business decision making and legal case-based reasoning. Sentence-level relation extraction is the most common type, because relationships can usually be discovered within single sentences. One obvious example is the relationship between the subject and the object. The task has been studied for years, and various methods exist, such as feature-based methods, distant supervision and recurrent neural networks. However, these approaches have the following problems. (i) They require large amounts of human-labelled data to train a model to high accuracy. (ii) They are hard to apply in real applications, especially in specialised domains where experts are required for both labelling and validating the data. In this thesis, we address these problems in two aspects: academic research and application development. In terms of academic research, we propose models that can be trained with less labelled training data. The first approach trains relation feature embeddings and then uses them to obtain relation embeddings. To minimise the need to design handcrafted features, the second approach adopts RNNs to learn features from the text automatically. In these methods, relation embeddings are reduced to a smaller vector space, and relations with similar meanings form clusters. Therefore, the model can be trained with fewer labelled examples. The last approach adopts seq2seq regularisation, which can improve the accuracy of the relation extraction models.
In terms of application development, we construct a prototype web service for searching semantic triples using relations extracted by third-party extraction tools. In the last chapter, we run all our proposed models on real-world legal documents. We also build a web application, based on the trained models, for extracting relations from legal text, which can help lawyers find the key information in legal cases more quickly. We believe that the idea of relation embeddings can be applied in domains that require relation extraction but have limited labelled data.	en_US
dc.identifier.uri	http://hdl.handle.net/1959.4/71079
dc.language	English
dc.language.iso	EN	en_US
dc.publisher	UNSW, Sydney	en_US
dc.rights	CC BY-NC-ND 3.0	en_US
dc.rights.uri	https://creativecommons.org/licenses/by-nc-nd/3.0/au/	en_US
dc.subject.other	information retrieval	en_US
dc.subject.other	relation extraction	en_US
dc.subject.other	information extraction	en_US
dc.title	Sentence level relation extraction via relation embedding	en_US
dc.type	Thesis	en_US
dcterms.accessRights	open access
dcterms.rightsHolder	Huang, Haojie
dspace.entity.type	Publication	en_US
unsw.accessRights.uri	https://purl.org/coar/access_right/c_abf2
unsw.identifier.doi	https://doi.org/10.26190/unsworks/22705
unsw.relation.faculty	Engineering
unsw.relation.originalPublicationAffiliation	Huang, Haojie, School of Computer Science and Engineering, Engineering, UNSW	en_US
unsw.relation.school	School of Computer Science and Engineering	*
unsw.thesis.degreetype	PhD Doctorate	en_US

Files

Original bundle

Name: public version.pdf
Size: 5.81 MB
Format: application/pdf