The Identification of Authors using Cross Document Co-Referencing

Access & Terms of Use
open access
Copyright: Kernot, David
Abstract
One of the major problems facing the Australian war fighter is the shift from conventional, nation-state conflict to asymmetric warfare. In this type of warfare, insurgents dress and appear the same as the civilian population, and improvised explosive devices (IEDs) are used in place of conventional weaponry. To support the war fighter, it becomes important to identify the networks of insurgents and their supporters. Author identification methods are used to identify insurgents and their supporters from masses of data (reports, web sites, etc.). This research project applies Neuro-Linguistic Programming techniques to extract key word phrases and gender-based pronouns for author identification. Results demonstrate that this technique has merit in identifying particular authors through analysis of their specific use of key language elements in their publications. Future research will explore the automation of this technique. Three experiments were conducted using logistic regression, discriminant analysis, and exploratory and confirmatory factor analysis across the 30 observations in the sample. In the first experiment, logistic regression was used to test the gender of an author through their use of pronouns; it found that it might be possible to determine gender based on the use of the words ‘my’, ‘her’, and ‘its’. In the second experiment, discriminant analysis was used to test the sensory-based style of an author through their use of predicates; while it was unable to identify any Preferred Representational System, it was possible to identify Representational Systems within an author’s work. In the third experiment, exploratory and confirmatory factor analysis was used to see whether any underlying factors that might have influenced the observed results could be identified. While there were not enough observations in each of the categories to identify any underlying factors, overall the results were sufficient to develop two rule-based algorithms and identify an author by their sensory-based style and gender.
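The first experiment in the abstract can be pictured with a minimal sketch. It assumes the inputs to the logistic regression are the relative frequencies of the three reported pronouns; the abstract does not state how the features were computed. The names `pronoun_rates` and `logistic_probability`, the tokenisation rule, and the `weights` and `bias` parameters are all hypothetical. No fitted coefficients from the thesis are reproduced here, so the caller must supply them.

```python
import math
import re

# Pronouns the abstract reports as potentially informative for gender.
PRONOUNS = ("my", "her", "its")


def pronoun_rates(text):
    """Return each pronoun's share of all word tokens in `text`.

    Tokenisation is an assumption: lowercase runs of letters and
    apostrophes, so "it's" is not counted as "its".
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {p: 0.0 for p in PRONOUNS}
    return {p: tokens.count(p) / total for p in PRONOUNS}


def logistic_probability(rates, weights, bias):
    """Standard logistic model: sigmoid of a weighted sum of the features.

    `weights` and `bias` are placeholders supplied by the caller;
    they are not values taken from the thesis.
    """
    z = bias + sum(weights[p] * rates[p] for p in PRONOUNS)
    return 1.0 / (1.0 + math.exp(-z))
```

With all weights and the bias set to zero, `logistic_probability` returns 0.5 for any text, which is the model's uninformed baseline; meaningful output requires coefficients estimated from labelled documents.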
Author(s)
Kernot, David
Supervisor(s)
Lewis, Edward
Stocker, Robert
Publication Year
2013
Resource Type
Thesis
Degree Type
Masters Thesis
Files
whole.pdf (1.09 MB, Adobe Portable Document Format)