Feature learning, selection and representation for object detection and recognition

Access & Terms of Use
open access
Copyright: Li, Zelin
Abstract
Recent years have seen a wide range of Computer Vision applications grow in popularity, and object recognition is critical to their success. In general, object recognition consists of three major components: image acquisition, feature extraction and classification. Image acquisition is the process by which hardware captures information from a scene. Feature extraction is the process by which an object within the image is described, either explicitly or implicitly. Finally, classification uses the extracted features as input to train classifiers that distinguish between different objects. Feature extraction and the representations derived from it therefore play a vital role in the accuracy of object recognition. The representative and discriminative capabilities of traditional hand-crafted feature representations, such as Histogram of Oriented Gradients, are highly vulnerable to factors such as image noise. This study therefore proposes novel feature descriptors to tackle the issues arising from feature representation in object recognition. Two kernel-based feature descriptors are developed, the Steering Kernel Regression Weight Matrix and the Long-Axis of Local Adaptive Steering Kernel, both derived from Steering Kernel Regression. The proposed descriptors exploit the ability of Steering Kernel Regression to capture the local structure around an object and to tolerate image noise. Covariance techniques are then employed to form a robust feature representation of an object. To alleviate the complexity arising from the covariance computation, the Long-Axis of Local Adaptive Steering Kernel feature is proposed to simplify the computation required. Additionally, to further reduce computational complexity, a sparse-representation-based feature descriptor is developed for object detection.
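As a rough illustration of how a steering kernel adapts to local structure, the sketch below computes normalised kernel weights for the centre pixel of a small patch. It uses the Gaussian steering-kernel form commonly given in the kernel regression literature, with constant factors dropped because the weights are renormalised. It is not the thesis's Steering Kernel Regression Weight Matrix or Long-Axis descriptor. The function name `steering_kernel_weights`, the smoothing parameter `h`, the regulariser `eps`, and the use of a single gradient covariance matrix for the whole patch are all assumptions made for brevity.

```python
import numpy as np

def steering_kernel_weights(patch, h=1.0, eps=1e-6):
    """Steering-kernel weights around the centre of a square, odd-sized patch.

    Simplifying assumption: one gradient covariance matrix C is estimated
    from the whole patch, rather than one per neighbouring pixel.
    """
    patch = np.asarray(patch, dtype=float)
    n = patch.shape[0]

    # Image derivatives: np.gradient returns d/d(row) first, then d/d(column).
    gy, gx = np.gradient(patch)
    G = np.stack([gx.ravel(), gy.ravel()], axis=1)    # shape (n*n, 2)

    # Regularised 2x2 gradient covariance; eps keeps it positive definite.
    C = G.T @ G / G.shape[0] + eps * np.eye(2)

    # Offsets (dx, dy) of every pixel from the patch centre.
    r = np.arange(n) - n // 2
    dx, dy = np.meshgrid(r, r)                        # dx varies along columns
    d = np.stack([dx.ravel(), dy.ravel()], axis=1)    # shape (n*n, 2)

    # Quadratic form d^T C d for each offset, then the Gaussian kernel.
    quad = np.einsum("ij,jk,ik->i", d, C, d)
    w = np.sqrt(np.linalg.det(C)) * np.exp(-quad / (2.0 * h ** 2))

    # Normalise so the weights sum to one.
    return (w / w.sum()).reshape(n, n)

# Hypothetical test patch: a vertical step edge (intensity changes along x only).
edge_patch = np.tile(np.array([0, 0, 0, 1, 1, 1, 1.0]), (7, 1))
weights = steering_kernel_weights(edge_patch)
```

For the step-edge patch, the weights fall off across the edge but stay nearly constant along it, so `weights[4, 3]` (one pixel along the edge) is larger than `weights[3, 4]` (one pixel across it). This elongation along local structure is the property the abstract refers to.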
Sparse representation has proven highly capable in image classification, but not in object detection, i.e. isolating objects within a complicated environment. Therefore, in this study, the reconstruction error from sparse representation is used as a feature, and a Bayesian model uses learned atoms to reinforce the discriminative power of a reconstruction error that may have been 'polluted' by noise. Computer Vision applications work not only with static images but also with videos. A motion-based approach is developed to tackle the problems caused by appearance-based features in video-based Person Re-identification. Fine-grained displacement along dense trajectories is used to describe a specific human motion. Such motion information encodes numerous distinctive biometric cues that discriminate between the walking styles of different people. Using Dissimilarity-based Sparse Subset Learning, the proposed approach demonstrates increased robustness in long-term Person Re-identification applications.
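The idea of treating sparse reconstruction error as a feature can be sketched as follows. This is a minimal illustration, not the thesis's method: it uses a plain greedy orthogonal matching pursuit in place of whatever solver the thesis uses, omits the Bayesian model entirely, and works on a toy dictionary. The names `omp` and `reconstruction_error` and the parameter `n_nonzero` are hypothetical.

```python
import numpy as np

def omp(D, x, n_nonzero):
    """Greedy orthogonal matching pursuit (assumes n_nonzero >= 1).

    Approximates x as a combination of at most n_nonzero columns (atoms) of D.
    """
    x = np.asarray(x, dtype=float)
    residual = x.copy()
    support = []
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k not in support:
            support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    coef = np.zeros(D.shape[1])
    coef[support] = sol
    return coef

def reconstruction_error(D, x, n_nonzero=2):
    """L2 norm of the part of x that the sparse code over D cannot explain."""
    x = np.asarray(x, dtype=float)
    coef = omp(D, x, n_nonzero)
    return float(np.linalg.norm(x - D @ coef))

# Toy dictionary (assumption): three orthonormal atoms in a 5-D feature space.
D = np.eye(5)[:, :3]
inside = np.array([2.0, 3.0, 0.0, 0.0, 0.0])   # lies in the span of the atoms
outside = np.array([0.0, 0.0, 0.0, 0.0, 1.0])  # orthogonal to every atom

err_inside = reconstruction_error(D, inside)
err_outside = reconstruction_error(D, outside)
```

Here `err_inside` is close to zero and `err_outside` is much larger, which is the sense in which the error can separate windows a learned dictionary explains well from those it does not.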
Author(s)
Li, Zelin
Supervisor(s)
Chen, Fang
Publication Year
2017
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty