Improving the utilization of training samples in visual recognition

Access & Terms of Use
open access
Copyright: Liu, Yingying
Abstract
Recognition is a fundamental computer vision problem in which training samples are used to learn models that then assign labels to test samples. How effectively training samples are utilized is of vital importance to visual recognition, and it can be improved by increasing the capability of the description methods and the model learning methods. Two visual recognition tasks, namely object detection and action recognition, are considered in this thesis.
Active learning utilizes selected subsets of the training dataset as training samples. Active learning methods select the most informative training samples in each iteration, and therefore require fewer training samples to attain performance comparable to that of passive learning methods. In this thesis, an active learning method for object detection that exploits the distribution of training samples is presented. Experiments show that the proposed method outperforms a passive learning method and a simple margin active learning method.
Weakly supervised learning facilitates learning on training samples with weak labels. In this thesis, a weakly supervised object detection method is proposed to utilize training samples with probabilistic labels. Base detectors are used to create object proposals from training samples with weak labels. The object proposals are then assigned estimated probabilistic labels. A Generalized Hough Transform based object detector is extended to utilize the object proposals with probabilistic labels as training samples. The proposed method is shown to outperform both a comparison method that assigns strong labels to object proposals and a weakly supervised deformable part-based models method. The proposed method also attains performance comparable to that of supervised learning methods.
Increasing the capability of the description method can also improve the utilization of training samples. In this thesis, temporal pyramid histograms are proposed to address the problem of missing temporal information in the classical bag of features description method used in action recognition. Experiments show that the proposed description method outperforms the classical bag of features method in action recognition.
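The abstract does not spell out how the temporal pyramid histograms are built, so the following is only a minimal sketch of the general idea, not the thesis's formulation. It assumes that each local feature has already been quantized to a visual word, that pyramid level l splits the video into 2^l equal temporal segments, and that the concatenated histogram is L1-normalized. The function name `temporal_pyramid_histogram` and all of its parameters are hypothetical.

```python
import numpy as np


def temporal_pyramid_histogram(word_ids, frame_ids, num_frames, vocab_size, levels=3):
    """Concatenate bag-of-features histograms computed over temporal segments.

    Assumptions (not taken from the thesis): level l uses 2**l equal-length
    segments, and the final descriptor is L1-normalized.
    """
    word_ids = np.asarray(word_ids, dtype=np.int64)
    frame_ids = np.asarray(frame_ids, dtype=np.int64)
    parts = []
    for level in range(levels):
        num_segments = 2 ** level
        # Map each feature's frame index to its segment index at this level.
        segment_ids = np.minimum(frame_ids * num_segments // num_frames,
                                 num_segments - 1)
        for s in range(num_segments):
            # Count visual words that fall inside segment s.
            hist = np.bincount(word_ids[segment_ids == s], minlength=vocab_size)
            parts.append(hist.astype(float))
    descriptor = np.concatenate(parts)
    total = descriptor.sum()
    return descriptor / total if total > 0 else descriptor


# Two clips with the same visual words in opposite temporal order.
forward = temporal_pyramid_histogram([0, 1], [0, 9], num_frames=10, vocab_size=2, levels=2)
backward = temporal_pyramid_histogram([1, 0], [0, 9], num_frames=10, vocab_size=2, levels=2)
```

In this sketch, level 0 is the classical bag of features histogram, so `forward` and `backward` agree on their first `vocab_size` entries. They differ only in the level 1 entries, which is where the ordering information that a single global histogram discards is retained.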
Author(s)
Liu, Yingying
Supervisor(s)
Sowmya, Arcot
Yang, Wang
Wei, Wang
Publication Year
2016
Resource Type
Thesis
Degree Type
PhD Doctorate
UNSW Faculty
Files
download public version.pdf 14.54 MB Adobe Portable Document Format