Computationally efficient frame-averaged FM feature extraction for speaker recognition

Access & Terms of Use
metadata only access
Abstract
Subband frame-averaged frequency modulation (FM) has recently shown promise as a complementary feature to amplitude-based features in several speech-based classification problems, including speaker recognition. One obstacle to using FM extraction in practical implementations is its computational complexity. A novel, computationally efficient method is proposed that estimates the frame-averaged FM component from the zero-crossing counts of the signal and the zero-crossing counts of the differentiated signal. The FM components extracted from subband speech signals with the proposed method form a feature vector. Speaker recognition experiments conducted on the NIST 2008 telephone database show that the proposed method successfully augments mel frequency cepstral coefficients (MFCCs), obtaining a 17% relative reduction in equal error rate compared with an MFCC-based system.
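The abstract does not give the estimator itself, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes a narrowband (subband) frame that behaves roughly like a sinusoid, which crosses zero twice per cycle, and whose first difference also crosses zero twice per cycle (at the extrema). Both counts are turned into frequency estimates and averaged. The function names, the simple averaging rule, and the use of a subband centre frequency to form a deviation are all hypothetical choices made for illustration.

```python
import math

def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if (a >= 0) != (b >= 0))

def frame_avg_frequency(frame, fs):
    """Rough average frequency (Hz) of one narrowband frame.

    Assumption (not from the source): the two zero-crossing counts are
    each converted with f = count / (2 * duration) and then averaged.
    """
    duration = len(frame) / fs                      # frame length in seconds
    diff = [b - a for a, b in zip(frame, frame[1:])]  # first difference
    f_signal = zero_crossings(frame) / (2.0 * duration)
    f_diff = zero_crossings(diff) / (2.0 * duration)
    return 0.5 * (f_signal + f_diff)

def fm_deviation(frame, fs, centre_hz):
    """Hypothetical FM feature: deviation from the subband centre frequency."""
    return frame_avg_frequency(frame, fs) - centre_hz

if __name__ == "__main__":
    fs = 8000  # telephone-band sampling rate, chosen for illustration
    # 20 ms test tone at 1 kHz with a small phase offset
    tone = [math.sin(2 * math.pi * 1000 * n / fs + 0.3) for n in range(160)]
    print(frame_avg_frequency(tone, fs))
    print(fm_deviation(tone, fs, 1000.0))
```

Only counting and a subtraction per sample are needed, which is the kind of low cost the abstract points to. A full feature vector would apply such an estimate to each subband of each frame.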
Author(s)
Thiruvaran, T
Nosratighods, M
Ambikairajah, E
Epps, J
Publication Year
2009
Resource Type
Journal Article
UNSW Faculty