Abstract
Subband frame-averaged frequency modulation (FM) has recently shown promise as a complementary feature to amplitude-based features in several speech-based classification problems, including speaker recognition. One obstacle to using FM extraction in practical implementations is its computational complexity. Proposed is a novel, computationally efficient method that estimates the frame-averaged FM component from the zero crossing count of the signal and the zero crossing count of the differentiated signal. FM components extracted from subband speech signals using the proposed method form a feature vector. Speaker recognition experiments conducted on the NIST 2008 telephone database show that the proposed method successfully augments mel frequency cepstral coefficients (MFCCs), obtaining a 17% relative reduction in equal error rate compared with an MFCC-based system.
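The two quantities the method relies on can be illustrated with a minimal sketch. It is not the estimator proposed here, whose formula the abstract does not state. The sketch assumes a first difference as a stand-in for differentiation and the standard sinusoid relation that a tone of frequency f produces about 2f zero crossings per second. The function names and the test signal are hypothetical, and how the two counts are combined into a single FM value is deliberately left out.

```python
import math

def zero_crossings(x):
    """Count sign changes between consecutive samples (zero is treated as positive)."""
    signs = [v >= 0.0 for v in x]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

def first_difference(x):
    """Discrete-time stand-in for differentiation (an assumption of this sketch)."""
    return [b - a for a, b in zip(x, x[1:])]

def frame_frequency_estimates(frame, fs):
    """Return two frequency estimates in Hz for one frame of a subband signal:
    one from the zero crossing count of the frame, one from the zero crossing
    count of its first difference. Both use f ~ count / (2 * duration)."""
    duration = len(frame) / fs
    f_zc = zero_crossings(frame) / (2.0 * duration)
    f_dzc = zero_crossings(first_difference(frame)) / (2.0 * duration)
    return f_zc, f_dzc

# Hypothetical usage: a 20 ms frame of a 1 kHz tone sampled at 8 kHz.
fs = 8000
frame = [math.sin(2 * math.pi * 1000 * n / fs + 0.1) for n in range(160)]
print(frame_frequency_estimates(frame, fs))
```

For a pure tone both estimates land near the tone frequency; for a modulated subband signal they generally differ. Both counts need only sign comparisons and subtractions per sample, which is in line with the low computational cost the method aims for.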