A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition

@article{Sahidullah2012ANW,
  title={A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition},
  author={Md. Sahidullah and Goutam Saha},
  journal={IEEE Signal Processing Letters},
  year={2012},
  volume={20},
  pages={149--152},
  url={https://api.semanticscholar.org/CorpusID:10900793}
}
A novel family of windowing techniques is proposed for computing mel-frequency cepstral coefficients (MFCCs) for automatic speaker recognition from speech, based on a fundamental property of the discrete-time Fourier transform related to differentiation in the frequency domain.
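
In its textbook form, the frequency-differentiation property referred to here says that multiplying a sequence x[n] by the time index n corresponds to j·dX(ω)/dω in the discrete-time Fourier transform. The Python sketch below checks this numerically for one Hamming-windowed frame. It only illustrates the property and is not the authors' window definition; the helper names (`hamming`, `dtft`), the test signal, and the frame length are assumptions made for the example.

```python
import cmath
import math

def hamming(N):
    """Symmetric Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def dtft(x, omega):
    """Evaluate the DTFT of a finite sequence x at angular frequency omega."""
    return sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(x))

N = 64
w = hamming(N)
# Hypothetical test frame: a single sinusoid, Hamming-windowed.
frame = [w[n] * math.sin(0.3 * n) for n in range(N)]
# Multiply the frame by the time index n.
ramped = [n * v for n, v in enumerate(frame)]

omega, h = 0.3, 1e-5
# Approximate dX/domega with a central difference.
derivative = (dtft(frame, omega + h) - dtft(frame, omega - h)) / (2 * h)
lhs = dtft(ramped, omega)   # DTFT of n * x[n]
rhs = 1j * derivative       # j * dX/domega
print(abs(lhs - rhs))       # expected to be close to zero
```

The remaining difference comes from the finite-difference approximation, not from the property itself.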


Speaker Verification System Using MFCC and DWT

A novel family of windowing techniques is used to compute mel-frequency cepstral coefficients (MFCCs) for automatic speaker recognition from speech, based on a fundamental property of the discrete-time Fourier transform related to differentiation in the frequency domain.

Novel windowing technique of MFCC for speaker identification with Modified Polynomial Classifiers

Experimental results show that the novel windowing technique combined with a Modified Polynomial Classifier consistently outperforms the Hamming window, improving identification accuracy, especially on large databases, with low memory usage.

Performance Analysis Of Speaker Identification System Using MFCC And DWT Under Various Noise Levels

A novel family of windowing techniques is used to compute mel-frequency cepstral coefficients (MFCCs), and a classical windowing scheme such as the Hamming window is modified to obtain derivatives of the discrete-time Fourier transform coefficients.

Speaker Recognition System Based On MFCC and DCT

An approach to speech signal recognition using mel-frequency spectral information is examined, and optimum parameter values are chosen to achieve an efficiency of 99.5% on a very short audio file.

Efficient window for monolingual and crosslingual speaker identification using MFCC

A speaker identification system using various windowing techniques for mel-frequency cepstral coefficient extraction is shown to perform considerably better than the baseline Hamming window technique.

Improving the performance of MFCC for Persian robust speech recognition

This paper introduces a new noise-robust set of MFCC vectors, estimated in several steps, and uses an MLP neural network to evaluate the performance of the proposed MFCC method and to classify the results.

Speaker identification based on normalized pitch frequency and Mel Frequency Cepstral Coefficients

Simulation results show that using the normalized pitch frequency (NPF) as a feature enhances the performance of the speaker identification system, especially with the Discrete Cosine Transform (DCT) and a wavelet denoising pre-processing step.

Combining dynamic features with MFCC for text-independent speaker identification

This paper presents a text-independent speaker identification scheme that combines dynamic features with Mel Frequency Cepstral Coefficients (MFCC), and compares the performance of the speaker identification system for different numbers of MFCC filters and vector quantization centroids.

Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features

The results demonstrate that the adapted weighted Thomson multitaper method distinguishes normal from disordered voices better than the conventional single-taper (Hamming window) technique and two newly proposed windowing methods.

Feature Extraction Techniques in Speaker Recognition : A Review

This paper presents a brief survey on various feature extraction techniques like Linear Predictive Cepstral Coefficients (LPCC), Perceptual Linear Prediction Coefficients (PLPC), and Mel-Frequency…

Multitaper Estimation of Frequency-Warped Cepstra With Application to Speaker Verification

Using the proposed formulas, the peak matched multitaper estimator is shown to have low mean square error (squared bias + variance) on speech-like processes and to perform slightly better in the NIST 2006 speaker verification task.

Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification

This paper provides detailed statistical analysis of MFCC bias and variance using autoregressive process simulations on the TIMIT corpus and proposes the multitaper method for MFCC extraction with a practical focus.
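
The multitaper idea behind this entry and the preceding one can be sketched as averaging several periodograms of the same frame, each computed with a different orthogonal taper, which lowers the variance of the spectrum estimate that feeds the mel filterbank. The Python sketch below uses sine tapers with uniform weights purely because they have a closed form. The taper family, the weights, the test frame, and all function names are assumptions for illustration and are not taken from the papers listed here.

```python
import cmath
import math

def sine_tapers(N, K):
    """K orthonormal sine tapers of length N (closed form, chosen for simplicity)."""
    scale = math.sqrt(2.0 / (N + 1))
    return [[scale * math.sin(math.pi * (k + 1) * (n + 1) / (N + 1))
             for n in range(N)] for k in range(K)]

def periodogram(x):
    """Squared-magnitude DFT of x, computed directly (O(N^2), fine for a sketch)."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * m * n / N)
                    for n in range(N))) ** 2 for m in range(N)]

def multitaper_spectrum(frame, K=4):
    """Average the periodograms obtained with K different tapers."""
    N = len(frame)
    spectra = [periodogram([t[n] * frame[n] for n in range(N)])
               for t in sine_tapers(N, K)]
    return [sum(s[m] for s in spectra) / K for m in range(N)]

# Hypothetical frame: one sinusoid sitting exactly on DFT bin 8.
N = 64
frame = [math.sin(2 * math.pi * 8 * n / N) for n in range(N)]
spectrum = multitaper_spectrum(frame)
# The strongest bin is expected to lie near bin 8, within the multitaper bandwidth.
peak_bin = max(range(N // 2), key=lambda m: spectrum[m])
print(peak_bin)
```

In an MFCC front end, `spectrum` would take the place of the single-window power spectrum before mel filtering.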

Significance of the Modified Group Delay Feature in Speech Recognition

The group delay function is modified to suppress the effects of zeros that lie close to the unit circle in the z-plane and of pitch periodicity on the short-time spectral structure of speech; the resulting feature is called the modified group delay feature (MODGDF).
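
For context, the unmodified group delay function can be computed without phase unwrapping from two Fourier transforms, those of x[n] and of n·x[n]. The Python sketch below implements only this plain group delay. The modifications that define MODGDF are not reproduced, and the helper names and test input are assumptions for illustration.

```python
import cmath

def dtft(x, omega):
    """Evaluate the DTFT of a finite sequence x at angular frequency omega."""
    return sum(v * cmath.exp(-1j * omega * n) for n, v in enumerate(x))

def group_delay(x, omega):
    """tau(omega) = (X_R*Y_R + X_I*Y_I) / |X|^2, with Y the DTFT of n*x[n]."""
    X = dtft(x, omega)
    Y = dtft([n * v for n, v in enumerate(x)], omega)
    return (X.real * Y.real + X.imag * Y.imag) / abs(X) ** 2

# Sanity check on a hypothetical input: an impulse delayed by 3 samples
# has a group delay of 3 samples at every frequency.
delayed_impulse = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
tau = group_delay(delayed_impulse, 0.7)
print(tau)
```

The division by |X|^2 is what makes the plain function spiky wherever the spectrum has a zero near the unit circle.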

New efficient window function, replacement for the hamming window

A simple new window function is presented which, for the same window order (M), has a main-lobe width less than or equal to that of the Hamming window while offering about 2-4.5 dB lower peak side-lobe amplitude, and which is computationally efficient for signal spectrum analysis.

Speaker Identification and Verification by Combining MFCC and Phase Information

A phase information extraction method is proposed that normalizes the variation in phase according to the frame position of the input speech and combines the phase information with MFCCs in text-independent speaker identification and verification methods.

Speaker Verification Using Adapted Gaussian Mixture Models

The major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs) are described.

Support vector machines using GMM supervectors for speaker verification

This work examines the idea of using the GMM supervector in a support vector machine (SVM) classifier and proposes two new SVM kernels based on distance metrics between GMM models that produce excellent classification accuracy in a NIST speaker recognition evaluation task.

SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation

A support vector machine kernel is constructed using the GMM supervector, and similarities based on this kernel are shown between SVM nuisance attribute projection (NAP) and recent results in latent factor analysis.

On the use of windows for harmonic analysis with the discrete Fourier transform

A comprehensive catalog of data windows along with their significant performance parameters from which the different windows can be compared is included, and an example demonstrates the use and value of windows to resolve closely spaced harmonic signals characterized by large differences in amplitude.
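
One of the window performance parameters such a catalog compares is the peak side-lobe level. The Python sketch below estimates it for a rectangular and a Hamming window by evaluating the magnitude response on a frequency grid. The window length, grid size, and function names are assumptions for illustration, and the output is not meant to reproduce the catalog's figures.

```python
import cmath
import math

def magnitude_db(window, num_points=2048):
    """Magnitude response in dB (relative to DC) on a grid over [0, pi)."""
    dc = abs(sum(window))
    out = []
    for i in range(num_points):
        omega = math.pi * i / num_points
        value = abs(sum(w * cmath.exp(-1j * omega * n)
                        for n, w in enumerate(window)))
        out.append(20 * math.log10(max(value, 1e-12) / dc))
    return out

def peak_sidelobe_db(window):
    """Highest side lobe: largest response after the first local minimum."""
    mag = magnitude_db(window)
    i = 0
    while i + 1 < len(mag) and mag[i + 1] <= mag[i]:
        i += 1  # walk down the main lobe until the response turns upward
    return max(mag[i:])

N = 51
rectangular = [1.0] * N
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]
rect_psl = peak_sidelobe_db(rectangular)
hamming_psl = peak_sidelobe_db(hamming)
print(rect_psl, hamming_psl)
```

The rectangular window's peak side lobe comes out at roughly -13 dB, and the Hamming window's is considerably lower, at the cost of a wider main lobe.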