A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition

Md. Sahidullah; G. Saha

DOI:10.1109/LSP.2012.2235067
Corpus ID: 10900793


@article{Sahidullah2012ANW,
  title={A Novel Windowing Technique for Efficient Computation of MFCC for Speaker Recognition},
  author={Md. Sahidullah and Goutam Saha},
  journal={IEEE Signal Processing Letters},
  year={2012},
  volume={20},
  pages={149-152},
  url={https://api.semanticscholar.org/CorpusID:10900793}
}

Md. Sahidullah, G. Saha
Published in IEEE Signal Processing… 11 June 2012
Computer Science

A novel family of windowing techniques is proposed for computing mel-frequency cepstral coefficients (MFCC) for automatic speaker recognition from speech. It is based on a fundamental property of the discrete-time Fourier transform related to differentiation in the frequency domain.
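
The property the summary refers to can be checked numerically. For X(omega) = sum_n x[n] exp(-j*omega*n), differentiating with respect to omega gives dX/domega = sum_n (-j*n) x[n] exp(-j*omega*n), so weighting a frame by n in the time domain yields a frequency-domain derivative. The sketch below is only an illustration of that identity, not the paper's window design: the two-tone test frame, the frame length, and all function names are assumptions made for the example.

```python
import cmath
import math

def hamming(N):
    """Standard Hamming window of length N."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)) for n in range(N)]

def dtft(x, omega):
    """Evaluate X(omega) = sum_n x[n] * exp(-j*omega*n) at a single frequency."""
    return sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x))

def dtft_derivative_via_weighting(x, omega):
    """dX/domega obtained as the DTFT of -j * n * x[n]."""
    return dtft([-1j * n * xn for n, xn in enumerate(x)], omega)

def dtft_derivative_numeric(x, omega, h=1e-6):
    """Central-difference estimate of dX/domega, used only as a cross-check."""
    return (dtft(x, omega + h) - dtft(x, omega - h)) / (2 * h)

# Hypothetical test frame: a Hamming-windowed two-tone signal.
N = 64
w = hamming(N)
frame = [w[n] * (math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n)) for n in range(N)]

omega0 = 0.7
exact = dtft_derivative_via_weighting(frame, omega0)
approx = dtft_derivative_numeric(frame, omega0)
max_err = abs(exact - approx)  # should be tiny if the identity holds
```

Because the weighting by n can be folded into the window itself, the derivative costs one extra windowed transform rather than a numerical differentiation step.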


81 Citations


Topics

Mel-frequency Cepstral Coefficients, Automatic Speaker Recognition, Speaker Recognition Systems, Speaker Recognition

Speaker Verification System Using MFCC and DWT

V. Sabitha, Prof. Janardhanan.P

Computer Science

2013

A novel family of windowing techniques is used to compute Mel Frequency Cepstral Coefficients for automatic speaker recognition from speech, based on a fundamental property of the discrete-time Fourier transform related to differentiation in the frequency domain.

Novel windowing technique of MFCC for speaker identification with Modified Polynomial Classifiers

A. Bakshi, Sunil Kumar Kopparapu, Sanjay Pawar, S. Nema

Computer Science

2014 5th International Conference - Confluence…

2014

Experimental results show that the novel windowing technique with a Modified Polynomial Classifier consistently outperforms the Hamming window, improving identification accuracy, especially for large databases, with low memory usage.

Performance Analysis Of Speaker Identification System Using MFCC And DWT Under Various Noise Levels

Sabitha, P. Janardhanan

Computer Science

2013

A novel family of windowing techniques is used to compute Mel Frequency Cepstral Coefficients (MFCC): a classical windowing scheme such as the Hamming window is modified to obtain derivatives of discrete-time Fourier transform coefficients.

Speaker Recognition System Based On MFCC and DCT

Garima Vyas, B. Kumari

Computer Science

2013

An approach to speech signal recognition using Mel-frequency spectral information is examined, and the optimum parameter values are chosen to reach an efficiency of 99.5% on a very short audio file.

Efficient window for monolingual and crosslingual speaker identification using MFCC

B. Nagaraja, H. S. Jayanna

Computer Science

2013 International Conference on Advanced…

2013

A speaker identification system using various windowing techniques for mel-frequency cepstral coefficient extraction is shown to perform considerably better than the baseline Hamming window technique.

Improving the performance of MFCC for Persian robust speech recognition

D. Darabian, H. Marvi, M. S. Noughabi

Computer Science

2015

This paper introduces a new noise-robust set of MFCC vectors, estimated through a series of steps, and uses an MLP neural network to evaluate the performance of the proposed MFCC method and to classify the results.


Speaker identification based on normalized pitch frequency and Mel Frequency Cepstral Coefficients

Marwa A. Nasr, M. Abd-Elnaby, A. El-Fishawy, Sayed El-Rabaie, F. El-Samie

Computer Science

International Journal of Speech Technology

2018

Simulation results prove that the NPF as a feature in speaker identification enhances the performance of the speaker identification system, especially with the Discrete Cosine Transform (DCT) and wavelet denoising pre-processing step.

Combining dynamic features with MFCC for text-independent speaker identification

Amol A. Chaudhari, A. Rahulkar, S. Dhonde

Computer Science

2015 International Conference on Information…

2015

This paper presents a text-independent speaker identification scheme that combines dynamic features with Mel Frequency Cepstral Coefficients (MFCC), and compares the performance of the speaker identification system across the number of MFCC filters and the number of vector quantization centroids.

Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features

Ömer Eskidere, A. Gürhanli

Computer Science

Comput. Math. Methods Medicine

2015

The results demonstrate that the adapted weighted Thomson multitaper method distinguishes normal from disordered voice better than the conventional single-taper (Hamming window) technique and two newly proposed windowing methods.


Feature Extraction Techniques in Speaker Recognition : A Review

.. B. Dhonde, S. Jagade

Computer Science

2015

This paper presents a brief survey on various feature extraction techniques like Linear Predictive Cepstral Coefficients (LPCC), Perceptual Linear Prediction Coefficients (PLPC), and Mel-Frequency…

16 References

Multitaper Estimation of Frequency-Warped Cepstra With Application to Speaker Verification

J. Sandberg, M. Hansson, T. Kinnunen, R. Saeidi, P. Flandrin, P. Borgnat

Computer Science

IEEE Signal Processing Letters

2010

Using the proposed formulas, the peak matched multitaper estimator is shown to have low mean square error (squared bias + variance) on speech-like processes and to perform slightly better in the NIST 2006 speaker verification task.

Design, analysis and experimental evaluation of block based transformation in MFCC computation for speaker recognition

Md. Sahidullah, G. Saha

Computer Science

Speech Commun.

2012

Low-Variance Multitaper MFCC Features: A Case Study in Robust Speaker Verification

T. Kinnunen, R. Saeidi, Haizhou Li

Computer Science

IEEE Transactions on Audio, Speech, and Language…

2012

This paper provides detailed statistical analysis of MFCC bias and variance using autoregressive process simulations on the TIMIT corpus and proposes the multitaper method for MFCC extraction with a practical focus.
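
The multitaper idea mentioned here can be sketched briefly: instead of one windowed periodogram, several orthogonal tapers are applied to the same frame and the resulting power spectra are averaged, which tends to reduce the variance of the estimate. The example below is a minimal illustration under stated assumptions, not the estimator evaluated in the cited work: it swaps in sine tapers because they have a closed form, uses uniform weights, and the frame, K, and all function names are hypothetical.

```python
import cmath
import math

def sine_tapers(N, K):
    """K orthonormal sine tapers of length N (assumed taper family for this sketch)."""
    scale = math.sqrt(2.0 / (N + 1))
    return [[scale * math.sin(math.pi * k * (n + 1) / (N + 1)) for n in range(N)]
            for k in range(1, K + 1)]

def dft_power(x):
    """Squared magnitude of the N-point DFT, computed directly (O(N^2))."""
    N = len(x)
    return [abs(sum(x[n] * cmath.exp(-2j * math.pi * m * n / N) for n in range(N))) ** 2
            for m in range(N)]

def multitaper_spectrum(frame, K=4):
    """Uniformly weighted average of the K tapered power spectra."""
    N = len(frame)
    tapers = sine_tapers(N, K)
    spectra = [dft_power([t[n] * frame[n] for n in range(N)]) for t in tapers]
    return [sum(s[m] for s in spectra) / K for m in range(N)]

# Hypothetical frame: one sinusoid with four cycles across 32 samples.
N, K = 32, 4
frame = [math.sin(2 * math.pi * 4 * n / N) for n in range(N)]
spectrum = multitaper_spectrum(frame, K)
tapers = sine_tapers(N, K)
```

In an MFCC front end this averaged spectrum would take the place of the single Hamming-windowed power spectrum before the mel filterbank; that placement is the only part of the pipeline the sketch is meant to suggest.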

Significance of the Modified Group Delay Feature in Speech Recognition

R. Hegde, H. Murthy, V. R. Gadde

Computer Science

IEEE Transactions on Audio, Speech, and Language…

2007

The group delay function is modified to overcome effects in the short-time spectral structure of speech that are caused by zeros close to the unit circle in the z-plane and by pitch periodicity; the result is called the modified group delay feature (MODGDF).
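
For context, the unmodified group delay function, the negative derivative of the phase spectrum, can be computed without phase unwrapping from two transforms: X of x[n] and Y of n*x[n], as (Re X * Re Y + Im X * Im Y) / |X|^2. The sketch below shows only this baseline quantity; the modification described in the summary is not reproduced, and the delayed-impulse test signal and function names are assumptions for illustration. The division by |X|^2 is what becomes unstable when zeros lie near the unit circle.

```python
import cmath

def dtft(x, omega):
    """Evaluate X(omega) = sum_n x[n] * exp(-j*omega*n) at a single frequency."""
    return sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x))

def group_delay(x, omega):
    """Baseline group delay at omega, computed from X and the transform of n*x[n]."""
    X = dtft(x, omega)
    Y = dtft([n * xn for n, xn in enumerate(x)], omega)
    # Blows up when |X| is close to zero, i.e. near spectral zeros.
    return (X.real * Y.real + X.imag * Y.imag) / (abs(X) ** 2)

# Hypothetical check signal: an impulse delayed by 5 samples has a constant delay of 5.
delayed_impulse = [0.0] * 16
delayed_impulse[5] = 1.0
delays = [group_delay(delayed_impulse, w) for w in (0.1, 1.0, 2.5)]
```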

New efficient window function, replacement for the hamming window

M. Mottaghi-Kashtiban, M. Shayesteh

Engineering, Computer Science

2011

A new, simple window function is presented which, for the same window order (M), has a main-lobe width less than or equal to that of the Hamming window while offering about 2-4.5 dB lower peak side-lobe amplitude, and which is computationally efficient for signal spectrum analysis.

Speaker Identification and Verification by Combining MFCC and Phase Information

S. Nakagawa, Longbiao Wang, Shinji Ohtsuka

Computer Science

IEEE Transactions on Audio, Speech, and Language…

2012

A phase information extraction method is proposed that normalizes the variation in phase according to the frame position of the input speech, and the phase information is combined with MFCCs in text-independent speaker identification and verification methods.

Speaker Verification Using Adapted Gaussian Mixture Models

D. Reynolds, T. Quatieri, R. B. Dunn

Computer Science

Digit. Signal Process.

2000

The major elements of MIT Lincoln Laboratory's Gaussian mixture model (GMM)-based speaker verification system used successfully in several NIST Speaker Recognition Evaluations (SREs) are described.

Support vector machines using GMM supervectors for speaker verification

W. Campbell, D. Sturim, D. Reynolds

Computer Science

IEEE Signal Processing Letters

2006

This work examines the idea of using the GMM supervector in a support vector machine (SVM) classifier and proposes two new SVM kernels based on distance metrics between GMM models that produce excellent classification accuracy in a NIST speaker recognition evaluation task.

SVM Based Speaker Verification using a GMM Supervector Kernel and NAP Variability Compensation

W. Campbell, D. Sturim, D. Reynolds, A. Solomonoff

Computer Science

2006 IEEE International Conference on Acoustics…

2006

A support vector machine kernel is constructed using the GMM supervector, and similarities, based on this kernel, between the method of SVM nuisance attribute projection (NAP) and recent results in latent factor analysis are shown.

On the use of windows for harmonic analysis with the discrete Fourier transform

F. Harris

Physics, Engineering

Proceedings of the IEEE

1978

A comprehensive catalog of data windows along with their significant performance parameters from which the different windows can be compared is included, and an example demonstrates the use and value of windows to resolve closely spaced harmonic signals characterized by large differences in amplitude.
