Comparison of speech parameterization techniques for the classification of speech disfluencies

Authors: CHONG YEN FOOK, HARIHARAN MUTHUSAMY, LIM SIN CHEE, SAZALI BIN YAACOB, ABDUL HAMID BIN ADOM

Abstract: Stuttering assessment through the manual classification of speech disfluencies is subjective, inconsistent, time-consuming, and prone to error. This paper compares the effectiveness of 3 speech feature extraction methods, namely mel-frequency cepstral coefficients (MFCCs), linear predictive coding (LPC)-based cepstral parameters, and perceptual linear predictive (PLP) analysis, for classifying 2 types of speech disfluencies (repetition and prolongation) in recorded disfluent speech samples. Three classifiers are employed: the k-nearest neighbor classifier, a linear discriminant analysis-based classifier, and the support vector machine. Speech samples are taken from the University College London Archive of Stuttered Speech, and stuttered events are identified through manual segmentation. A 10-fold cross-validation method is used to test the reliability of the classifier results. The effect of 2 parameters (LPC order and frame length) on the classification results of the LPC- and PLP-based methods is also investigated. The experimental results reveal that the proposed method can be used to help speech-language pathologists in classifying speech disfluencies.

Keywords: Disfluent speech, mel-frequency cepstral coefficient, linear predictive coding, perceptual linear predictive analysis, support vector machine
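
The evaluation protocol named in the abstract (10-fold cross-validation of a k-nearest neighbor classifier) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy 12-dimensional feature vectors, the choice of k = 3, and the random seeds are all assumptions made for illustration, and the synthetic data merely stand in for features extracted from real speech segments.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds=10, seed=0):
    """Shuffle the sample indices and deal them into n_folds disjoint folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbors."""
    # Squared Euclidean distance is enough for ranking neighbors.
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_validate(features, labels, n_folds=10, k=3, seed=0):
    """Return the mean held-out accuracy over n_folds folds."""
    accuracies = []
    for held_out in k_fold_indices(len(features), n_folds, seed):
        held = set(held_out)
        train_x = [x for i, x in enumerate(features) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        correct = sum(
            knn_predict(train_x, train_y, features[i], k) == labels[i]
            for i in held_out
        )
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def make_toy_data(n_per_class=50, dim=12, seed=1):
    """Hypothetical stand-in for extracted feature vectors (assumption)."""
    rng = random.Random(seed)
    features, labels = [], []
    for label, mean in (("repetition", 0.0), ("prolongation", 10.0)):
        for _ in range(n_per_class):
            features.append([rng.gauss(mean, 1.0) for _ in range(dim)])
            labels.append(label)
    return features, labels


features, labels = make_toy_data()
accuracy = cross_validate(features, labels, n_folds=10, k=3)
print(f"Mean 10-fold accuracy on toy data: {accuracy:.3f}")
```

Each sample is held out exactly once, so the averaged accuracy reflects performance on data the classifier did not see during training. Swapping `knn_predict` for another classifier leaves the fold logic unchanged.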
