Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model

Zulfiqar Ali, Irraivan Elamvazuthi, Mansour Alsulaiman, Ghulam Muhammad

Research output: Contribution to journal › Article

14 Citations (Scopus)

Abstract

Background and Objective 

Automatic voice pathology detection using sustained vowels has been widely explored. Because a sustained vowel yields a nearly stationary speech waveform, pathology detection with a sustained vowel is comparatively easier than detection with running speech. Some disorder detection systems for running speech have also been developed, although most of them rely on voice activity detection (VAD), which is itself a challenging task. Pathology detection with running speech needs more investigation, and systems with good accuracy (ACC) are required. Furthermore, pathology classification systems with running speech have not received any attention from the research community. In this article, automatic pathology detection and classification systems are developed using text-dependent running speech without adding a VAD module.

Method 

A set of three psychophysics conditions of hearing (critical band spectral estimation, equal loudness hearing curve, and the intensity loudness power law of hearing) is used to estimate the auditory spectrum. The auditory spectrum and all-pole models of the auditory spectrums are computed and analyzed and used in a Gaussian mixture model for an automatic decision.

Results

In the experiments using the Massachusetts Eye & Ear Infirmary database, an ACC of 99.56% is obtained for pathology detection, and an ACC of 93.33% is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech–based systems.

Discussion

The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.
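The three hearing-related steps named under Method (critical band spectral estimation, equal loudness weighting, and intensity-loudness power-law compression), followed by an all-pole fit, can be sketched roughly as below. This is an illustrative sketch and not the authors' implementation: the Bark warping and equal-loudness formula are the ones commonly used in perceptual linear prediction, the triangular band shape is a simplification, and the function names, `n_bands=17`, and `order=5` are assumptions chosen for the example. The Gaussian mixture model stage is omitted.

```python
import numpy as np

def hz_to_bark(f_hz):
    # Bark warping commonly used in perceptual linear prediction (assumed here).
    return 6.0 * np.arcsinh(np.asarray(f_hz, dtype=float) / 600.0)

def bark_to_hz(bark):
    # Inverse of hz_to_bark.
    return 600.0 * np.sinh(np.asarray(bark, dtype=float) / 6.0)

def equal_loudness(f_hz):
    # Common approximation of the 40 dB equal-loudness curve (assumed here).
    w2 = (2.0 * np.pi * np.asarray(f_hz, dtype=float)) ** 2
    return ((w2 + 56.8e6) * w2 ** 2) / ((w2 + 6.3e6) ** 2 * (w2 + 0.38e9))

def auditory_spectrum(frame, sample_rate, n_bands=17):
    """Estimate an auditory spectrum for one speech frame (hypothetical helper)."""
    frame = np.asarray(frame, dtype=float)
    # Short-time power spectrum of the windowed frame.
    power = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
    bark = hz_to_bark(np.fft.rfftfreq(len(frame), d=1.0 / sample_rate))
    # Band centres spaced evenly on the Bark axis, excluding the two end points.
    centers = np.linspace(bark[0], bark[-1], n_bands + 2)[1:-1]
    # Step 1: critical-band integration with triangular weights that fall to
    # zero 1 Bark from each centre (a simplification of the masking curve).
    weights = np.clip(1.0 - np.abs(bark[None, :] - centers[:, None]), 0.0, None)
    band_energy = weights @ power
    # Step 2: equal-loudness weighting at each band centre.
    loudness_weighted = equal_loudness(bark_to_hz(centers)) * band_energy
    # Step 3: intensity-loudness power law (cube-root compression).
    return np.cbrt(loudness_weighted)

def all_pole_coefficients(spectrum, order=5):
    """Fit an all-pole model to the auditory spectrum (autocorrelation method)."""
    # The inverse DFT of the spectrum gives autocorrelation-like values.
    r = np.fft.irfft(spectrum)[: order + 1]
    # Solve the Toeplitz normal equations R a = -r[1:].
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    a = np.linalg.solve(r[idx], -r[1:])
    return np.concatenate(([1.0], a))
```

In a setup like the one the abstract describes, features of this kind would be computed frame by frame over the running speech and passed to the classifier; the frame length and sampling rate are left to the caller here.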

Language: English
Pages: 757.e7-757.e19
Number of pages: 13
Journal: Journal of Voice
Volume: 30
Issue number: 6
Early online date: 27 Oct 2015
DOI: 10.1016/j.jvoice.2015.08.010
Publication status: Published - 30 Nov 2016

Fingerprint

Pathology
Hearing
Speech-Language Pathology
Psychophysics
Ear
Databases
Research

Keywords

  • All-pole model
  • Auditory spectrum
  • GMM
  • Running speech
  • Voice pathology classification
  • Voice pathology detection

Cite this

Ali, Zulfiqar ; Elamvazuthi, Irraivan ; Alsulaiman, Mansour ; Muhammad, Ghulam. / Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model. In: Journal of Voice. 2016 ; Vol. 30, No. 6. pp. 757.e7-757.e19.
@article{d34de4842b8e4c8480b45aa462fcbb47,
title = "Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model",
abstract = "Background and Objective Automatic voice pathology detection using sustained vowels has been widely explored. Because of the stationary nature of the speech waveform, pathology detection with a sustained vowel is a comparatively easier task than that using a running speech. Some disorder detection systems with running speech have also been developed, although most of them are based on a voice activity detection (VAD), that is, itself a challenging task. Pathology detection with running speech needs more investigation, and systems with good accuracy (ACC) are required. Furthermore, pathology classification systems with running speech have not received any attention from the research community. In this article, automatic pathology detection and classification systems are developed using text-dependent running speech without adding a VAD module. Method A set of three psychophysics conditions of hearing (critical band spectral estimation, equal loudness hearing curve, and the intensity loudness power law of hearing) is used to estimate the auditory spectrum. The auditory spectrum and all-pole models of the auditory spectrums are computed and analyzed and used in a Gaussian mixture model for an automatic decision. Results In the experiments using the Massachusetts Eye \& Ear Infirmary database, an ACC of 99.56{\%} is obtained for pathology detection, and an ACC of 93.33{\%} is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech–based systems. Discussion The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.",
keywords = "All-pole model, Auditory spectrum, GMM, Running speech, Voice pathology classification, Voice pathology detection",
author = "Zulfiqar Ali and Irraivan Elamvazuthi and Mansour Alsulaiman and Ghulam Muhammad",
year = "2016",
month = "11",
day = "30",
doi = "10.1016/j.jvoice.2015.08.010",
language = "English",
volume = "30",
pages = "757.e7--757.e19",
number = "6",
journal = "Journal of Voice",
}

Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model. / Ali, Zulfiqar; Elamvazuthi, Irraivan; Alsulaiman, Mansour; Muhammad, Ghulam.

In: Journal of Voice, Vol. 30, No. 6, 30.11.2016, p. 757.e7-757.e19.


TY - JOUR

T1 - Automatic Voice Pathology Detection With Running Speech by Using Estimation of Auditory Spectrum and Cepstral Coefficients Based on the All-Pole Model

AU - Ali, Zulfiqar

AU - Elamvazuthi, Irraivan

AU - Alsulaiman, Mansour

AU - Muhammad, Ghulam

PY - 2016/11/30

Y1 - 2016/11/30

AB - Background and Objective Automatic voice pathology detection using sustained vowels has been widely explored. Because of the stationary nature of the speech waveform, pathology detection with a sustained vowel is a comparatively easier task than that using a running speech. Some disorder detection systems with running speech have also been developed, although most of them are based on a voice activity detection (VAD), that is, itself a challenging task. Pathology detection with running speech needs more investigation, and systems with good accuracy (ACC) are required. Furthermore, pathology classification systems with running speech have not received any attention from the research community. In this article, automatic pathology detection and classification systems are developed using text-dependent running speech without adding a VAD module. Method A set of three psychophysics conditions of hearing (critical band spectral estimation, equal loudness hearing curve, and the intensity loudness power law of hearing) is used to estimate the auditory spectrum. The auditory spectrum and all-pole models of the auditory spectrums are computed and analyzed and used in a Gaussian mixture model for an automatic decision. Results In the experiments using the Massachusetts Eye & Ear Infirmary database, an ACC of 99.56% is obtained for pathology detection, and an ACC of 93.33% is obtained for the pathology classification system. The results of the proposed systems outperform the existing running-speech–based systems. Discussion The developed system can effectively be used in voice pathology detection and classification systems, and the proposed features can visually differentiate between normal and pathological samples.

KW - All-pole model

KW - Auditory spectrum

KW - GMM

KW - Running speech

KW - Voice pathology classification

KW - Voice pathology detection

UR - http://www.scopus.com/inward/record.url?scp=84949997124&partnerID=8YFLogxK

U2 - 10.1016/j.jvoice.2015.08.010

DO - 10.1016/j.jvoice.2015.08.010

M3 - Article

JO - Journal of Voice

VL - 30

SP - 757.e7

EP - 757.e19

IS - 6

ER -