Detection of Voice Pathology using Fractal Dimension in a Multiresolution Analysis of Normal and Disordered Speech Signals

Zulfiqar Ali, Irraivan Elamvazuthi, Mansour Alsulaiman, Ghulam Muhammad

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

Voice disorders are associated with irregular vibrations of the vocal folds. According to the source-filter theory of speech production, these irregular vibrations can be detected non-invasively by analyzing the speech signal. In this paper we present a multiband approach for the detection of voice disorders, given that the voice source generally interacts with the vocal tract in a non-linear way. In normal phonation, and assuming sustained phonation of a vowel, the lower frequencies of speech are heavily source dependent because of the low-frequency glottal formant, while the higher frequencies are less dependent on the source signal. During abnormal phonation this still holds, but turbulent noise at the source, caused by the irregular vibration, also affects the higher frequencies. Motivated by this model, we propose a multiband approach based on a three-level discrete wavelet transformation (DWT), in which the fractal dimension (FD) of the estimated power spectrum is computed in each band. The experiments suggest that the 1–1562 Hz band, which contains the lower frequencies after level 3, exhibits a significant difference between the spectra of normal and pathological subjects. With this band, a detection rate of 91.28 % is obtained with a single feature, which is higher than the rate obtained with any other frequency band. Moreover, an accuracy of 92.45 % and an area under the receiver operating characteristic curve (AUC) of 95.06 % are obtained when the FDs of all levels are fused. Likewise, when the FDs of all levels are combined with 22 Multi-Dimensional Voice Program (MDVP) parameters, improvements of 2.26 % in accuracy and 1.45 % in AUC are observed.
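The multiband idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several choices are assumptions because the abstract does not specify them: the Haar wavelet, a plain periodogram in dB as the power spectrum estimate, the Katz algorithm (one of the two FD algorithms named in the keywords) with Euclidean step lengths at unit sample spacing, and a 25 kHz sampling rate, chosen because it makes the level-3 low band end at 1562.5 Hz. All function names are hypothetical.

```python
import numpy as np

def haar_dwt_step(x):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if len(x) % 2:                       # drop the last sample so pairs line up
        x = x[:-1]
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)

def three_level_bands(x, fs):
    """Split x into the four subbands of a three-level DWT.

    Returns a list of (label, nominal_low_hz, nominal_high_hz, coefficients).
    The band edges are the ideal dyadic ones; real wavelet filters overlap.
    """
    bands = []
    approx = x
    high = fs / 2.0                      # Nyquist frequency
    for level in (1, 2, 3):
        approx, detail = haar_dwt_step(approx)
        bands.append((f"D{level}", high / 2.0, high, detail))
        high /= 2.0
    bands.append(("A3", 0.0, high, approx))
    return bands

def katz_fd(curve):
    """Katz fractal dimension of a 1-D curve sampled at unit spacing."""
    y = np.asarray(curve, dtype=float)
    n = len(y) - 1                                   # number of steps
    total_len = np.sqrt(1.0 + np.diff(y) ** 2).sum() # total curve length
    idx = np.arange(len(y))
    diameter = np.sqrt(idx ** 2 + (y - y[0]) ** 2).max()  # farthest point from the first
    return np.log10(n) / (np.log10(n) + np.log10(diameter / total_len))

def band_fd_features(x, fs):
    """One FD value per subband, computed on the subband's dB periodogram."""
    feats = {}
    for label, _lo, _hi, coeffs in three_level_bands(x, fs):
        power = np.abs(np.fft.rfft(coeffs)) ** 2
        log_power = 10.0 * np.log10(power + 1e-12)   # small floor avoids log(0)
        feats[label] = katz_fd(log_power)
    return feats

# Usage on a synthetic signal (assumed 25 kHz sampling, 1 s duration).
fs = 25000
rng = np.random.default_rng(0)
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 150.0 * t) + 0.1 * rng.standard_normal(fs)
features = band_fd_features(signal, fs)  # keys: "D1", "D2", "D3", "A3"
```

Under the assumed sampling rate, the "A3" band nominally spans 0 to 1562.5 Hz, which corresponds to the low-frequency band highlighted in the abstract. The resulting per-band FD values would then be passed to a classifier, a step this sketch leaves out.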

Language: English
Article number: 20
Pages: 1-10
Number of pages: 10
Journal: Journal of Medical Systems
Volume: 40
Issue number: 1
DOI: 10.1007/s10916-015-0392-2
Publication status: Published - Jan 2016

Fingerprint

Phonation
Multiresolution analysis
Fractals
Pathology
Fractal dimension
Vibration
Voice Disorders
Area Under Curve
Vocal Cords
ROC Curve
Noise
Frequency bands
Power spectrum

Keywords

  • Fractal dimension
  • Higuchi algorithm
  • Katz algorithm
  • MDVP parameters
  • Voice pathology detection
  • Wavelet transformation

Cite this

Ali, Zulfiqar; Elamvazuthi, Irraivan; Alsulaiman, Mansour; Muhammad, Ghulam. / Detection of Voice Pathology using Fractal Dimension in a Multiresolution Analysis of Normal and Disordered Speech Signals. In: Journal of Medical Systems. 2016; Vol. 40, No. 1. pp. 1-10.
@article{76ded470c9604a2e97ad121649e220e6,
title = "Detection of Voice Pathology using Fractal Dimension in a Multiresolution Analysis of Normal and Disordered Speech Signals",
abstract = "Voice disorders are associated with irregular vibrations of the vocal folds. According to the source-filter theory of speech production, these irregular vibrations can be detected non-invasively by analyzing the speech signal. In this paper we present a multiband approach for the detection of voice disorders, given that the voice source generally interacts with the vocal tract in a non-linear way. In normal phonation, and assuming sustained phonation of a vowel, the lower frequencies of speech are heavily source dependent because of the low-frequency glottal formant, while the higher frequencies are less dependent on the source signal. During abnormal phonation this still holds, but turbulent noise at the source, caused by the irregular vibration, also affects the higher frequencies. Motivated by this model, we propose a multiband approach based on a three-level discrete wavelet transformation (DWT), in which the fractal dimension (FD) of the estimated power spectrum is computed in each band. The experiments suggest that the 1–1562 Hz band, which contains the lower frequencies after level 3, exhibits a significant difference between the spectra of normal and pathological subjects. With this band, a detection rate of 91.28 {\%} is obtained with a single feature, which is higher than the rate obtained with any other frequency band. Moreover, an accuracy of 92.45 {\%} and an area under the receiver operating characteristic curve (AUC) of 95.06 {\%} are obtained when the FDs of all levels are fused. Likewise, when the FDs of all levels are combined with 22 Multi-Dimensional Voice Program (MDVP) parameters, improvements of 2.26 {\%} in accuracy and 1.45 {\%} in AUC are observed.",
keywords = "Fractal dimension, Higuchi algorithm, Katz algorithm, MDVP parameters, Voice pathology detection, Wavelet transformation",
author = "Zulfiqar Ali and Irraivan Elamvazuthi and Mansour Alsulaiman and Ghulam Muhammad",
year = "2016",
month = "1",
doi = "10.1007/s10916-015-0392-2",
language = "English",
volume = "40",
pages = "1--10",
journal = "Journal of Medical Systems",
issn = "0148-5598",
number = "1",

}

Detection of Voice Pathology using Fractal Dimension in a Multiresolution Analysis of Normal and Disordered Speech Signals. / Ali, Zulfiqar; Elamvazuthi, Irraivan; Alsulaiman, Mansour; Muhammad, Ghulam.

In: Journal of Medical Systems, Vol. 40, No. 1, 20, 01.2016, p. 1-10.

TY - JOUR

T1 - Detection of Voice Pathology using Fractal Dimension in a Multiresolution Analysis of Normal and Disordered Speech Signals

AU - Ali, Zulfiqar

AU - Elamvazuthi, Irraivan

AU - Alsulaiman, Mansour

AU - Muhammad, Ghulam

PY - 2016/1

Y1 - 2016/1

AB - Voice disorders are associated with irregular vibrations of the vocal folds. According to the source-filter theory of speech production, these irregular vibrations can be detected non-invasively by analyzing the speech signal. In this paper we present a multiband approach for the detection of voice disorders, given that the voice source generally interacts with the vocal tract in a non-linear way. In normal phonation, and assuming sustained phonation of a vowel, the lower frequencies of speech are heavily source dependent because of the low-frequency glottal formant, while the higher frequencies are less dependent on the source signal. During abnormal phonation this still holds, but turbulent noise at the source, caused by the irregular vibration, also affects the higher frequencies. Motivated by this model, we propose a multiband approach based on a three-level discrete wavelet transformation (DWT), in which the fractal dimension (FD) of the estimated power spectrum is computed in each band. The experiments suggest that the 1–1562 Hz band, which contains the lower frequencies after level 3, exhibits a significant difference between the spectra of normal and pathological subjects. With this band, a detection rate of 91.28 % is obtained with a single feature, which is higher than the rate obtained with any other frequency band. Moreover, an accuracy of 92.45 % and an area under the receiver operating characteristic curve (AUC) of 95.06 % are obtained when the FDs of all levels are fused. Likewise, when the FDs of all levels are combined with 22 Multi-Dimensional Voice Program (MDVP) parameters, improvements of 2.26 % in accuracy and 1.45 % in AUC are observed.

KW - Fractal dimension

KW - Higuchi algorithm

KW - Katz algorithm

KW - MDVP parameters

KW - Voice pathology detection

KW - Wavelet transformation

UR - http://www.scopus.com/inward/record.url?scp=84946047730&partnerID=8YFLogxK

U2 - 10.1007/s10916-015-0392-2

DO - 10.1007/s10916-015-0392-2

M3 - Article

VL - 40

SP - 1

EP - 10

JO - Journal of Medical Systems

T2 - Journal of Medical Systems

JF - Journal of Medical Systems

SN - 0148-5598

IS - 1

M1 - 20

ER -