Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions

Ahmed Al-Nasheri, Ghulam Muhammad, Mansour Alsulaiman, Zulfiqar Ali, Khalid H. Malki, Tamer A. Mesallam, Mohamed Farahat Ibrahim

Research output: Contribution to journal › Article

Abstract

Automatic voice pathology detection and classification systems contribute to the assessment of voice disorders by enabling early detection of voice pathologies and diagnosis of the type of pathology from which a patient suffers. This paper develops an accurate and robust feature extraction method for detecting and classifying voice pathologies by investigating different frequency bands using autocorrelation and entropy. Using autocorrelation, we extracted the maximum peak value and its corresponding lag from each frame of a voiced signal as features for detecting and classifying pathological samples. We also extracted the entropy of each frame of the voice signal, after normalizing its values, as a feature. These features were investigated in distinct frequency bands to assess the contribution of each band to the detection and classification processes. Samples of the sustained vowel /a/ for both normal and pathological voices were taken from three databases in English, German, and Arabic. A support vector machine was used as the classifier. We also performed U-tests to investigate whether the means of the normal and pathological samples differ significantly. The best accuracies achieved in both detection and classification varied with the band, method, and database used. The bands that contributed most to both detection and classification lay between 1000 and 8000 Hz. The highest detection accuracies were 99.69%, 92.79%, and 99.79% for the Massachusetts Eye and Ear Infirmary (MEEI) database, the Saarbrücken Voice Database (SVD), and the Arabic Voice Pathology Database (AVPD), respectively. The highest classification accuracies were 99.54%, 99.53%, and 96.02% for MEEI, SVD, and AVPD, respectively, using the combined feature.
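
The per-frame features named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify frame length, lag search range, or how values are normalized before the entropy is taken. The function names, the `min_lag` parameter, and the choice of a min-max normalized amplitude histogram for the entropy are all assumptions made for illustration. Band-splitting and the support vector machine stage are omitted.

```python
import math


def split_frames(signal, frame_len, hop):
    """Split a sequence of samples into frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def autocorr_peak(frame, min_lag):
    """Return (peak value, lag) of the normalized autocorrelation for lags >= min_lag.

    min_lag is an assumption of this sketch: it skips the trivial maximum at lag 0.
    """
    n = len(frame)
    if not 0 < min_lag < n:
        raise ValueError("min_lag must satisfy 0 < min_lag < len(frame)")
    mean = sum(frame) / n
    x = [v - mean for v in frame]
    energy = sum(v * v for v in x)
    if energy == 0.0:  # constant frame: no meaningful peak
        return 0.0, 0
    best_val, best_lag = -math.inf, 0
    for lag in range(min_lag, n):
        # Biased estimate: the sum shrinks as the lag grows.
        r = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if r > best_val:
            best_val, best_lag = r, lag
    return best_val, best_lag


def frame_entropy(frame, bins=16):
    """Shannon entropy (bits) of a histogram of min-max normalized amplitudes.

    The histogram-based definition is an assumption, not taken from the paper.
    """
    lo, hi = min(frame), max(frame)
    if hi == lo:
        return 0.0
    counts = [0] * bins
    for v in frame:
        idx = min(int((v - lo) / (hi - lo) * bins), bins - 1)
        counts[idx] += 1
    n = len(frame)
    return -sum((c / n) * math.log2(c / n) for c in counts if c)


if __name__ == "__main__":
    # Synthetic tone with a period of 20 samples, standing in for a voiced frame.
    tone = [math.sin(2 * math.pi * i / 20) for i in range(200)]
    for f in split_frames(tone, frame_len=100, hop=50):
        print(autocorr_peak(f, min_lag=10), frame_entropy(f))
```

For a strictly periodic input such as the synthetic tone, the reported lag matches the period in samples. In a setup like the one the abstract describes, such per-frame values would be computed separately for each frequency band before being passed to a classifier.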

Original language: English
Pages (from-to): 6961-6974
Number of pages: 14
Journal: IEEE Access
Volume: 6
DOIs: https://doi.org/10.1109/ACCESS.2017.2696056
Publication status: Published - 20 Apr 2017

Fingerprint

Pathology
Autocorrelation
Entropy
Frequency bands
Support vector machines
Feature extraction
Classifiers

Keywords

  • Arabic voice pathology database (AVPD)
  • frequency investigation
  • Massachusetts eye and ear infirmary (MEEI)
  • Saarbrücken voice database (SVD)
  • Voice pathology detection and classification

Cite this

Al-Nasheri, A., Muhammad, G., Alsulaiman, M., Ali, Z., Malki, K. H., Mesallam, T. A., & Farahat Ibrahim, M. (2017). Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions. IEEE Access, 6, 6961-6974. https://doi.org/10.1109/ACCESS.2017.2696056
Al-Nasheri, Ahmed ; Muhammad, Ghulam ; Alsulaiman, Mansour ; Ali, Zulfiqar ; Malki, Khalid H. ; Mesallam, Tamer A. ; Farahat Ibrahim, Mohamed. / Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions. In: IEEE Access. 2017 ; Vol. 6. pp. 6961-6974.
@article{9803b357bf9a4ac298fe50a3ef6dfeb1,
title = "Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions",
abstract = "Automatic voice pathology detection and classification systems effectively contribute to the assessment of voice disorders, enabling the early detection of voice pathologies and the diagnosis of the type of pathology from which patients suffer. This paper concentrates on developing an accurate and robust feature extraction for detecting and classifying voice pathologies by investigating different frequency bands using autocorrelation and entropy. We extracted maximum peak values and their corresponding lag values from each frame of a voiced signal by using autocorrelation as features to detect and classify pathological samples. We also extracted the entropy for each frame of the voice signal after we normalized its values to be used as the features. These features were investigated in distinct frequency bands to assess the contribution of each band to the detection and classification processes. Various samples of the sustained vowel /a/ for both normal and pathological voices were extracted from three different databases in English, German, and Arabic. A support vector machine was used as a classifier. We also performed u-Tests to investigate if there is a significant difference between the means of the normal and pathological samples. The best achieved accuracies in both detection and classification varied depending on the used band, method, and database. The most contributive bands in both detection and classification were between 1000 and 8000 Hz. The highest obtained accuracies in the case of detection were 99.69{\%}, 92.79{\%}, and 99.79{\%} for Massachusetts eye and ear infirmary (MEEI), Saarbr{\"u}cken voice database (SVD), and Arabic voice pathology database (AVPD), respectively. However, the highest achieved accuracies for classification were 99.54{\%}, 99.53{\%}, and 96.02{\%} for MEEI, SVD, and AVPD, correspondingly, using the combined feature.",
keywords = "Arabic voice pathology database (AVPD), frequency investigation, Massachusetts eye and ear infirmary (MEEI), Saarbr{\"u}cken voice database (SVD), Voice pathology detection and classification",
author = "Al-Nasheri, Ahmed and Muhammad, Ghulam and Alsulaiman, Mansour and Ali, Zulfiqar and Malki, {Khalid H.} and Mesallam, {Tamer A.} and {Farahat Ibrahim}, Mohamed",
year = "2017",
month = "4",
day = "20",
doi = "10.1109/ACCESS.2017.2696056",
language = "English",
volume = "6",
pages = "6961--6974",
journal = "IEEE Access",
issn = "2169-3536",

}

Al-Nasheri, A, Muhammad, G, Alsulaiman, M, Ali, Z, Malki, KH, Mesallam, TA & Farahat Ibrahim, M 2017, 'Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions', IEEE Access, vol. 6, pp. 6961-6974. https://doi.org/10.1109/ACCESS.2017.2696056

Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions. / Al-Nasheri, Ahmed; Muhammad, Ghulam; Alsulaiman, Mansour; Ali, Zulfiqar; Malki, Khalid H.; Mesallam, Tamer A.; Farahat Ibrahim, Mohamed.

In: IEEE Access, Vol. 6, 20.04.2017, p. 6961-6974.

TY - JOUR
T1 - Voice Pathology Detection and Classification Using Auto-Correlation and Entropy Features in Different Frequency Regions
AU - Al-Nasheri, Ahmed
AU - Muhammad, Ghulam
AU - Alsulaiman, Mansour
AU - Ali, Zulfiqar
AU - Malki, Khalid H.
AU - Mesallam, Tamer A.
AU - Farahat Ibrahim, Mohamed
PY - 2017/4/20
Y1 - 2017/4/20
N2 - Automatic voice pathology detection and classification systems effectively contribute to the assessment of voice disorders, enabling the early detection of voice pathologies and the diagnosis of the type of pathology from which patients suffer. This paper concentrates on developing an accurate and robust feature extraction for detecting and classifying voice pathologies by investigating different frequency bands using autocorrelation and entropy. We extracted maximum peak values and their corresponding lag values from each frame of a voiced signal by using autocorrelation as features to detect and classify pathological samples. We also extracted the entropy for each frame of the voice signal after we normalized its values to be used as the features. These features were investigated in distinct frequency bands to assess the contribution of each band to the detection and classification processes. Various samples of the sustained vowel /a/ for both normal and pathological voices were extracted from three different databases in English, German, and Arabic. A support vector machine was used as a classifier. We also performed u-Tests to investigate if there is a significant difference between the means of the normal and pathological samples. The best achieved accuracies in both detection and classification varied depending on the used band, method, and database. The most contributive bands in both detection and classification were between 1000 and 8000 Hz. The highest obtained accuracies in the case of detection were 99.69%, 92.79%, and 99.79% for Massachusetts eye and ear infirmary (MEEI), Saarbrücken voice database (SVD), and Arabic voice pathology database (AVPD), respectively. However, the highest achieved accuracies for classification were 99.54%, 99.53%, and 96.02% for MEEI, SVD, and AVPD, correspondingly, using the combined feature.

KW - Arabic voice pathology database (AVPD)
KW - frequency investigation
KW - Massachusetts eye and ear infirmary (MEEI)
KW - Saarbrücken voice database (SVD)
KW - Voice pathology detection and classification
UR - http://www.scopus.com/inward/record.url?scp=85042525847&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2017.2696056
DO - 10.1109/ACCESS.2017.2696056
M3 - Article
AN - SCOPUS:85042525847
VL - 6
SP - 6961
EP - 6974
JO - IEEE Access
JF - IEEE Access
SN - 2169-3536
ER -