KSU rich Arabic speech database

Mansour Alsulaiman, Ghulam Muhammad, Mohamed A. Bencherif, Awais Mahmood, Zulfiqar Ali

Research output: Contribution to journal (Article)

12 Citations (Scopus)

Abstract

Arabic is one of the major languages in the world. Unfortunately, little research has been done on Arabic speaker recognition. One main reason for this lack of research is the unavailability of rich Arabic speech databases. In this paper, we present a rich and comprehensive Arabic speech database that we developed for Arabic speaker/speech recognition research and/or applications. The database is rich in several aspects: (a) it has 257 speakers; (b) the speakers are from different ethnic groups: Saudis, Arabs, and non-Arabs; (c) utterances are both read text and spontaneous; (d) scripts are of different kinds, such as isolated words, digits, phonetically rich words, sentences, phonetically balanced sentences, paragraphs, etc.; (e) different sets of microphones of medium and high quality were used; (f) it includes telephony and non-telephony speech; (g) there are three different recording environments: office, soundproof room, and cafeteria; (h) there are three different recording sessions, scheduled at least 2 weeks apart. Because of its richness, the database can be used in many areas of Arabic and non-Arabic speech processing research, such as speaker/speech recognition, speech analysis, accent identification, ethnic group/nationality recognition, etc. This makes it a valuable resource for research in Arabic speech processing in particular and in speech processing in general. The database was carefully verified manually, and the manual verification was complemented with automatic verification. Validation was performed on a subset of the database, where the recognition rate reached 100% for Saudi speakers and 96% for non-Saudi speakers using a system with 12 Mel-frequency cepstral coefficients and 32 Gaussian mixtures.
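
The abstract names only the feature size (12 Mel-frequency cepstral coefficients) and the model size (32 Gaussian mixtures) of the validation system. As a rough illustration of how such a system typically decides who is speaking, the sketch below scores feature frames against one Gaussian mixture model per speaker and picks the best match. Everything beyond the number 12 is an assumption made for illustration: diagonal covariances, the maximum-average-log-likelihood decision rule, the helper names, the two-component toy models (instead of 32 components), and the synthetic frames that stand in for real MFCCs. It is not the authors' implementation.

```python
import math
import random

N_MFCC = 12  # feature dimension quoted in the abstract


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at vector x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log p(x) under a GMM given as a list of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def identify(frames, models):
    """Return the speaker whose GMM gives the highest mean frame log-likelihood."""
    def score(gmm):
        return sum(gmm_log_likelihood(f, gmm) for f in frames) / len(frames)

    return max(models, key=lambda name: score(models[name]))


def toy_gmm(center, n_components=2):
    """Hypothetical stand-in for a trained model: equal-weight components
    placed near `center`, each with unit variance."""
    weight = 1.0 / n_components
    return [
        (weight, [center + 0.5 * k] * N_MFCC, [1.0] * N_MFCC)
        for k in range(n_components)
    ]


# Two hypothetical enrolled speakers with well-separated models.
models = {"speaker_a": toy_gmm(0.0), "speaker_b": toy_gmm(5.0)}

# Synthetic test frames drawn near speaker_b's model (not real MFCCs).
rng = random.Random(0)
frames = [[rng.gauss(5.0, 1.0) for _ in range(N_MFCC)] for _ in range(50)]

print(identify(frames, models))
```

In a real experiment the per-speaker models would be trained on MFCC frames extracted from enrollment recordings, and the recognition rate would be the fraction of test utterances for which `identify` returns the correct speaker.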

Language: English
Pages: 4231-4253
Number of pages: 23
Journal: Information (Japan)
Volume: 16
Issue number: 6 B
Publication status: Published - Jun 2013

Fingerprint

Speech processing
Speech recognition
Speech analysis
Microphones
Acoustic waves

Keywords

  • Arabic speech database
  • Phonetically
  • Rich database
  • Speaker recognition
  • Speech corpus

Cite this

Alsulaiman, M., Muhammad, G., Bencherif, M. A., Mahmood, A., & Ali, Z. (2013). KSU rich Arabic speech database. Information (Japan), 16(6 B), 4231-4253.
@article{f62bef2137f946a98214d765888d4090,
title = "KSU rich Arabic speech database",
keywords = "Arabic speech database, Phonetically, Rich database, Speaker recognition, Speech corpus",
author = "Mansour Alsulaiman and Ghulam Muhammad and Bencherif, {Mohamed A.} and Awais Mahmood and Zulfiqar Ali",
year = "2013",
month = "6",
language = "English",
journal = "Information (Japan)",
volume = "16",
pages = "4231--4253",
number = "6 B",
}


TY - JOUR
T1 - KSU rich Arabic speech database
AU - Alsulaiman, Mansour
AU - Muhammad, Ghulam
AU - Bencherif, Mohamed A.
AU - Mahmood, Awais
AU - Ali, Zulfiqar
PY - 2013/6
Y1 - 2013/6
KW - Arabic speech database
KW - Phonetically
KW - Rich database
KW - Speaker recognition
KW - Speech corpus
UR - http://www.scopus.com/inward/record.url?scp=84881093026&partnerID=8YFLogxK
M3 - Article
VL - 16
SP - 4231
EP - 4253
IS - 6 B
ER -
