Comprehensive Phonological Analysis for Clinical Implication Using Self-Attention Based Grapheme to Phoneme Modeling Under Low-Resource Conditions

Puneet Bawa, Virender Kadyan, Muskaan Singh

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

In speech recognition, low-resource conditions pose a significant obstacle: the available speech data is both scarce and heterogeneous. The problem is sharper in clinical environments, where accurate transcription of speech is essential to identifying and managing speech and language disorders. Current manual methods for developing language models (LMs) and recognizing speech often struggle in low-resource scenarios and adapt poorly to the distinct speech patterns of diverse demographics. This study addresses the problem through automated language modeling, with particular emphasis on n-gram LMs. It presents a method that automates language model development using multi-head self-attention, transformer-based Grapheme-to-Phoneme (G2P) modeling. The results indicate that automated language models outperform manually created alternatives and show strong adaptability and dependability. The study also investigates the transformative potential of n-gram language models, which yield a notable increase in recognition accuracy for a speech recognition system based on the Deep Neural Network-Hidden Markov Model (DNN-HMM).
Original language: English
Title of host publication: 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science (AICS)
Publisher: IEEE
ISBN (Electronic): 979-8-3503-6021-9
ISBN (Print): 979-8-3503-6022-6
Publication status: Published (in print/issue) - 20 Mar 2024
Event: 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science (AICS) - Letterkenny, Ireland
Duration: 7 Dec 2023 - 8 Dec 2023
Conference number: 2023

Publication series

Name: 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science, AICS 2023

Conference

Conference: 2023 31st Irish Conference on Artificial Intelligence and Cognitive Science (AICS)
Abbreviated title: AICS
Country/Territory: Ireland
City: Letterkenny
Period: 7/12/23 - 8/12/23

Bibliographical note

Publisher Copyright:
© 2023 IEEE.

Keywords

  • Adaptation models
  • Sociology
  • Refining
  • Speech recognition
  • Medical services
  • Manuals
  • Linguistics
  • Clinical Applications
  • G2P Modeling
  • Language Modeling
