Attention-Inspired Artificial Neural Networks for Speech Processing: A Systematic Review

Noel Zacarias-Morales, Pablo Pancardo, José Adán Hernández-Nolasco, Matias Garcia-Constantino

Research output: Contribution to journal › Article › peer-review


Abstract

Artificial Neural Networks (ANNs) were inspired by the neural networks of the human brain and have been widely applied in speech processing. Their application areas include speech recognition, speech emotion recognition, language identification, speech enhancement, and speech separation, amongst others. Likewise, given that speech processing performed by humans involves complex cognitive processes known as auditory attention, a growing number of papers propose ANNs supported by deep learning algorithms in conjunction with some mechanism to achieve symmetry with the human attention process. However, while these ANN approaches include attention, there is no categorization of the attention mechanisms integrated into deep learning algorithms or of their relation to human auditory attention. Therefore, we consider it necessary to review the different attention-inspired ANN approaches to show both academic and industry experts the models available for a wide variety of applications. Based on the PRISMA methodology, we present a systematic review of the literature published since 2000 in which deep learning algorithms are applied to diverse problems related to speech processing. In this paper, 133 research works are selected and the following aspects are described: (i) most relevant features, (ii) ways in which attention has been implemented, (iii) their hypothetical relationship with human attention, and (iv) the evaluation metrics used. Additionally, the four publications most closely related to human attention were analyzed and their strengths and weaknesses were determined.

Original language: English
Article number: 214
Journal: Symmetry
Volume: 13
Issue number: 2
Publication status: Published (in print/issue) - 28 Jan 2021

Bibliographical note

Funding Information:
Funding: This research was partially funded by CONACYT.

Funding Information:
Acknowledgments: We want to express our gratitude to the Consejo Nacional de Ciencia y Tecnologia (CONACyT) and the Juarez Autonomous University of Tabasco (UJAT) for supporting us with the necessary academic resources for this research.

Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.

Copyright:
Copyright 2021 Elsevier B.V., All rights reserved.

Keywords

  • Artificial neural networks
  • Attention
  • Deep learning
  • Speech
  • Systematic review
