Enhancing Speech Emotion Recognition Using Deep Convolutional Neural Networks

M M Manjurul Islam, Md Alamgir Kabir, Alamin Sheikh, Muhammad Saiduzzaman, Abdelakram Hafid, Saad Abdullah

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review


Abstract

Speech emotion recognition (SER) is a pivotal research area with significant importance in a variety of real-time applications, such as assessing human behavior and analyzing the emotional states of speakers in emergency situations. This paper assesses the capabilities of deep convolutional neural networks (CNNs) in this context. Both CNN and Long Short-Term Memory (LSTM) based deep neural networks are evaluated for voice emotion identification. The empirical evaluation uses the Toronto Emotional Speech Set (TESS) database, which comprises speech samples from both young and old individuals and covers seven distinct emotions: anger, happiness, sadness, fear, surprise, disgust, and neutrality. To augment the dataset, variations in voice are introduced along with the addition of white noise. The empirical findings indicate that the CNN model outperforms existing studies on SER using the TESS corpus, yielding a 21% improvement in average recognition accuracy. This work underscores SER’s significance and highlights the potential of deep CNNs for enhancing its effectiveness in real-time applications, particularly in high-stakes emergency situations.
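The abstract mentions augmenting the dataset by adding white noise but does not give the procedure or its parameters. The following is a minimal sketch of one common way to do this, not the authors' implementation: Gaussian white noise is scaled to a target signal-to-noise ratio and added to the waveform. The function name `add_white_noise`, the `snr_db` parameter, the 20 dB value, and the synthetic tone used as input are all assumptions made for illustration.

```python
import math
import random


def add_white_noise(samples, snr_db, seed=None):
    """Return a copy of `samples` with Gaussian white noise at the given SNR (dB).

    Hypothetical helper for illustration; the paper does not specify its method.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    rng = random.Random(seed)
    # Average power of the clean signal.
    signal_power = sum(x * x for x in samples) / len(samples)
    # Noise power that yields the requested signal-to-noise ratio.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in samples]


# Assumed input: one second of a 220 Hz tone at 16 kHz, standing in for a speech clip.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 220 * n / sample_rate) for n in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=20, seed=0)
```

Specifying the noise level as an SNR rather than a fixed amplitude keeps the perturbation proportional to each clip's loudness, which matters when recordings differ in volume. In practice the noisy copy would be added to the training set alongside the original clip with the same emotion label.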
Original language: English
Title of host publication: Proceedings of the 2024 9th International Conference on Machine Learning Technologies
Pages: 95-100
Number of pages: 6
ISBN (Electronic): 9798400716379
Publication status: Published (in print/issue) - 11 Sept 2024
Event: 2024 9th International Conference on Machine Learning Technologies - Oslo, Norway
Duration: 24 May 2024 to 26 May 2024
https://www.icmlt.org/

Publication series

Name: 2024 9th International Conference on Machine Learning Technologies (ICMLT)
Publisher: Association for Computing Machinery

Conference

Conference: 2024 9th International Conference on Machine Learning Technologies
Abbreviated title: ICMLT 2024
Country/Territory: Norway
City: Oslo
Period: 24/05/24 to 26/05/24

Bibliographical note

Publisher Copyright:
© 2024 Owner/Author.

Keywords

  • Speech corpus
  • Human speech emotion recognition
  • Convolutional neural network applications
  • Long short-term memory neural networks
