Abstract
Speech emotion recognition (SER) is an important research area with real-time applications such as assessing human behavior and analyzing the emotional states of speakers in emergency situations. This paper assesses the capabilities of deep convolutional neural networks (CNNs) in this context, evaluating both CNN-based and Long Short-Term Memory (LSTM)-based deep neural networks for voice emotion identification. The empirical evaluation uses the Toronto Emotional Speech Set (TESS) database, which comprises speech samples from both young and old speakers covering seven emotions: anger, happiness, sadness, fear, surprise, disgust, and neutrality. The dataset is augmented by introducing variations in voice and by adding white noise. The results indicate that the CNN model outperforms existing SER studies on the TESS corpus, with a 21% improvement in average recognition accuracy. This work underscores the significance of SER and highlights the potential of deep CNNs to improve its effectiveness in real-time applications, particularly in high-stakes emergency situations.
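The abstract mentions white-noise augmentation but does not state how the noise level was chosen. As an illustration only, the following minimal Python sketch adds Gaussian white noise to a waveform at a target signal-to-noise ratio. The function name `add_white_noise`, the 20 dB default, and the synthetic 440 Hz test tone are assumptions made for this example and are not taken from the paper.

```python
import math
import random


def add_white_noise(samples, snr_db=20.0, seed=None):
    """Return a copy of `samples` with Gaussian white noise added.

    `snr_db` is a hypothetical parameter: the abstract gives no noise level.
    """
    if not samples:
        return []
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(x * x for x in samples) / len(samples)
    # SNR(dB) = 10 * log10(P_signal / P_noise), solved for P_noise.
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, noise_std) for x in samples]


# Toy input: one second of a 440 Hz tone at 16 kHz, standing in for a speech clip.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]
noisy = add_white_noise(clean, snr_db=20.0, seed=0)
```

In practice each augmented clip would be passed through the same feature extraction as the original recordings before training; the specifics of that pipeline are not given in the abstract.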
Original language | English |
---|---|
Title of host publication | Proceedings of the 2024 9th International Conference on Machine Learning Technologies |
Pages | 95-100 |
Number of pages | 6 |
ISBN (Electronic) | 9798400716379 |
Publication status | Published (in print/issue) - 11 Sept 2024 |
Event | 2024 9th International Conference on Machine Learning Technologies - Oslo, Norway. Duration: 24 May 2024 → 26 May 2024. https://www.icmlt.org/ |
Publication series
Name | 2024 9th International Conference on Machine Learning Technologies (ICMLT) |
---|---|
Publisher | Association for Computing Machinery |
Conference
Conference | 2024 9th International Conference on Machine Learning Technologies |
---|---|
Abbreviated title | ICMLT 2024 |
Country/Territory | Norway |
City | Oslo |
Period | 24/05/24 → 26/05/24 |
Internet address | https://www.icmlt.org/ |
Bibliographical note
Publisher Copyright: © 2024 Owner/Author.
Keywords
- Speech corpus
- Human speech emotion recognition
- Convolutional neural network applications
- Long short-term memory neural networks