Abstract
The prevalence of multilabel aggressive text content on social media has a detrimental societal impact, prompting government agencies and tech corporations to take measures against its spread. Research to date has focused on high-resource languages such as English, leaving low-resource languages such as Bengali out of the spotlight. This work presents a transformer-based technique that classifies multilabel aggressive texts in Bengali by their targets, to aid research in this area. A dataset (EM-BAD) containing 13,728 texts is developed with five target classes for aggressive text classification: Religious Aggression (ReAG), Political Aggression (PoAG), Verbal Aggression (VeAG), Gender Aggression (GeAG), and Racial Aggression (RaAG). Experimental results demonstrate that Bangla-BERT with an adjusted pooling layer and fine-tuning outperforms all ML, DL, and transformer-based baselines and existing techniques. Bangla-BERT achieves the highest weighted F1-score, 0.89, on the multilabel aggressive text classification task.
| Original language | English |
|---|---|
| Title of host publication | 2023 26th International Conference on Computer and Information Technology (ICCIT) |
| Publisher | IEEE |
| Pages | 1-6 |
| Number of pages | 6 |
| ISBN (Electronic) | 979-8-3503-5901-5 |
| ISBN (Print) | 979-8-3503-5902-2 |
| DOIs | |
| Publication status | Published online - 27 Feb 2024 |
Publication series
| Name | 2023 26th International Conference on Computer and Information Technology (ICCIT) |
|---|---|
| Publisher | IEEE Control Society |
Bibliographical note
Publisher Copyright: © 2023 IEEE.
Keywords
- Natural language processing
- Aggressive text classification
- Deep learning
- Text processing
- Text corpora
Fingerprint
Dive into the research topics of 'Multilabel Aggressive Text Classification from Social Media using Transformer-based Approaches'. Together they form a unique fingerprint.