Abstract
With the large volume of text available online, it is becoming impractical to use supervised machine learning methods that require a sizeable training set of labelled data. In this paper we introduce a new sentiment-topic model called the hybrid sentiment-topic model (HST). The HST model is a completely unsupervised sentiment classification method that accounts for the topical context of words in documents when classifying sentiment. The only inputs the model needs are a list of positive seed words, a list of negative seed words, and the number of topics. The HST model differs from similar models in that it ensures each objective topic discovered has both a positive sentiment-topic and a negative sentiment-topic associated with it; other similar models do not guarantee symmetric sentiment-topics. The HST model performs three functions: first, it discovers objective topics in a corpus of text; second, it finds a positive and a negative sentiment-topic for each objective topic; and finally, it performs sentiment classification. The HST model is tested on a dataset of movie reviews and a dataset of social media posts. For each dataset, a variety of seed word lists and different numbers of topics are tested, and the HST model is then compared against similar sentiment-topic models. In all experiments conducted, the HST model outperformed similar sentiment-topic models in classification accuracy by a noticeable margin. Additionally, the HST model converged faster than similar models, and its accuracy was more stable during the generative process.
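To make the three inputs named in the abstract concrete, the following is a minimal sketch, not the HST model itself. It bundles a positive seed list, a negative seed list, and a topic count into a container, then applies a plain seed-word counting rule as a simple stand-in for sentiment classification. The names `HSTInputs` and `seed_word_label`, the example seed words, and the counting rule are all assumptions introduced for illustration; the abstract does not specify the model's interface or inference procedure, and the topic count is carried here only to show the shape of the inputs.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class HSTInputs:
    """Hypothetical container for the three inputs the abstract lists."""
    positive_seeds: frozenset
    negative_seeds: frozenset
    num_topics: int  # unused by the baseline below; HST would use it


def seed_word_label(text: str, inputs: HSTInputs) -> str:
    """Label text by counting seed-word hits.

    This is a naive lexicon baseline, NOT the HST inference procedure:
    it ignores topical context entirely.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    pos_hits = sum(tok in inputs.positive_seeds for tok in tokens)
    neg_hits = sum(tok in inputs.negative_seeds for tok in tokens)
    if pos_hits > neg_hits:
        return "positive"
    if neg_hits > pos_hits:
        return "negative"
    return "undecided"


# Illustrative seed lists (assumed, not taken from the paper).
inputs = HSTInputs(
    positive_seeds=frozenset({"good", "great", "excellent"}),
    negative_seeds=frozenset({"bad", "poor", "awful"}),
    num_topics=5,
)

for review in ["A great film with excellent acting", "Awful plot and poor pacing"]:
    print(review, "->", seed_word_label(review, inputs))
```

A counting rule like this treats every seed word the same way regardless of what the document is about, which is the limitation a sentiment-topic model is meant to address by conditioning sentiment on topical context.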
Original language | English |
---|---|
Title of host publication | Unknown Host Publication |
Publisher | IEEE |
Number of pages | 8 |
Publication status | Accepted/In press - 29 Aug 2017 |
Event | The IEEE International Conference on Tools with Artificial Intelligence (ICTAI) 2017 - Duration: 29 Aug 2017 → … |
Keywords
- Unsupervised machine learning
- Sentiment Classification
- Topic Model