Sentiment mining and topic modelling are areas of research that have received significant attention in recent years. Sentiment mining refers to classifying a document as either positive or negative. Topic modelling is a method for discovering the thematic structure of a document collection. This thesis investigates ways of combining sentiment mining and topic modelling so that topical context can be used to improve sentiment mining. Topic model ensembles are also investigated and shown to increase topic coherence. Firstly, the aggregated topic model is introduced. This is an ensemble-like method for topic models in which multiple models with different parameters are generated and the topics output by each model are then merged using similarity metrics. This has the advantage of allowing a mix of specific and general topics from different models without data or algorithm manipulation. Experimental results show that the aggregated topic model produces more coherent topics than latent Dirichlet allocation and non-negative matrix factorisation in multiple domains. Next, the concept of using topic models as feature learners is introduced. A document’s topic distribution is used as an additional feature in classification, alongside the tokens from the document. This has the advantage of bringing topical context into classification. Experimental results show that classifiers such as support vector machines (SVMs) and maximum entropy models are improved by the additional topic features, with small accuracy improvements observed on social media datasets. Finally, the hybrid sentiment-topic model and the non-parametric hybrid sentiment-topic model are introduced. These models are designed to combine sentiment classification and topic modelling, with a focus on sentiment classification. The advantage they provide over similar methods is that they learn the number of topics from the data and therefore require no user input or knowledge of the data being modelled. Experiments in varied domains show that these models outperform similar methods such as JST and ASUM.
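The merging step of the aggregated topic model described in the abstract can be illustrated with a minimal sketch. The abstract says only that topics from several models are merged using similarity metrics, so everything specific here is an assumption for illustration rather than the thesis's actual procedure: topics are represented as word-to-weight dictionaries, cosine similarity is the metric, merged topics are formed by averaging word weights, and the function names and the `threshold` value are hypothetical.

```python
import math


def cosine_similarity(topic_a, topic_b):
    """Cosine similarity between two topics given as {word: weight} dicts."""
    shared_words = set(topic_a) & set(topic_b)
    dot = sum(topic_a[w] * topic_b[w] for w in shared_words)
    norm_a = math.sqrt(sum(v * v for v in topic_a.values()))
    norm_b = math.sqrt(sum(v * v for v in topic_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def merge_topics(base_topics, other_models, threshold=0.7):
    """Fold topics from other models into each sufficiently similar base topic.

    Assumption: a topic joins a base topic when their cosine similarity is at
    least `threshold` (a hypothetical value), and the merged topic is the
    average of the word weights of all topics in the group.
    """
    aggregated = []
    for base in base_topics:
        members = [base]
        for model in other_models:
            for topic in model:
                if cosine_similarity(base, topic) >= threshold:
                    members.append(topic)
        vocabulary = set().union(*members)
        merged = {
            word: sum(t.get(word, 0.0) for t in members) / len(members)
            for word in vocabulary
        }
        aggregated.append(merged)
    return aggregated


if __name__ == "__main__":
    # Toy topics standing in for the output of two differently configured models.
    base = [{"goal": 0.6, "match": 0.4}, {"vote": 0.7, "party": 0.3}]
    others = [[{"goal": 0.5, "match": 0.3, "league": 0.2},
               {"film": 0.8, "actor": 0.2}]]
    for topic in merge_topics(base, others):
        print(sorted(topic.items(), key=lambda kv: -kv[1]))
```

In this toy run the first base topic absorbs the similar sports topic from the second model and gains the word "league", while the unrelated film topic is discarded and the second base topic is left unchanged. In practice the input topics would come from fitted models such as latent Dirichlet allocation or non-negative matrix factorisation rather than hand-written dictionaries.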
Sentiment mining from social media content using topic model ensembles
Blair, S. J. (Author). Nov 2021
Student thesis: Doctoral Thesis