Aggregated Topic Models for Increasing Social Media Topic Coherence

Stuart Blair, Y Bi, Maurice Mulvenna

Research output: Contribution to journal / Article / peer-review

68 Citations (Scopus)
243 Downloads (Pure)

Abstract

This research presents a novel method for constructing an aggregated topic model composed of topics with greater coherence than those of the individual models. When generating a topic model, a number of parameters have to be specified, and depending on the chosen parameters the resulting topics can be very general or very specific. In this study we investigate the process of aggregating multiple topic models generated using different parameters, with a focus on whether combining the general and specific topics can increase topic coherence. We employ cosine similarity and Jensen-Shannon divergence to compute the similarity among topics and combine them into an aggregated model when their similarity scores exceed a predefined threshold. The model is evaluated against standard topic models generated by latent Dirichlet allocation and Non-negative Matrix Factorisation. Specifically, we use topic coherence to compare the aggregated model against the individual models from which it is built and against models generated by Non-negative Matrix Factorisation. The results demonstrate that the aggregated model outperforms those topic models at a statistically significant level in terms of topic coherence over an external corpus. We also apply the aggregated topic model to social media data to validate the method in a realistic scenario and find that it again outperforms the individual topic models.
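The similarity-and-threshold step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how matched topics are combined, so averaging their word distributions is an assumption, as are the function names, the toy distributions, the threshold of 0.9, and the use of 1 minus the Jensen-Shannon divergence as a similarity score.

```python
import math

def cosine_similarity(p, q):
    """Cosine similarity between two topic-word vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = math.sqrt(sum(a * a for a in p))
    norm_q = math.sqrt(sum(b * b for b in q))
    return dot / (norm_p * norm_q)

def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so it lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def js_similarity(p, q):
    # Assumption: turn the divergence into a similarity as 1 - JSD.
    return 1.0 - js_divergence(p, q)

def aggregate(base_topics, other_topics, threshold, similarity):
    """Merge each base topic with every other topic whose similarity
    exceeds the threshold. Merging by averaging is an assumption."""
    aggregated = []
    for base in base_topics:
        group = [base] + [t for t in other_topics
                          if similarity(base, t) > threshold]
        aggregated.append([sum(col) / len(group) for col in zip(*group)])
    return aggregated

# Toy topic-word distributions over a four-word vocabulary (hypothetical).
base_topics = [[0.7, 0.2, 0.1, 0.0]]
other_topics = [[0.6, 0.3, 0.1, 0.0],   # close to the base topic
                [0.0, 0.1, 0.2, 0.7]]   # unrelated topic

merged_cos = aggregate(base_topics, other_topics, 0.9, cosine_similarity)
merged_js = aggregate(base_topics, other_topics, 0.9, js_similarity)
print(merged_cos)
print(merged_js)
```

In this toy case both measures match the base topic only with the first of the other topics, so the unrelated topic is left out of the merged distribution. In practice the threshold would need tuning per similarity measure, since cosine similarity and Jensen-Shannon divergence are on different scales.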
Original language: English
Pages (from-to): 138-156
Number of pages: 19
Journal: Applied Intelligence
Volume: 50
Issue number: 1
Early online date: 10 Jul 2019
DOIs
Publication status: Published (in print/issue) - 31 Jan 2020

Keywords

  • Data fusion
  • Ensemble methods
  • Social media
  • Topic coherence
  • Topic models
