Aggregated Topic Models for Increasing Social Media Topic Coherence

Stuart Blair, Y Bi, Maurice Mulvenna

Research output: Contribution to journal, Article, peer-review

56 Citations (Scopus)
180 Downloads (Pure)


This research presents a novel method for constructing an aggregated topic model composed of topics with greater coherence than those of the individual models. When generating a topic model, a number of parameters have to be specified, and the resulting topics can be very general or very specific depending on the chosen parameters. In this study we investigate the process of aggregating multiple topic models generated using different parameters, with a focus on whether combining the general and specific topics is able to increase topic coherence. We employ cosine similarity and Jensen-Shannon divergence to compute the similarity among topics and combine them into an aggregated model when their similarity scores exceed a predefined threshold. The model is evaluated against standard topic models generated by latent Dirichlet allocation and Non-negative Matrix Factorisation. Specifically, we use topic coherence to compare the aggregated model against the individual models from which it is built and against models generated by Non-negative Matrix Factorisation. The results demonstrate that the aggregated model outperforms those topic models at a statistically significant level in terms of topic coherence over an external corpus. We also apply the aggregated topic model to social media data to validate the method in a realistic scenario and find that it again outperforms individual topic models.
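The similarity-and-threshold step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the default threshold of 0.7, the choice to merge by averaging word distributions, and the conversion of Jensen-Shannon divergence to a similarity as 1 minus the divergence are all assumptions made here for illustration. Each topic is assumed to be a probability distribution over a shared vocabulary.

```python
import numpy as np

def cosine_similarity(p, q):
    """Cosine similarity between two topic-word vectors."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        mask = a > 0  # terms with a == 0 contribute nothing
        return float(np.sum(a[mask] * np.log2(a[mask] / b[mask])))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def aggregate_topics(base_topics, other_topics, threshold=0.7, measure="cosine"):
    """Merge each base topic with every other topic whose similarity exceeds
    the threshold. Merging by averaging is an assumption of this sketch."""
    aggregated = []
    for base in np.asarray(base_topics, dtype=float):
        similar = [base]
        for other in np.asarray(other_topics, dtype=float):
            if measure == "cosine":
                score = cosine_similarity(base, other)
            else:
                # Assumed conversion from a divergence to a similarity score.
                score = 1.0 - js_divergence(base, other)
            if score > threshold:
                similar.append(other)
        merged = np.mean(similar, axis=0)
        aggregated.append(merged / merged.sum())  # renormalise to a distribution
    return np.array(aggregated)
```

With `measure="cosine"` the score is higher for more similar topics, so it can be compared with the threshold directly. Jensen-Shannon divergence runs the other way (0 for identical distributions), which is why the sketch inverts it before the comparison.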
Original language: English
Pages (from-to): 138-156
Number of pages: 19
Journal: Applied Intelligence
Issue number: 1
Early online date: 10 Jul 2019
Publication status: Published (in print/issue) - 31 Jan 2020


  • Data fusion
  • Ensemble methods
  • Social media
  • Topic coherence
  • Topic models

