
Semantic enrichment of neural word embeddings: Leveraging taxonomic similarity for enhanced distributional semantics

Research output: Contribution to journal › Article › peer-review

Abstract

Data-driven neural word embeddings (NWEs), grounded in distributional semantics, capture a wide range of linguistic regularities, and they can be further enriched by incorporating structured knowledge resources. This work proposes a novel post-processing approach for injecting semantic relationships into the vector space of both static and contextualized NWEs. Current retrofitting (RF) solutions for word embeddings often oversimplify the integration of semantic knowledge and neglect the nuanced differences between relationships, which may result in suboptimal performance. Instead of applying multi-thresholding to distance boundaries in metric learning, we compute taxonomic similarity to dynamically adjust these boundaries during the semantic specialization of word embeddings. Benchmark evaluations on both static and contextualized word embeddings demonstrate that our dynamic-fitting (DF) approach produces state-of-the-art (SOTA) correlation results of 0.78 and 0.76 on SimLex-999 and SimVerb-3500, respectively, highlighting the effectiveness of incorporating multiple semantic relationships in refining vector semantics. Our approach also outperforms existing RF methods in both supervised and unsupervised semantic relationship recognition tasks. It achieves top accuracy scores for hypernymy detection on the BLESS, WBLESS, and BIBLESS datasets (0.97, 0.89, and 0.83, respectively) and an F1 score of over 0.60 on the four-way semantic relationship classification in the shared Subtask-2 of CogALex-V, surpassing all participant systems. In the analogy reasoning task of the Bigger Analogy Test Set, our approach outperforms existing RF methods at inferring relational similarity. These consistent improvements across various lexical semantics tasks suggest that our DF approach can effectively integrate distributional semantics with symbolic knowledge resources, thereby enhancing the representation capacity of word embeddings in downstream applications.
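The idea in the abstract of letting taxonomic similarity set the distance boundary, rather than choosing from a few fixed thresholds, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy taxonomy, the use of Wu-Palmer similarity as the taxonomic measure, the linear mapping from similarity to margin, and the hinge-style loss are all hypothetical stand-ins.

```python
import math

# Hypothetical toy taxonomy (child -> parent); not from the paper.
PARENT = {
    "animal": "entity", "vehicle": "entity",
    "dog": "animal", "cat": "animal", "car": "vehicle",
}

def path_to_root(node):
    """Return the list of nodes from `node` up to the taxonomy root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path

def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))

def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lcs) / (depth(a) + depth(b))."""
    ancestors_b = set(path_to_root(b))
    # First shared node on a's path to the root is the lowest common subsumer.
    lcs = next(n for n in path_to_root(a) if n in ancestors_b)
    return 2 * depth(lcs) / (depth(a) + depth(b))

def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return 1.0 - dot / (norm_u * norm_v)

def dynamic_margin(sim, max_margin=1.0):
    """Assumed mapping: higher taxonomic similarity -> tighter boundary."""
    return max_margin * (1.0 - sim)

def pair_loss(u, v, sim, max_margin=1.0):
    """Hinge-style penalty when a pair lies beyond its own boundary."""
    return max(0.0, cosine_distance(u, v) - dynamic_margin(sim, max_margin))

# Orthogonal toy vectors for "dog" and "cat": the pair is penalized because
# its distance exceeds the boundary implied by its taxonomic similarity.
loss = pair_loss([1.0, 0.0], [0.0, 1.0], wu_palmer("dog", "cat"))
```

In this sketch each word pair gets its own boundary, so closely related pairs (such as the toy "dog" and "cat") are pulled together more strongly than distantly related ones (such as "dog" and "car"). A multi-threshold scheme would instead assign one of a handful of fixed margins per relationship type.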
Original language: English
Pages (from-to): 1423-1449
Number of pages: 27
Journal: Natural Language Processing
Volume: 31
Issue number: 6
Early online date: 30 Jul 2025
Publication status: Published (in print/issue) - 30 Nov 2025

Bibliographical note

Publisher Copyright:
© The Author(s), 2025.

Keywords

  • neural word embeddings
  • distributional semantics
  • metric learning
  • semantic similarity
