Dual contextual module for neural machine translation

Isaac Ampomah, Sally I McClean, Glenn Hawe

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)
56 Downloads (Pure)

Abstract

Self-attention-based encoder-decoder frameworks have drawn increasing attention in recent years. The self-attention mechanism generates contextual representations by attending to all tokens in the sentence. Despite improvements in performance, recent research argues that the self-attention mechanism tends to concentrate more on the global context with less emphasis on the contextual information available within the local neighbourhood of tokens. This work presents the Dual Contextual (DC) module, an extension of the conventional self-attention unit, to effectively leverage both the local and global contextual information. The goal is to further improve the sentence representation ability of the encoder and decoder subnetworks, thus enhancing the overall performance of the translation model. Experimental results on WMT’14 English-German (En→De) and eight IWSLT translation tasks show that the DC module can further improve the translation performance of the Transformer model.
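The abstract describes the DC module only at a high level, so the following is a minimal illustrative sketch of the general idea (a global self-attention branch fused with a locally windowed branch), not the paper's actual architecture. The window size, the sigmoid gating scheme, and all class and function names are assumptions introduced here for illustration, using standard PyTorch components.

```python
# Illustrative sketch only: fuses global self-attention with a locally
# windowed attention branch. Window size, gating, and names are assumptions,
# not the DC module as published.
import torch
import torch.nn as nn


def local_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Boolean mask that blocks attention outside a +/- `window` neighbourhood."""
    idx = torch.arange(seq_len)
    dist = (idx.unsqueeze(0) - idx.unsqueeze(1)).abs()
    return dist > window  # True = position is masked out


class DualContextAttention(nn.Module):
    """Hypothetical layer: global branch (full self-attention) and local branch
    (windowed self-attention), combined with a learned sigmoid gate."""

    def __init__(self, d_model: int = 512, n_heads: int = 8, window: int = 3):
        super().__init__()
        self.global_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.local_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.window = window

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Global context: every token attends to all tokens in the sentence.
        g, _ = self.global_attn(x, x, x)
        # Local context: attention restricted to a small neighbourhood of tokens.
        mask = local_window_mask(x.size(1), self.window).to(x.device)
        l, _ = self.local_attn(x, x, x, attn_mask=mask)
        # Gated fusion of the two contextual representations.
        alpha = torch.sigmoid(self.gate(torch.cat([g, l], dim=-1)))
        return alpha * g + (1 - alpha) * l


if __name__ == "__main__":
    layer = DualContextAttention()
    out = layer(torch.randn(2, 10, 512))  # (batch, tokens, d_model)
    print(out.shape)                      # torch.Size([2, 10, 512])
```

In this sketch the gate lets each position weight local against global context adaptively; the published DC module may combine the two contexts differently.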
Original language: English
Journal: Machine Translation
Early online date: 12 Oct 2021
DOIs
Publication status: Published online - 12 Oct 2021

Bibliographical note

Publisher Copyright:
© 2021, The Author(s).

Keywords

  • Deep neural representation learning
  • Self-attention networks
  • Local contexts
  • Global contexts

