CIFFormer: A contextual information flow guided transformer for colorectal polyp segmentation

Cunlu Xu, Long Lin, Bin Wang, J. Liu

Research output: Contribution to journal › Article › peer-review

Abstract

Automatic segmentation of polyps in endoscopic images plays a critical role in the early diagnosis of colorectal cancer. In recent years, vision transformers, especially pyramid vision transformers, have made remarkable strides and become the dominant methods in polyp segmentation. However, because polyps closely resemble normal tissue in size, appearance, color, and other aspects, pyramid vision transformer methods still struggle to represent fine-grained details and to identify highly camouflaged polyps, both of which are pivotal for precise colorectal polyp segmentation. To address these challenges, we propose a novel Contextual Information Flow Guided Transformer (CIFFormer) for colorectal polyp segmentation, which reconstructs the architecture of a pyramid vision transformer through a contextual information flow design. The proposed method uses a pyramid-structured encoder to obtain multi-resolution feature maps. To effectively capture the target object's features at various levels of detail, from coarse-grained global information to fine-grained local information, we design a Global-Local Feature Synergy Fusion module (GLFS). GLFS adopts a progressive fusion strategy that first fuses features from adjacent hierarchical levels and then gradually fuses across levels, allowing the model to better exploit features at different semantic levels and avoiding the information loss caused by direct fusion. In addition, we introduce a Multi-Density Global-Local Residual Module (MDGL), whose multi-density residual units improve feature propagation and information flow. By employing both local and global residual learning, the model captures detailed information at both global and local scales. Experimental results demonstrate that CIFFormer surpasses 17 benchmark models and achieves state-of-the-art performance on five popular datasets, and it also performs strongly on two video datasets. The source code of this work is available at https://github.com/lonlin404/CIFFormer.
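To make the two modules described in the abstract concrete, the following is a minimal PyTorch sketch of the two ideas: a GLFS-style progressive fusion (adjacent pyramid levels first, then across all levels) and an MDGL-style block combining local and global residual learning. The module names come from the abstract, but the channel sizes, convolution choices, dilation rates, and fusion order here are illustrative assumptions, not the authors' released implementation (see the linked GitHub repository for the actual code).

```python
# Illustrative sketch only: GLFS/MDGL names come from the abstract; all
# architectural details below (channels, convs, dilations) are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class GLFS(nn.Module):
    """Progressive fusion of pyramid features: adjacent levels first, then across levels."""

    def __init__(self, channels=64):
        super().__init__()
        # One 3x3 conv per adjacent-level fusion step (assumed design).
        self.fuse_adjacent = nn.ModuleList(
            nn.Conv2d(2 * channels, channels, 3, padding=1) for _ in range(3)
        )
        self.fuse_all = nn.Conv2d(3 * channels, channels, 3, padding=1)

    def forward(self, feats):
        # feats: list of 4 maps [f1..f4], coarse (low-res) to fine (high-res),
        # all projected to the same channel count beforehand.
        fused = []
        for i, conv in enumerate(self.fuse_adjacent):
            coarse = F.interpolate(feats[i], size=feats[i + 1].shape[-2:],
                                   mode="bilinear", align_corners=False)
            fused.append(conv(torch.cat([coarse, feats[i + 1]], dim=1)))
        # Gradual cross-level fusion at the finest resolution.
        target = fused[-1].shape[-2:]
        fused = [F.interpolate(f, size=target, mode="bilinear", align_corners=False)
                 for f in fused]
        return self.fuse_all(torch.cat(fused, dim=1))


class MDGL(nn.Module):
    """Stacked residual units: a local skip per unit plus a global skip over the stack."""

    def __init__(self, channels=64):
        super().__init__()
        # Varying dilation rates stand in for "multi-density" units (assumed).
        self.units = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.BatchNorm2d(channels),
            )
            for d in (1, 2, 3)
        )

    def forward(self, x):
        identity = x                 # global residual connection
        for unit in self.units:
            x = F.relu(x + unit(x))  # local residual connection per unit
        return x + identity
```

Under these assumptions, the encoder's multi-resolution maps would be channel-aligned, fused by GLFS, and refined by MDGL before the segmentation head; the actual ordering and parameterisation in CIFFormer may differ.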
Original language: English
Article number: 130413
Pages (from-to): 1-12
Number of pages: 12
Journal: Neurocomputing
Volume: 644
Early online date: 15 May 2025
DOIs
Publication status: Published online - 15 May 2025

Bibliographical note

Publisher Copyright:
© 2025 Elsevier B.V.

Data Access Statement

I have shared the link to the data and code in the manuscript.

Keywords

  • Contextual information flow
  • Multi-density residual
  • Polyp segmentation
  • Pyramid vision transformer
