
Polyp-LVT: Polyp segmentation with lightweight vision transformers

  • Long Lin
  • Guangzu Lv
  • Bin Wang
  • Cunlu Xu
  • Jun Liu

Research output: Contribution to journal, Article, peer-reviewed


Abstract

Automatic segmentation of polyps in endoscopic images is crucial for early diagnosis and surgical planning of colorectal cancer. However, polyps closely resemble the surrounding mucosal tissue in texture, have indistinct borders, and vary in size, appearance, and location, all of which pose a great challenge to polyp segmentation. Although recent attempts to apply the Vision Transformer (ViT) to polyp segmentation have achieved promising performance, their application in clinical scenarios is still limited by high computational complexity, large model size, redundant dependencies, and significant training costs. To address these limitations, we propose a novel ViT-based approach named Polyp-LVT, which replaces the attention layer in the encoder with a global max pooling layer; this significantly reduces the model’s parameter count and computational cost without degrading performance. Furthermore, we introduce a network block, named the Inter-block Feature Fusion Module (IFFM), into the decoder to provide streamlined yet efficient feature extraction. We conduct extensive experiments on three public polyp image benchmarks to evaluate our method. The experimental results show that, compared with the baseline models, our Polyp-LVT network achieves a nearly 44% reduction in model parameters while delivering comparable segmentation performance.
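To make the parameter argument in the abstract concrete, the sketch below counts the learnable parameters of a generic transformer encoder block with and without self-attention. It is a minimal illustration, not the Polyp-LVT architecture: the block layout (token mixer, two layer norms, an MLP with expansion ratio 4), the width `d = 64`, and all function names are assumptions chosen for the example, and it does not attempt to reproduce the 44% figure, which refers to the whole model against its baselines. The `global_max_pool` helper shows one plausible reading of a parameter-free pooling mixer, a channel-wise maximum over all tokens.

```python
def attention_params(d: int) -> int:
    """Weights and biases of the Q, K, V and output projections (each d x d plus bias)."""
    return 4 * (d * d + d)


def mlp_params(d: int, ratio: int = 4) -> int:
    """Two linear layers with biases: d -> ratio*d -> d (assumed expansion ratio)."""
    hidden = ratio * d
    return (d * hidden + hidden) + (hidden * d + d)


def layernorm_params(d: int) -> int:
    """Scale and shift vectors of one layer norm."""
    return 2 * d


def block_params(d: int, mixer: str = "attention") -> int:
    """Parameters of one hypothetical encoder block; a pooling mixer adds none."""
    mixer_params = attention_params(d) if mixer == "attention" else 0
    return mixer_params + mlp_params(d) + 2 * layernorm_params(d)


def global_max_pool(tokens: list[list[float]]) -> list[float]:
    """Channel-wise maximum over all tokens; has no learnable parameters."""
    return [max(channel) for channel in zip(*tokens)]


if __name__ == "__main__":
    d = 64  # hypothetical embedding width
    with_attention = block_params(d, "attention")
    with_pooling = block_params(d, "pooling")
    saved = 1 - with_pooling / with_attention
    print(f"attention block: {with_attention} parameters")
    print(f"pooling block:   {with_pooling} parameters")
    print(f"per-block reduction: {saved:.1%}")
    print(global_max_pool([[0.1, 0.9], [0.4, 0.2], [0.3, 0.5]]))
```

Under these assumptions the saving comes entirely from dropping the four projection matrices of the attention layer; the MLP and normalization layers are unchanged.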
Original language: English
Article number: 112181
Journal: Knowledge-Based Systems
Volume: 300
Early online date: 27 Jun 2024
Publication status: Published (in print/issue) - 27 Sept 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier B.V.

Data Availability Statement

Data will be made available on request.

Funding

This work was supported by the Science and Technology Project of Gansu (22YF7GA003, 21YF5GA102, 21YF5GA006, 21ZD8RA008, 22ZD6GA029), the Fundamental Research Funds for the Central Universities, China (lzujbky-2022-ct06), the Science and Technology Project of the Gansu Provincial Department of Transportation (2021-22), and the Supercomputing Center of Lanzhou University.

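Funders and funder numbers
Lanzhou University
Science and Technology Project of Gansu: 22ZD6GA029, 21YF5GA102, 21YF5GA006, 22YF7GA003, 21ZD8RA008
Science and Technology Project of the Gansu Provincial Department of Transportation: 2021-22
Fundamental Research Funds for the Central Universities: lzujbky-2022-ct06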

    UN SDGs

    This output contributes to the following UN Sustainable Development Goals (SDGs)

    1. SDG 3 - Good Health and Well-being
    2. SDG 8 - Decent Work and Economic Growth

    Keywords

    • Polyp segmentation
    • Lightweight vision transformer
    • Pooling layer
    • Colorectal cancer
