Abstract
Most existing salient object detection models are based on fully convolutional networks (FCNs), which learn multi-scale and multi-level semantic information through convolutional layers to produce high-quality saliency maps. However, convolution interacts only locally, which makes long-range dependencies difficult to capture. In addition, FCN-based methods suffer from coarse object boundaries. To address these problems, we propose a novel transformer framework for salient object detection, named TF-SOD, which consists of the encoder part of an FCN, a fusion module (FM), a transformer module (TM), and a feature decoder module (FDM). Specifically, the FM is a bridge connecting the encoder and the TM, and provides some foresight for the non-local interaction of the TM. The FDM efficiently decodes the non-local features output by the TM and fuses them deeply with local features. This architecture closely integrates local and non-local interactions, so that the two kinds of information complement each other and the associations between features are thoroughly mined. Furthermore, we propose a novel edge reinforcement learning strategy, which, supported by this network architecture, effectively suppresses edge blurring from both local and global aspects. Extensive experiments on five datasets demonstrate that the proposed method outperforms 19 state-of-the-art methods.
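The contrast the abstract draws between local convolution and non-local transformer interaction can be illustrated with a toy sketch. This is not the TF-SOD implementation: the function names (`conv1d_same`, `self_attention`, `influenced_positions`), the 1D scalar-token setting, the averaging kernel, and the identity projections are all assumptions chosen for brevity. The sketch perturbs one input position and reports which output positions change.

```python
import math


def conv1d_same(x, kernel):
    """1D cross-correlation with zero padding; output length equals input length."""
    radius = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - radius
            if 0 <= j < len(x):  # positions outside the input count as zero
                acc += w * x[j]
        out.append(acc)
    return out


def self_attention(x):
    """Single-head dot-product self-attention over scalar tokens.

    Assumption: query, key, and value projections are the identity.
    """
    out = []
    for q in x:
        scores = [q * k for k in x]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w / total * v for w, v in zip(weights, x)))
    return out


def influenced_positions(fn, length, source):
    """Output indices that change when only input position `source` is perturbed."""
    base = [1.0] * length
    bumped = list(base)
    bumped[source] += 1.0
    before, after = fn(base), fn(bumped)
    return [i for i in range(length) if abs(after[i] - before[i]) > 1e-9]


# A width-3 kernel only propagates the change to immediate neighbours.
local = influenced_positions(lambda x: conv1d_same(x, [1 / 3, 1 / 3, 1 / 3]), 8, 0)
# Self-attention lets every position see the perturbed token in one layer.
non_local = influenced_positions(self_attention, 8, 0)

print("convolution:", local)
print("self-attention:", non_local)
```

In this toy setting a single convolutional layer spreads the perturbation only to adjacent positions, while a single attention layer spreads it to every position, which is the kind of complementarity the framework is described as combining.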
Original language | English |
---|---|
Journal | Neural Computing and Applications |
Publication status | Accepted/In press - 7 Feb 2022 |
Keywords
- Salient object detection
- Fusion module
- Transformer module
- Feature decoder
- Edge reinforcement learning strategy