Traditional visual place recognition (VPR) methods generally use frame-based cameras, which easily fail under rapid illumination changes or fast motion. To overcome this, we propose an end-to-end VPR network using event cameras, which achieves good recognition performance in challenging environments (e.g., large-scale driving scenes). The key idea of the proposed algorithm is to first represent the event streams as event spike tensor (EST) voxel grids, then extract features using a deep residual network, and finally aggregate the features using an improved VLAD network, realizing end-to-end VPR from event streams. To verify the effectiveness of the proposed algorithm, we analyze its performance on large-scale driving sequences from event-based driving datasets (MVSEC, DDD17, and Brisbane-Event-VPR) and synthetic event datasets (Oxford RobotCar and CARLA), including cross-weather, cross-season, and illumination-changing scenes. We then compare the proposed method with the state-of-the-art event-based VPR method (Ensemble-Event-VPR) to demonstrate its advantages. Experimental results show that the proposed method outperforms the event-based ensemble scheme in challenging scenarios. To the best of our knowledge, this is the first end-to-end weakly supervised deep network architecture for the VPR task that directly processes event stream data.
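The first stage of the pipeline, turning an event stream into a voxel grid, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `events_to_voxel_grid`, the polarity-separated `(2, num_bins, height, width)` layout, and the fixed triangular temporal kernel are assumptions made for illustration. An EST representation would typically learn the temporal kernel instead of fixing it.

```python
import numpy as np


def events_to_voxel_grid(x, y, t, p, height, width, num_bins):
    """Accumulate events into a (2, num_bins, height, width) voxel grid.

    Hypothetical helper: each event is spread over its two nearest
    temporal bins with a fixed triangular (linear interpolation) kernel.
    Polarity p is assumed to be encoded as 0 or 1.
    """
    x = np.asarray(x, dtype=np.int64)
    y = np.asarray(y, dtype=np.int64)
    t = np.asarray(t, dtype=np.float64)
    p = np.asarray(p, dtype=np.int64)

    grid = np.zeros((2, num_bins, height, width), dtype=np.float64)
    if t.size == 0:
        return grid

    # Map timestamps linearly onto the bin axis [0, num_bins - 1].
    span = t.max() - t.min()
    if span > 0:
        t_norm = (t - t.min()) / span * (num_bins - 1)
    else:
        t_norm = np.zeros_like(t)

    lower = np.floor(t_norm).astype(np.int64)
    upper = np.minimum(lower + 1, num_bins - 1)
    w_upper = t_norm - lower
    w_lower = 1.0 - w_upper

    # np.add.at accumulates correctly when several events hit one cell.
    np.add.at(grid, (p, lower, y, x), w_lower)
    np.add.at(grid, (p, upper, y, x), w_upper)
    return grid
```

Because the two interpolation weights of each event sum to one, the grid total equals the number of events. In a pipeline like the one described, such a tensor would be reshaped into image-like channels before being passed to the residual feature extractor.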
|Number of pages|18|
|Journal|IEEE Transactions on Instrumentation and Measurement|
|Early online date|20 Apr 2022|
|Publication status|Published - 16 May 2022|
Bibliographical note
Funding Information:
This work was supported in part by the National Natural Science Foundation of China under Grant 62073066 and Grant U20A20197, in part by the Science and Technology on Near-Surface Detection Laboratory under Grant 6142414200208, in part by the Fundamental Research Funds for the Central Universities under Grant N2226001, and in part by the Aeronautical Science Foundation of China under Grant 201941050001.
© 1963-2012 IEEE.
- Electrical and Electronic Engineering
- event camera
- visual place recognition (VPR)
- triplet ranking loss
- deep residual network
- event spike tensor (EST)