Abstract

Traditional visual SLAM assumes a static environment and therefore cannot handle moving objects. Dynamic feature points reduce the localization accuracy and robustness of SLAM. To address this, we propose YER-SLAM, a dynamic SLAM system that combines object detection with a region-growing algorithm and uses both the semantic and geometric information in images to identify and eliminate dynamic features. First, we use YOLOv5 object detection to classify the image into static, dynamic, and potentially dynamic regions, generating a semantic mask that incorporates a priori knowledge. Second, we apply a dynamic object detection algorithm that tightly couples object detection with epipolar constraints to make an initial identification of dynamic features in the scene. Third, we propose a novel strategy for eliminating dynamic feature points: the detected dynamic points are combined with a region-growing algorithm to generate a mask for the dynamic region, so that feature points in that region are excluded from tracking. Finally, experiments on the TUM dataset show that, compared with ORB-SLAM2, our algorithm reduces the average absolute trajectory error by 94.71% in highly dynamic environments. YER-SLAM thus effectively improves localization accuracy in dynamic environments.
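The two geometric steps named in the abstract, the epipolar-constraint check and the region growing from detected dynamic points, can be illustrated with a minimal sketch. This is not the YER-SLAM implementation: the function names, the pixel threshold, the use of a depth image as the growing criterion, and the depth tolerance are all assumptions made for illustration, and the fundamental matrix `F` is taken as given.

```python
from collections import deque
import math


def epipolar_distance(F, p1, p2):
    """Pixel distance from p2 (current frame) to the epipolar line of p1.

    F is a 3x3 fundamental matrix given as nested lists; p1 and p2 are
    (x, y) pixel coordinates of a matched feature in two frames.
    """
    x1, y1 = p1
    x2, y2 = p2
    # Epipolar line l = F * [x1, y1, 1]^T, written as a*x + b*y + c = 0.
    a = F[0][0] * x1 + F[0][1] * y1 + F[0][2]
    b = F[1][0] * x1 + F[1][1] * y1 + F[1][2]
    c = F[2][0] * x1 + F[2][1] * y1 + F[2][2]
    norm = math.hypot(a, b)
    if norm == 0.0:
        return float("inf")  # degenerate line; treat as a violation
    return abs(a * x2 + b * y2 + c) / norm


def is_dynamic(F, p1, p2, threshold=1.0):
    """Flag a match as dynamic when it violates the epipolar constraint.

    The 1-pixel threshold is an assumed value, not one from the paper.
    """
    return epipolar_distance(F, p1, p2) > threshold


def grow_region(depth, seed, tol=0.05):
    """4-connected region growing from a seed (row, col) on a depth grid.

    A neighbour joins the region when its depth differs from the pixel
    it is reached from by at most tol (assumed criterion and tolerance).
    Returns a boolean mask of the same shape as depth.
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[False] * cols for _ in range(rows)]
    r0, c0 = seed
    mask[r0][c0] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and not mask[nr][nc]:
                if abs(depth[nr][nc] - depth[r][c]) <= tol:
                    mask[nr][nc] = True
                    queue.append((nr, nc))
    return mask
```

In a pipeline of this kind, each feature match that `is_dynamic` flags would serve as a seed for `grow_region`, and features falling inside the resulting mask would be dropped before pose estimation. How the semantic mask from the object detector gates or weights this check is specific to the paper and is not modeled here.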

Keywords

Visual SLAM, YOLOv5, region growing
