High-Precision Feature Pair Selection Techniques in Monocular Visual Odometry Using Transformer Models
- Author(s)
- 신건우
- Type
- Thesis
- Degree
- Master
- Department
- AI Graduate School
- Advisor
- Lee, Yong-Gu
- Abstract
- Monocular Visual Odometry (VO) estimates the position and attitude changes of a moving platform from a single camera. Its accuracy, however, is limited by the inherent depth ambiguity of monocular images and by the difficulty of extracting reliable feature points. This study proposes a method that improves monocular VO performance by combining the classical SIFT (Scale-Invariant Feature Transform) feature extractor with the self-attention mechanism of a Transformer. Robust feature points are extracted with SIFT, and points in consecutive image pairs are matched with the FLANN algorithm. The matched pairs are fed to a Transformer model, whose self-attention mechanism selects the matches that are effective for pose estimation. This reduces the error introduced by incorrect matches and improves the accuracy of position estimation. Experiments used the KITTI Odometry dataset; straight-motion segments were isolated to assess the feasibility of 6-DOF pose estimation, and forward-distance estimation was then performed. In preprocessing, 10-image sequences were constructed and feature coordinates were normalized for Transformer training. This research presents a novel approach to efficient matching-point selection using the attention mechanism, demonstrating the potential of Transformers in autonomous driving and robot vision systems.
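- The pipeline the abstract describes can be sketched in two stages. First, the SIFT extraction and FLANN matching steps, shown here with OpenCV; the file paths and the Lowe's-ratio threshold of 0.7 are illustrative assumptions, not values stated in the thesis:

```python
import cv2

# Load two consecutive grayscale frames (placeholder paths).
img1 = cv2.imread("frame_000.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("frame_001.png", cv2.IMREAD_GRAYSCALE)

# Extract SIFT keypoints and descriptors.
sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# FLANN matching with a KD-tree index, the usual choice for SIFT's
# floating-point descriptors.
flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5),  # FLANN_INDEX_KDTREE
                              dict(checks=50))
matches = flann.knnMatch(des1, des2, k=2)

# Lowe's ratio test to discard ambiguous matches (0.7 is a common
# default; the thesis does not state its threshold).
good = [p[0] for p in matches
        if len(p) == 2 and p[0].distance < 0.7 * p[1].distance]

# Each surviving match becomes one (x1, y1, x2, y2) coordinate pair
# for the downstream Transformer.
pairs = [(*kp1[m.queryIdx].pt, *kp2[m.trainIdx].pt) for m in good]
```

  Second, a minimal sketch of the attention-based selection stage in PyTorch, assuming each matched pair is one input token of normalized coordinates and that the model emits a per-match reliability weight. The `MatchSelector` class and its layer sizes are hypothetical, since the abstract does not specify the architecture:

```python
import torch
import torch.nn as nn

class MatchSelector(nn.Module):
    """Hypothetical scorer: self-attention across all matches of a
    frame pair, producing one reliability weight per match."""

    def __init__(self, d_model: int = 64, nhead: int = 4, num_layers: int = 2):
        super().__init__()
        # One token per match: normalized (x1, y1, x2, y2).
        self.embed = nn.Linear(4, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.score = nn.Linear(d_model, 1)

    def forward(self, pairs: torch.Tensor) -> torch.Tensor:
        # pairs: (batch, num_matches, 4), coordinates normalized to
        # [0, 1] by image width/height, mirroring the preprocessing
        # step described in the abstract.
        h = self.encoder(self.embed(pairs))               # attend across matches
        return torch.sigmoid(self.score(h)).squeeze(-1)   # (batch, num_matches)

# Usage with placeholder data standing in for real SIFT/FLANN pairs;
# matches with high weights would be kept for pose estimation.
weights = MatchSelector()(torch.rand(1, 100, 4))
```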
- URI
- https://scholar.gist.ac.kr/handle/local/19341
- Fulltext
- http://gist.dcollection.net/common/orgView/200000864006
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.