High-Precision Feature Pair Selection Techniques in Monocular Visual Odometry Using Transformer Models
- Abstract
- Monocular Visual Odometry (VO) estimates the position and attitude changes of a moving platform using a single camera. Its accuracy is limited, however, by the inherent depth ambiguity of monocular images and by the difficulty of extracting reliable feature points. This study proposes a method that improves monocular VO by combining the classical SIFT (Scale-Invariant Feature Transform) feature extractor with the self-attention mechanism of the Transformer. Robust feature points are extracted with SIFT, and correspondences between consecutive image pairs are established with the FLANN matcher. The matched feature-point pairs are fed to a Transformer model, whose self-attention mechanism selects the matches that are effective for pose estimation. This reduces the error introduced by incorrect matches and improves the accuracy of position estimation. Experiments used the KITTI Odometry dataset; straight-motion segments were isolated to evaluate the potential for 6-DOF pose estimation, and forward-distance estimation was then performed. In the preprocessing phase, sequences of 10 images were constructed and feature coordinates were normalized for Transformer training. This research presents a novel approach to selecting effective matching points with the attention mechanism and demonstrates the potential of Transformer-based methods for autonomous driving and robot vision systems.
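- The front end described in the abstract (SIFT extraction, FLANN matching between consecutive frames, coordinate normalization) can be illustrated in code. The following is a minimal sketch assuming OpenCV's standard SIFT and FLANN APIs; the function name `match_frame_pair` and its parameters are illustrative assumptions, not the thesis implementation:

```python
# Minimal sketch of the pipeline's front end, as summarized in the abstract.
# Assumes OpenCV >= 4.4 (cv2.SIFT_create); names here are illustrative only.
import cv2
import numpy as np

def match_frame_pair(img1, img2, ratio=0.75):
    """Extract SIFT features from two consecutive grayscale frames and
    return normalized coordinates of FLANN-matched keypoint pairs."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1, None)
    kp2, des2 = sift.detectAndCompute(img2, None)

    # FLANN with a KD-tree index (algorithm=1), the usual choice for
    # SIFT's floating-point descriptors.
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    knn = flann.knnMatch(des1, des2, k=2)

    # Lowe's ratio test to discard ambiguous matches.
    good = []
    for pair in knn:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])

    h, w = img1.shape[:2]
    pairs = np.array(
        [[*kp1[m.queryIdx].pt, *kp2[m.trainIdx].pt] for m in good],
        dtype=np.float32,
    )
    # Normalize pixel coordinates to [0, 1] so the Transformer receives
    # scale-independent inputs (the abstract mentions normalization).
    pairs /= np.array([w, h, w, h], dtype=np.float32)
    return pairs  # shape: (num_matches, 4) -> (x1, y1, x2, y2)

# Example usage on two consecutive KITTI frames (paths are placeholders):
# f1 = cv2.imread("000000.png", cv2.IMREAD_GRAYSCALE)
# f2 = cv2.imread("000001.png", cv2.IMREAD_GRAYSCALE)
# matched_pairs = match_frame_pair(f1, f2)
```

- Each row of the returned array is one matched pair; a set of such rows from a sequence forms the token set over which the Transformer's self-attention can weigh reliable against unreliable matches, per the approach described above.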
- Author(s)
- 신건우
- Issued Date
- 2025
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19341
- Alternative Author(s)
- Gunwoo Shin
- Department
- Graduate School, AI Graduate School
- Advisor
- Lee, Yong-Gu
- Table Of Contents
- Abstract
List of Contents
List of Tables
List of Figures
I. Introduction
1.1 Background
1.2 Understanding of the Monocular Camera
1.2.1 Properties of the Monocular Camera Environment
1.2.2 Difficulties of Monocular Visual Odometry
1.3 Visual Odometry and Autonomous Driving
1.3.1 Visual Odometry as a Localization Technology
1.3.2 Development into SLAM
1.4 The Need to Improve Monocular Visual Odometry
1.4.1 Limitations of MLPs
1.4.2 Importance of Sequential Data
1.5 Emergence and Applications of the Transformer
1.5.1 Overview of the Transformer
1.5.2 Transformers in the Field of Computer Vision
1.6 Research Purpose and Contributions
1.7 Structure of the Paper
II. Related Works
2.1 Existing Methods for Monocular Visual Odometry
2.1.1 Feature-Based Visual Odometry
2.1.2 Direct Methods
2.1.3 Deep Learning-Based Visual Odometry
2.2 Role and Applications of the Attention Mechanism
2.2.1 Overview of the Attention Mechanism
2.2.2 Applications of Attention Mechanisms in Computer Vision
2.2.3 Attention Mechanisms in Visual Odometry
2.3 Applications of Transformers in Computer Vision
2.3.1 Vision Transformer (ViT)
2.3.2 Transformer-Based Object Detection and Segmentation
2.3.3 Transformer Applications to Visual Sequence Data
2.4 Research Trends in Transformer-Based Visual Odometry
2.5 Limitations of Related Work and Contributions of This Study
III. Data Preparation
3.1 KITTI Odometry Dataset
3.2 Quaternion Transformation of Pose Information and 6-DOF Odometry
3.2.1 Representation of Pose Information
3.2.2 Transformation to Quaternions
3.2.3 Objective of the Model
3.3 Preprocessing of Data Sequences
3.3.1 Composition of Image Sequences
3.3.2 Feature Matching and Storage
3.3.3 Data Splitting
3.4 Summary
IV. Method
4.1 Model Architecture
4.1.1 Matching Embedding
4.1.2 Positional Encoding
4.1.3 Transformer Layer
4.2 Loss Function
V. Experiments and Results
5.1 Experimental Setup
5.2 Training Loss and Results
5.3 Experimental Conclusions
VI. Conclusion and Future Work
References
Appendix
- Degree
- Master
Appears in Collections:
- Department of AI Convergence > 3. Theses(Master)