
High-Precision Feature Pair Selection Techniques in Monocular Visual Odometry Using Transformer Models

Author(s)
신건우
Type
Thesis
Degree
Master
Department
AI Graduate School (Graduate School)
Advisor
Lee, Yong-Gu
Abstract
Monocular Visual Odometry (VO) is a technique that estimates the position and attitude changes of a moving object using a single camera. However, its position estimates are limited in accuracy by the inherent ambiguity of depth information and the difficulty of reliable feature point extraction. This study proposes a method to improve the performance of monocular VO by combining the classical feature extraction algorithm SIFT (Scale-Invariant Feature Transform) with the Self-Attention mechanism of Transformers. Robust feature points are extracted using the SIFT algorithm, and feature points between consecutive image pairs are matched using the FLANN algorithm. The matched feature point pairs are then used as input to the Transformer model, which selects the matches that are effective for pose estimation through its Self-Attention mechanism. This approach reduces errors caused by incorrect matches and improves the accuracy of position estimation. The experiments employed the KITTI Odometry dataset; straight-motion segments were isolated to evaluate the feasibility of 6-DOF pose estimation, and forward travel distance was then estimated. In the preprocessing phase, 10-image sequences were created, and feature coordinates were normalized for Transformer model training. This research presents a novel approach to efficient matching point selection using the Attention mechanism, demonstrating the potential of Transformer-based methods for autonomous driving and robot vision systems.
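The sketch below illustrates, in general terms, the kind of pipeline the abstract describes: SIFT keypoint extraction, FLANN matching with a ratio test, normalization of match coordinates, and a small Transformer encoder that scores each matched pair for selection. It is not the thesis code; the model dimensions, the scoring head, the ratio-test threshold, and the function names (extract_and_match, normalize_coords, MatchSelector) are illustrative assumptions only.

```python
# Minimal illustrative sketch (assumptions noted above), using OpenCV and PyTorch.
import cv2
import numpy as np
import torch
import torch.nn as nn

def extract_and_match(img1_gray, img2_gray, ratio=0.75):
    """Detect SIFT keypoints in two grayscale images and match them with FLANN."""
    sift = cv2.SIFT_create()
    kp1, des1 = sift.detectAndCompute(img1_gray, None)
    kp2, des2 = sift.detectAndCompute(img2_gray, None)
    flann = cv2.FlannBasedMatcher(dict(algorithm=1, trees=5), dict(checks=50))
    knn = flann.knnMatch(des1, des2, k=2)
    # Lowe's ratio test keeps only tentatively reliable matches.
    good = [m for m, n in knn if m.distance < ratio * n.distance]
    pts1 = np.float32([kp1[m.queryIdx].pt for m in good])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in good])
    return pts1, pts2

def normalize_coords(pts, width, height):
    """Scale pixel coordinates into [0, 1] before feeding them to the model."""
    return pts / np.array([width, height], dtype=np.float32)

class MatchSelector(nn.Module):
    """Self-attention over matched pairs; outputs a keep-probability per pair."""
    def __init__(self, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Linear(4, d_model)  # one token per (x1, y1, x2, y2) pair
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.head = nn.Linear(d_model, 1)   # per-pair selection score

    def forward(self, pairs):               # pairs: (batch, num_matches, 4)
        tokens = self.encoder(self.embed(pairs))
        return torch.sigmoid(self.head(tokens)).squeeze(-1)  # (batch, num_matches)
```

In such a setup, pairs scoring above a chosen threshold would be retained for the subsequent pose (or forward-distance) estimation step, which is the error-reduction role the abstract attributes to the Self-Attention mechanism.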
URI
https://scholar.gist.ac.kr/handle/local/19341
Fulltext
http://gist.dcollection.net/common/orgView/200000864006
Alternative Author(s)
Gunwoo Shin
Appears in Collections:
Department of AI Convergence > 3. Theses(Master)
Access and License
  • Access status: Open
File List
  • No associated files are available.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.