Fusing RGB and depth with Self-attention for Unseen Object Segmentation

Author(s)
Lee, Joosoon; Back, Seunghyeok; Kim, Taewon; Shin, Sungho; Noh, Sangjun; Kang, Raeyoung; Kim, Jongwon; Lee, Kyoobin
Type
Conference Paper
Citation
2021 21st International Conference on Control, Automation and Systems (ICCAS), pp. 1599-1605
Issued Date
2021-10-12
Abstract
We present a Synthetic RGB-D Fusion Mask R-CNN (SF Mask R-CNN) for unseen object instance segmentation. Our key idea is to fuse RGB and depth with a learnable spatial attention estimator, named Self-Attention-based Confidence map Estimator (SACE), at four scales on top of a category-agnostic instance segmentation model. We pre-trained SF Mask R-CNN on a large synthetic dataset and evaluated it on a public dataset, WISDOM, after fine-tuning on only a small amount of real-world data. Our experiments showed that SACE achieves state-of-the-art performance in unseen object segmentation. We also compared feature maps across input modalities and fusion methods, showing that SACE helps the model learn distinctive object-related features. The code, dataset, and models are available at https://github.com/gist-ailab/SF-Mask-RCNN
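To make the fusion idea in the abstract concrete, below is a minimal PyTorch sketch of confidence-weighted RGB-D feature fusion. It is an illustration only, not the authors' implementation: the module name, the simple convolutional confidence estimator, and the layer sizes are assumptions, whereas the actual SACE uses self-attention and is applied at four feature scales inside a category-agnostic Mask R-CNN (see the linked repository for the real code).

```python
import torch
import torch.nn as nn


class ConfidenceMapFusion(nn.Module):
    """Illustrative RGB-D fusion with a learned per-pixel confidence map.

    A small conv head (a stand-in for SACE, which actually uses
    self-attention) predicts a spatial confidence map from the
    concatenated RGB and depth features; the map then weights the two
    modalities before fusion.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Hypothetical confidence estimator over the concatenated features.
        self.confidence = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),  # per-pixel weight in [0, 1]
        )

    def forward(self, rgb_feat: torch.Tensor, depth_feat: torch.Tensor) -> torch.Tensor:
        # Estimate a spatial confidence map from both modalities: (N, 1, H, W).
        c = self.confidence(torch.cat([rgb_feat, depth_feat], dim=1))
        # Convex combination: trust RGB where c is high, depth where it is low.
        return c * rgb_feat + (1.0 - c) * depth_feat


if __name__ == "__main__":
    fuse = ConfidenceMapFusion(channels=256)
    rgb = torch.randn(1, 256, 64, 64)
    depth = torch.randn(1, 256, 64, 64)
    print(fuse(rgb, depth).shape)  # torch.Size([1, 256, 64, 64])
```

In the paper's setting, a module of this kind would sit at each of the four FPN-style feature scales, so the network can weight RGB against depth differently per location and per resolution.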
Publisher
IEEE
Conference Place
Jeju, Republic of Korea
URI
https://scholar.gist.ac.kr/handle/local/22028
Access and License
  • Access: Open
File List
  • No related files are available.
