
Fusing RGB and depth with Self-attention for Unseen Object Segmentation

Abstract
We present a Synthetic RGB-D Fusion Mask R-CNN (SF Mask R-CNN) for unseen object instance segmentation. Our key idea is to fuse RGB and depth with a learnable spatial attention estimator, named the Self-Attention-based Confidence map Estimator (SACE), at four scales on top of a category-agnostic instance segmentation model. We pre-trained SF Mask R-CNN on a large synthetic dataset and evaluated it on a public dataset, WISDOM, after fine-tuning on only a small amount of real-world data. Our experiments showed that SACE achieves state-of-the-art performance in unseen object segmentation. We also compared feature maps across input modalities and fusion methods, showing that SACE helps the model learn distinctive object-related features. The code, dataset, and models are available at https://github.com/gist-ailab/SF-Mask-RCNN.
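The core idea — blending RGB and depth features with a learned per-pixel confidence map — can be illustrated with a minimal sketch. Note this is only an illustrative stand-in: SACE estimates the confidence map with self-attention at four feature scales, whereas the toy weights `w_r`, `w_d`, and `b` below are hypothetical placeholders for a learned scoring function.

```python
import math


def fuse_rgbd(rgb_feat, depth_feat, w_r=1.0, w_d=-1.0, b=0.0):
    """Confidence-weighted fusion of RGB and depth feature maps.

    rgb_feat, depth_feat: 2D lists of per-pixel feature values.
    w_r, w_d, b: placeholder parameters of a toy linear scorer
    (in SF Mask R-CNN the confidence map is produced by a
    self-attention module, not by this linear score).
    """
    fused, conf = [], []
    for rgb_row, depth_row in zip(rgb_feat, depth_feat):
        # Per-pixel confidence in (0, 1) via a sigmoid over the score.
        c_row = [1.0 / (1.0 + math.exp(-(w_r * r + w_d * d + b)))
                 for r, d in zip(rgb_row, depth_row)]
        conf.append(c_row)
        # Convex blend: high confidence favors the RGB feature,
        # low confidence favors the depth feature.
        fused.append([c * r + (1.0 - c) * d
                      for c, r, d in zip(c_row, rgb_row, depth_row)])
    return fused, conf


# Example: one row of two pixels.
fused, conf = fuse_rgbd([[0.9, 0.2]], [[0.1, 0.8]])
```

Because the blend is convex, each fused value always lies between the corresponding RGB and depth feature values, regardless of the scorer's parameters.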
Author(s)
Lee, Joosoon; Back, Seunghyeok; Kim, Taewon; Shin, Sungho; Noh, Sangjun; Kang, Raeyoung; Kim, Jongwon; Lee, Kyoobin
Issued Date
2021-10-12
Type
Conference Paper
DOI
10.23919/iccas52745.2021.9649991
URI
https://scholar.gist.ac.kr/handle/local/22028
Publisher
IEEE
Citation
2021 21st International Conference on Control, Automation and Systems (ICCAS), pp.1599 - 1605
ISSN
1598-7833
Conference Place
Jeju, Korea, Republic of
Appears in Collections:
Department of AI Convergence > 2. Conference Papers
Access & License
  • Access status: Open
File List
  • No associated files.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.