Perception and Grasping of Unseen Objects in Cluttered Robotic Environments
- Abstract
- Intelligent robotic systems are expected to perceive and interact with objects in unstructured environments, even when those objects are novel and appear in cluttered, occluded scenes. However, existing methods are typically limited to perceiving and manipulating a specific set of known objects in structured environments, which makes it difficult to extend robotic manipulation to new domains. This thesis focuses on developing robotic perception systems and datasets that accurately detect and grasp unseen objects in cluttered scenes by exploiting amodal perception, error-informed refinement, and large-scale real-world data.
To address the challenges of unseen object perception in cluttered environments, we propose UOAIS-Net, a novel neural network that simultaneously detects visible masks, amodal masks, and occlusions for unseen objects. By reasoning about occlusion through hierarchical occlusion modeling, UOAIS-Net achieves state-of-the-art performance on three real-world cluttered benchmarks. Moreover, we present INSTA-BEER, a fast and accurate model-agnostic refinement method for unseen object instance segmentation that predicts pixel-wise errors in an initial segmentation and refines it guided by these error estimates. The proposed quad-metric boundary error and Error Guidance Fusion module significantly improve segmentation quality and grasping performance on three challenging datasets and in real-world robot experiments.
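To make the hierarchical occlusion reasoning concrete, below is a minimal PyTorch sketch of a three-headed prediction module in the spirit of UOAIS-Net: a visible-mask head, an amodal-mask head conditioned on the visible prediction, and an occlusion head conditioned on both. The layer sizes, fusion by channel concatenation, and module names are illustrative assumptions, not the thesis's exact architecture.

```python
# Minimal sketch of hierarchical occlusion modeling: predict the visible
# mask first, then the amodal mask conditioned on it, then occlusion from
# both. All shapes and the concatenation-based fusion are assumptions.
import torch
import torch.nn as nn

class HierarchicalOcclusionHead(nn.Module):
    def __init__(self, in_ch: int = 256):
        super().__init__()
        # Head 1: visible (modal) mask logits from RoI features.
        self.visible_head = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(in_ch, 1, 1),
        )
        # Head 2: amodal mask logits, conditioned on features + visible logits.
        self.amodal_head = nn.Sequential(
            nn.Conv2d(in_ch + 1, in_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(in_ch, 1, 1),
        )
        # Head 3: per-instance occlusion logit from features + both masks.
        self.occlusion_head = nn.Sequential(
            nn.Conv2d(in_ch + 2, in_ch, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_ch, 1),
        )

    def forward(self, feats: torch.Tensor):
        visible = self.visible_head(feats)
        amodal = self.amodal_head(torch.cat([feats, visible], dim=1))
        occ = self.occlusion_head(torch.cat([feats, visible, amodal], dim=1))
        return visible, amodal, occ

# Example: a batch of 4 RoI feature maps (256 channels, 28x28).
feats = torch.randn(4, 256, 28, 28)
visible, amodal, occ = HierarchicalOcclusionHead()(feats)
print(visible.shape, amodal.shape, occ.shape)  # (4,1,28,28) (4,1,28,28) (4,1)
```

Conditioning each later head on the earlier predictions is what makes the modeling hierarchical: the amodal head can extend the visible region behind occluders, and the occlusion head can compare the two masks.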
To further enhance instance segmentation, we propose SEED, a lightweight plugin module that introduces a self-error estimation head for high-quality instance segmentation. SEED predicts pixel-wise errors in initial mask predictions, enabling error-targeted dual refinement of both bounding boxes and masks; it consistently outperforms state-of-the-art methods on three diverse datasets while remaining computationally efficient. Additionally, we introduce GraspClutter6D, a large-scale dataset for training and evaluating robotic perception and grasping in diverse, cluttered, and occluded environments. GraspClutter6D captures challenging scenarios that closely resemble practical real-world settings, and training on it significantly improves grasping performance over models trained on existing datasets. Together, the proposed methods and datasets advance unseen object perception and grasping in cluttered robotic environments, enabling robots to accurately perceive and interact with novel objects in complex, real-world scenarios.
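The error-informed refinement idea shared by INSTA-BEER and SEED can be sketched in the same way: an error head estimates a pixel-wise error map for an initial mask, and a refinement head re-predicts the mask with that error map as guidance. The concatenation-based fusion and the blending rule below are assumptions for illustration; the thesis's quad-metric boundary error and Error Guidance Fusion module are more elaborate.

```python
# Minimal sketch of error-informed mask refinement: estimate where an
# initial mask is wrong, then refine it guided by that estimate. Module
# names and the blending rule are illustrative assumptions.
import torch
import torch.nn as nn

class ErrorGuidedRefiner(nn.Module):
    def __init__(self, feat_ch: int = 64):
        super().__init__()
        # Error head: image features + initial mask -> per-pixel probability
        # that the initial prediction is wrong.
        self.error_head = nn.Sequential(
            nn.Conv2d(feat_ch + 1, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, 1, 1), nn.Sigmoid(),
        )
        # Refinement head: re-predict the mask, guided by the error map.
        self.refine_head = nn.Sequential(
            nn.Conv2d(feat_ch + 2, feat_ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat_ch, 1, 1),
        )

    def forward(self, feats: torch.Tensor, init_mask: torch.Tensor):
        err = self.error_head(torch.cat([feats, init_mask], dim=1))
        refined = self.refine_head(torch.cat([feats, init_mask, err], dim=1))
        # Blend: keep the initial mask where estimated error is low.
        return err, (1 - err) * init_mask + err * torch.sigmoid(refined)

feats = torch.randn(2, 64, 64, 64)     # backbone features
init_mask = torch.rand(2, 1, 64, 64)   # initial mask from any segmenter
err_map, refined_mask = ErrorGuidedRefiner()(feats, init_mask)
```

Because the refiner consumes only features and an initial mask, it is model-agnostic in the same sense as the thesis's methods: any off-the-shelf segmenter can produce the input mask.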
- Author(s)
- Seunghyeok Back
- Issued Date
- 2024
- Type
- Thesis
- URI
- https://scholar.gist.ac.kr/handle/local/19569