Solution Augmentation for ARC Problems Using GFlowNet: A Probabilistic Exploration Approach

Author(s): 황산하

Type: Thesis

Degree: Master

Department: Graduate School, AI Graduate School

Advisor: Kim, Sundong

Abstract: This study presents an algorithm that uses GFlowNet to enable human-like reasoning and applies it to the Abstraction and Reasoning Corpus (ARC). ARC is a challenging task that requires inferring diverse rules from very few examples (2 to 5 pairs), which existing deep learning and reinforcement learning models struggle to solve. To address this, we use GFlowNet's probabilistic path exploration, based on the geometric distribution, to explore the ARC solution space efficiently and augment the set of candidate solutions. Specifically, the geometric distribution encourages goal-oriented exploration, yielding shorter and more efficient solutions. We also introduce Goal-conditioned Reward Modeling, designed from an analysis of human solutions, which provides a more diverse reward distribution than traditional sparse rewards. In addition, an off-policy learning approach mitigates initial data sampling bias and improves learning stability. Experimental results show that the proposed GFlowNet-based method generates a wider variety of solutions for ARC problems and converges to correct answers more rapidly than existing approaches, indicating that GFlowNet can be applied to augment solutions in complex reasoning tasks.
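
Illustrative note: the abstract does not say how the geometric distribution enters the model, so the following is only a minimal sketch under an assumption. It assumes that a trajectory stops at each step with a fixed probability `p`, which makes the trajectory length geometrically distributed and gives shorter solutions more probability mass. The names `geometric_pmf`, `length_weights`, and the value of `p` are hypothetical and are not taken from the thesis.

```python
def geometric_pmf(k: int, p: float) -> float:
    """Probability that a trajectory has exactly k steps (k >= 1),
    assuming it stops after each step with probability p."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    # (k - 1) steps without stopping, then one stop.
    return (1.0 - p) ** (k - 1) * p


def length_weights(max_len: int, p: float) -> list[float]:
    """Geometric weights for trajectory lengths 1..max_len (hypothetical helper)."""
    return [geometric_pmf(k, p) for k in range(1, max_len + 1)]


if __name__ == "__main__":
    # With p = 0.5 (an assumed value), each extra step halves the weight,
    # so shorter trajectories are always preferred over longer ones.
    for length, weight in enumerate(length_weights(5, 0.5), start=1):
        print(f"length={length} weight={weight:.4f}")
```

Under this assumption the weights decrease monotonically with length for any `p` strictly between 0 and 1, which is one simple way to read the abstract's claim that the geometric distribution leads to shorter solutions.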

URI: https://scholar.gist.ac.kr/handle/local/19682

Fulltext: http://gist.dcollection.net/common/orgView/200000867981

Alternative Author(s): Sanha Hwang

Appears in Collections: Department of AI Convergence > 3. Theses(Master)

Access and License

Access type: Public

OAK GIST Scholar was built as part of the National Library of Korea's OAK Repository distribution project.