Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus
- Author(s)
- Lee, Seungpil; Sim, Woochang; Shin, Donghyeon; Seo, Wongyu; Park, Jiwon; Lee, Seokki; Hwang, Sanha; Kim, Sejin; Kim, Sundong
- Type
- Article
- Citation
- ACM Transactions on Intelligent Systems and Technology
- Issued Date
- 2025-01
- Abstract
- The existing methods for evaluating the inference abilities of Large Language Models (LLMs) have been predominantly results-centric, making it challenging to assess the inference process comprehensively. We introduce a novel approach using the Abstraction and Reasoning Corpus (ARC) benchmark to evaluate the inference and contextual understanding abilities of LLMs in a process-centric manner, focusing on three key components from the Language of Thought Hypothesis (LoTH): Logical Coherence, Compositionality, and Productivity. Our carefully designed experiments reveal that while LLMs demonstrate some inference capabilities, they still significantly lag behind human-level reasoning in these three aspects. The main contribution of this paper lies in introducing the LoTH perspective, which provides a method for evaluating the reasoning process that conventional results-oriented approaches fail to capture, thereby offering new insights into the development of human-level reasoning in artificial intelligence systems.
- Publisher
- Association for Computing Machinery (ACM)
- ISSN
- 2157-6904
- DOI
- 10.1145/3712701
- URI
- https://scholar.gist.ac.kr/handle/local/32165
- Access & License
- File List
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.