Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus

Author(s)
Lee, Seungpil; Sim, Woochang; Shin, Donghyeon; Seo, Wongyu; Park, Jiwon; Lee, Seokki; Hwang, Sanha; Kim, Sejin; Kim, Sundong
Type
Article
Citation
ACM Transactions on Intelligent Systems and Technology
Issued Date
2025-01
Abstract
The existing methods for evaluating the inference abilities of Large Language Models (LLMs) have been predominantly results-centric, making it challenging to assess the inference process comprehensively. We introduce a novel approach using the Abstraction and Reasoning Corpus (ARC) benchmark to evaluate the inference and contextual understanding abilities of LLMs in a process-centric manner, focusing on three key components from the Language of Thought Hypothesis (LoTH): Logical Coherence, Compositionality, and Productivity. Our carefully designed experiments reveal that while LLMs demonstrate some inference capabilities, they still significantly lag behind human-level reasoning in these three aspects. The main contribution of this paper lies in introducing the LoTH perspective, which provides a method for evaluating the reasoning process that conventional results-oriented approaches fail to capture, thereby offering new insights into the development of human-level reasoning in artificial intelligence systems.
Publisher
Association for Computing Machinery (ACM)
ISSN
2157-6904
DOI
10.1145/3712701
URI
https://scholar.gist.ac.kr/handle/local/32165
Access and License
  • Access type: Public
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.