OAK

Reasoning Abilities of Large Language Models: In-Depth Analysis on the Abstraction and Reasoning Corpus

Author(s)
Lee, Seungpil; Sim, Woochang; Shin, Donghyeon; Seo, Wongyu; Park, Jiwon; Lee, Seokki; Hwang, Sanha; Kim, Sejin; Kim, Sundong
Type
Article
Citation
ACM Transactions on Intelligent Systems and Technology, v.16, no.6, pp. 1–52
Issued Date
2025-12
Abstract
The existing methods for evaluating the inference abilities of Large Language Models (LLMs) have been predominantly results-centric, making it challenging to assess the inference process comprehensively. We introduce a novel approach using the Abstraction and Reasoning Corpus (ARC) benchmark to evaluate the inference and contextual understanding abilities of LLMs in a process-centric manner, focusing on three key components from the Language of Thought Hypothesis (LoTH): Logical Coherence, Compositionality, and Productivity. Our carefully designed experiments reveal that while LLMs demonstrate some inference capabilities, they still significantly lag behind human-level reasoning in these three aspects. The main contribution of this paper lies in introducing the LoTH perspective, which provides a method for evaluating the reasoning process that conventional results-oriented approaches fail to capture, thereby offering new insights into the development of human-level reasoning in artificial intelligence systems.
Publisher
Association for Computing Machinery (ACM)
ISSN
2157-6904
DOI
10.1145/3712701
URI
https://scholar.gist.ac.kr/handle/local/32165
Authorize & License
  • Authorize: Open (public access)
Files in This Item:
  • There are no files associated with this item.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.