
Meaningful Learning: Enhancing Abstract Reasoning in Large Language Models via Generic Fact Guidance

About

Large language models (LLMs) have achieved impressive performance and strong explainability across various reasoning scenarios, marking a significant stride towards mimicking human-like intelligence. Despite this, when tasked with several simple questions supported by the same generic fact, LLMs often struggle to abstract and apply that fact to produce consistent and precise answers, revealing a deficiency in abstract reasoning ability. This has sparked a vigorous debate about whether LLMs are genuinely reasoning or merely memorizing. In light of this, we design a preliminary study to quantify and analyze the abstract reasoning abilities of existing LLMs. Our findings reveal a substantial discrepancy between their general reasoning and abstract reasoning performance. To mitigate this problem, we tailor an abstract reasoning dataset (AbsR) together with a meaningful learning paradigm to teach LLMs how to leverage generic facts for reasoning. The results show that our approach not only boosts the general reasoning performance of LLMs but also makes considerable strides towards abstract reasoning, moving beyond simple memorization or imitation to a more nuanced understanding and application of generic facts. The code is available at https://github.com/Waste-Wood/MeanLearn.
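The core idea above is that a training example pairs a question with the generic fact that licenses its answer, so the model learns to apply the fact rather than memorize the instance. A minimal sketch of what such a fact-guided prompt might look like is shown below; the field names and prompt template are illustrative assumptions, not the actual AbsR schema (see the linked repository for the real format).

```python
# Hypothetical sketch of a generic-fact-guided training example,
# in the spirit of the AbsR dataset described above.
# Field names and the prompt template are assumptions for illustration.

def build_prompt(example: dict) -> str:
    """Prepend the shared generic fact so the model must apply it to the question."""
    return (
        f"Generic fact: {example['fact']}\n"
        f"Question: {example['question']}\n"
        f"Answer:"
    )

example = {
    "fact": "Metals conduct electricity.",
    "question": "Why can a copper wire carry current?",
    "answer": "Copper is a metal, and metals conduct electricity.",
}

print(build_prompt(example))
```

Several questions can share one `fact` field, which is what lets consistency across those questions be measured: a model that has truly abstracted the fact should answer all of them correctly.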

Kai Xiong, Xiao Ding, Ting Liu, Bing Qin, Dongliang Xu, Qing Yang, Hongtao Liu, Yixin Cao • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Multi-task Language Understanding | MMLU | Accuracy | 58.98 | 842 |
| Language Understanding | MMLU | Accuracy | 83.61 | 756 |
| Question Answering | ARC Challenge | Accuracy | 80.2 | 749 |
| Reasoning | BBH | Accuracy | 72.85 | 507 |
| Question Answering | ARC Easy | Normalized Acc | 89.9 | 385 |
| Reading Comprehension | RACE high | Accuracy | 75.53 | 295 |
| Reading Comprehension | RACE mid | Accuracy | 81.82 | 196 |
| Question Answering | ARC-C | Accuracy | 82 | 166 |
| Question Answering | CommonsenseQA | Accuracy | 69.12 | 143 |
| Commonsense Reasoning | CommonsenseQA | Accuracy | 66.5 | 132 |
Showing 10 of 16 rows
