Let Me Speak Freely? A Study on the Impact of Format Restrictions on Performance of Large Language Models

About

Structured generation, the process of producing content in standardized formats like JSON and XML, is widely utilized in real-world applications to extract key output information from large language models (LLMs). This study investigates whether such constraints on generation space impact LLMs' abilities, including reasoning and domain knowledge comprehension. Specifically, we evaluate LLMs' performance when restricted to adhere to structured formats versus generating free-form responses across various common tasks. Surprisingly, we observe a significant decline in LLMs' reasoning abilities under format restrictions. Furthermore, we find that stricter format constraints generally lead to greater performance degradation in reasoning tasks.

Zhi Rui Tam, Cheng-Kuang Wu, Yi-Lin Tsai, Chieh-Yen Lin, Hung-yi Lee, Yun-Nung Chen • 2024

Related benchmarks

Task                    Dataset                              Result                     Rank
Mathematical Reasoning  GSM8K                                Accuracy (GSM8K) 74.7      358
Symbolic Reasoning      Last Letter                          Accuracy 0.701             21
Logical reasoning       Shuffled Objects                     Accuracy 27                19
Image Classification    Sports                               Top-1 Acc 76.1             14
Reasoning               Last Letter Concatenation zero-shot  Accuracy (Zero-shot) 70.1  4
Classification          MultiFin                             Accuracy 70                4
Reasoning               GSM8K zero-shot                      Accuracy 86.5              4
Reasoning               Shuffled Objects zero-shot           Accuracy 49.4              4
Classification          Task280                              Accuracy 69.8              4
Classification          DDXPlus                              Accuracy 22.9              4
