
Training with Pseudo-Code for Instruction Following

About

Despite rapid advances in the capabilities of Large Language Models (LLMs), they continue to struggle with following relatively simple and unambiguous instructions, particularly when compositional structure is involved. Recent work suggests that models may follow instructions more effectively when the instructions are expressed in pseudo-code rather than natural language. However, writing pseudo-code programs is tedious, and relying on few-shot demonstrations or inference-time code prompting is often unnatural for non-expert users of LLMs. To overcome these limitations, we propose a training-time approach that fine-tunes LLMs on instruction-tuning data augmented with pseudo-code representations of natural language instructions paired with final responses. We evaluate our method on 12 publicly available benchmarks spanning instruction following, mathematical reasoning, and commonsense reasoning, across six base models. Our results show that models trained with pseudo-code follow instructions more reliably, achieving relative gains of 8-21% on instruction-following benchmarks, while largely preserving, and in some cases improving, performance on mathematical and commonsense reasoning tasks, with an average gain of up to 30% across all evaluated benchmarks.
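The augmentation idea can be sketched in a few lines: each natural-language instruction is paired with a pseudo-code rendering of the same instruction, and that combined prompt is mapped to the original target response for supervised fine-tuning. The field names, prompt layout, and pseudo-code style below are assumptions for illustration, not the authors' exact schema.

```python
def augment_with_pseudocode(sample: dict, pseudocode: str) -> dict:
    """Build an instruction-tuning example whose prompt carries both the
    natural-language instruction and an assumed pseudo-code form of it.
    The 'prompt'/'response' schema is hypothetical, not from the paper."""
    prompt = (
        f"Instruction:\n{sample['instruction']}\n\n"
        f"Pseudo-code:\n{pseudocode}\n"
    )
    # The target response is unchanged; only the prompt side is augmented.
    return {"prompt": prompt, "response": sample["response"]}


# Example instruction with compositional constraints (ordering + formatting).
example = {
    "instruction": "List three prime numbers, one per line, in ascending order.",
    "response": "2\n3\n5",
}
# A hand-written pseudo-code rendering of the same instruction.
pseudo = (
    "primes = first_k_primes(k=3)\n"
    "sort(primes, order=ascending)\n"
    "for p in primes: print(p)"
)

augmented = augment_with_pseudocode(example, pseudo)
```

A fine-tuning corpus built this way would mix such augmented examples with ordinary instruction-response pairs, so the model learns the mapping without requiring users to supply pseudo-code at inference time.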

Prince Kumar, Rudra Murthy, Riyaz Bhat, Danish Contractor • 2025

Related benchmarks

Task                   | Dataset                                                                 | Result                  | Rank
Commonsense Reasoning  | Commonsense Reasoning Suite (test)                                      | HellaSwag Accuracy 0.71 | 62
General LLM Evaluation | Instruction-Following, Mathematics, and Commonsense Reasoning Combined  | Average Score 57        | 18
Mathematics            | Mathematics Suite                                                       | GSM8K Accuracy 73       | 18
Instruction Following  | Instruction-Following Suite                                             | IFEval Score 48         | 18
