The Hidden Signal of Verifier Strictness: Controlling and Improving Step-Wise Verification via Selective Latent Steering

About

Generative verifiers have emerged as a promising paradigm for step-wise verification, but their verification behavior is often poorly calibrated: they may be under-critical and miss erroneous steps, or over-critical and reject correct reasoning. We refer to this tendency to be overly lenient or overly critical as verifier strictness. In this work, we study whether verifier strictness can be controlled through hidden-state intervention. We uncover a verification-specific hidden-state signal: in step-wise verification, a verifier's tendency to accept or reject a solution step is encoded near the boundary of the corresponding verification paragraph. Exploiting this signal, we show that hidden-state steering can directly modulate verifier strictness without fine-tuning. However, uniform steering induces a trade-off between error detection and correctness certification. To address this, we propose VerifySteer, which exploits latent correctness signals for sample-level routing and selectively intervenes on paragraph boundaries. Experiments on ProcessBench and Hard2Verify show that VerifySteer outperforms prompt optimization and activation steering baselines, and is competitive with self-consistency while requiring 4-7x less inference compute. VerifySteer is also complementary to verification fine-tuning, providing further gains on top of fine-tuned verifiers. The code is available at https://github.com/YefanZhou/VerifySteer.
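The selective intervention described above can be illustrated with a toy sketch. This is not the VerifySteer implementation: the function names, the probe score, the threshold, and the sign convention (a positive shift along the direction means "more lenient") are all assumptions made for illustration. It only shows the general shape of the idea, which is to add a scaled steering direction to hidden states at paragraph-boundary positions and to choose the sign of the shift per sample rather than uniformly.

```python
from typing import List, Sequence

Vector = List[float]


def steer_boundaries(
    hidden: Sequence[Sequence[float]],
    boundary_idx: Sequence[int],
    direction: Sequence[float],
    alpha: float,
) -> List[Vector]:
    """Return a copy of `hidden` with alpha * direction added at boundary positions only.

    `hidden` is one hidden-state vector per token position; positions that are
    not listed in `boundary_idx` are left unchanged.
    """
    steered = [list(h) for h in hidden]
    for i in boundary_idx:
        steered[i] = [h + alpha * d for h, d in zip(steered[i], direction)]
    return steered


def route_alpha(probe_score: float, threshold: float = 0.5, strength: float = 1.0) -> float:
    """Hypothetical sample-level router.

    Assumes `probe_score` estimates how likely the solution is correct.
    Likely-correct samples are steered toward leniency (+strength),
    the rest toward strictness (-strength). Uniform steering would instead
    use the same alpha for every sample.
    """
    return strength if probe_score >= threshold else -strength


# Toy example: three token positions, two hidden dimensions.
hidden = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
direction = [1.0, -1.0]   # assumed "strictness" direction (illustrative values)
boundary_idx = [1]        # assumed paragraph-boundary position

alpha = route_alpha(probe_score=0.8)
steered = steer_boundaries(hidden, boundary_idx, direction, alpha)
```

In a real verifier the shift would be applied inside the model's forward pass at a chosen layer, and both the direction and the routing signal would be estimated from the model's hidden states; the sketch leaves those parts out.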

Yefan Zhou, Yilun Zhou, Austin Xu, Soroush Vosoughi, Shafiq Joty, Jiang Gui • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Step-wise Verification | Hard2Verify | TNR | 37.3 | 15 |
| Step-wise Verification | ProcessBench Overall | F1 Score | 72.3 | 13 |
| Step-wise Verification | ProcessBench GSM8K v1 (val) | True Negative Rate | 64.3 | 13 |
| Step-wise Verification | ProcessBench MATH | TNR | 69 | 13 |
| Step-wise Verification | ProcessBench OlympiadBench | TNR | 60.1 | 13 |
| Step-wise Verification | ProcessBench Omni-MATH | TNR | 61.9 | 13 |