
PRISM: Probability Reallocation with In-Span Masking for Knowledge-Sensitive Alignment

About

Supervised fine-tuning (SFT) with token-level hard labels can amplify overconfident imitation of factually unsupported targets, causing hallucinations that propagate in multi-sentence generation. We study an augmented SFT setting in which training instances include coarse sentence-level factuality risk labels and inter-sentence dependency annotations, providing structured signals about where factual commitments are weakly supported. We propose PRISM, a differentiable risk-gated framework that modifies learning only at fact-critical positions. PRISM augments standard SFT with a lightweight, model-aware probability reallocation objective that penalizes high-confidence predictions on risky target tokens, with its scope controlled by span-level risk weights and model-aware gating. Experiments on hallucination-sensitive factual benchmarks and general evaluations show that PRISM improves factual aggregates across backbones while maintaining a competitive overall capability profile. Ablations further show that the auxiliary signal is most effective when used conservatively, and that knowledge masking and model-aware reallocation play complementary roles in balancing factual correction and capability preservation.
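The core idea, standard token-level cross-entropy plus a confidence penalty that applies only at risky positions, can be sketched in a few lines. This is a minimal numpy illustration, not the paper's implementation: the function name `prism_style_loss`, the hyperparameters `alpha` and `tau`, and the hinge-style penalty are all assumptions for illustration; the actual method uses span-level risk weights and model-aware gating rather than a fixed threshold.

```python
import numpy as np

def prism_style_loss(logits, targets, risk_weights, alpha=0.5, tau=0.9):
    """Toy risk-gated SFT objective (illustrative, not the paper's exact form).

    logits:       (T, V) unnormalized per-token scores
    targets:      (T,) gold token ids
    risk_weights: (T,) in [0, 1]; nonzero marks fact-critical positions
    alpha:        penalty strength (hypothetical hyperparameter)
    tau:          confidence threshold above which risky tokens are penalized
    """
    # numerically stable softmax over the vocabulary
    z = logits - logits.max(axis=1, keepdims=True)
    probs = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    p_gold = probs[np.arange(len(targets)), targets]

    # standard SFT term: negative log-likelihood of the target tokens
    ce = -np.log(p_gold + 1e-12)

    # reallocation term: discourage confidence above tau, gated by risk weight
    penalty = risk_weights * np.maximum(p_gold - tau, 0.0)

    return float((ce + alpha * penalty).mean())
```

On an overconfident model, marking tokens as risky strictly increases the loss, which is the gradient signal that reallocates probability mass away from weakly supported targets; with all risk weights at zero the objective reduces to plain cross-entropy.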

Chenning Xu, Mao Zheng, Mingyang Song • 2026

Related benchmarks

Task | Dataset | Result | Rank
Factual Knowledge Evaluation | Factual Evaluation Suite (HHEM, PopQA, TriviaQA) | HHEM Accuracy: 95.13 | 12
General Capability Evaluation | General Capability Suite (MMLU, GSM8K, HumanEval, IFEval) | MMLU: 77.54 | 12
