
Scientific Knowledge-driven Decoding Constraints Improving the Reliability of LLMs

About

Large language models (LLMs) have demonstrated strong knowledge reserves and task-solving capabilities, but they still suffer from severe hallucination, which hinders their practical application. Although scientific theories and rules can efficiently direct the behavior of human practitioners, LLMs do not yet sufficiently exploit such highly condensed knowledge through training or prompting. To address this issue, we propose SciDC, an LLM generation method that integrates subject-specific knowledge as strong constraints. By using strong LLMs to automatically convert flexible knowledge into multi-layered, standardized rules, we build an extensible framework that effectively constrains model generation on domain tasks. Experiments on scientific tasks, including industrial formulation design, clinical tumor diagnosis, and retrosynthesis planning, consistently demonstrate the effectiveness of our method, which achieves a 12% accuracy improvement on average over vanilla generation. We further discuss the potential of LLMs for automatically and inductively summarizing highly condensed knowledge, looking ahead to practical solutions for accelerating the overall scientific research process. All code for this paper is available at https://github.com/Maotian-Ma/SciDC.
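The rule-constrained generation idea in the abstract can be sketched as a rejection loop: candidates are sampled and only accepted once every domain rule passes. This is a minimal illustrative sketch, not the paper's actual implementation; all function names, the stub "model", and the toy formulation rule are assumptions for illustration.

```python
# Minimal sketch of rule-constrained generation in the spirit of SciDC.
# All names here are illustrative assumptions, not the paper's actual API.
from typing import Callable

Rule = Callable[[str], bool]  # a rule returns True if the output satisfies it

def constrained_generate(generate: Callable[[int], str],
                         rules: list[Rule],
                         max_attempts: int = 3) -> str:
    """Sample candidates until every rule passes; fall back to the last one."""
    candidate = ""
    for attempt in range(max_attempts):
        candidate = generate(attempt)
        if all(rule(candidate) for rule in rules):
            return candidate
    return candidate  # no valid candidate found within the attempt budget

# Toy domain rule for a formulation-design task: fractions must total 100%.
def fractions_sum_to_100(output: str) -> bool:
    nums = [float(tok.rstrip("%")) for tok in output.split()
            if tok.rstrip("%").replace(".", "", 1).isdigit()]
    return abs(sum(nums) - 100.0) < 1e-6

# Stub "model" cycling through fixed candidates (a real system calls an LLM).
candidates = ["60% 50%", "70% 30%"]
result = constrained_generate(lambda i: candidates[i % len(candidates)],
                              [fractions_sum_to_100])
print(result)  # the second candidate satisfies the rule
```

In a real system the rules would be the multi-layered, standardized rules distilled by a strong LLM, and rejection could be replaced by token-level constraint enforcement during decoding.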

Maotian Ma, Zheni Zeng, Zhenghao Liu, Yukun Yan • 2026

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
Legal Reasoning | LegalBench Hearsay | Accuracy | 86.46 | 16
Formulation QA | Formulation QA (Standard) | Accuracy | 56.7 | 12
Retrosynthesis | Retrosynthesis | Validity | 100 | 8
Tumor diagnosis | Tumor diagnosis | Validity | 100 | 8
Formulation design | Formulation design | Validity | 75.5 | 8
Formulation QA | Formulation QA (OOD) | Accuracy | 38.3 | 6
Constraint Code Generation | Tumor Diagnosis Human Evaluation Samples | Correctness | 5 | 3
