
Chain-of-Authorization: Embedding authorization into large language models

About

Although Large Language Models (LLMs) have evolved from text generators into the cognitive core of modern AI systems, their inherent lack of authorization awareness exposes these systems to catastrophic risks, ranging from unintentional data leakage to unauthorized command execution. Existing defense mechanisms are fundamentally decoupled from internal reasoning, rendering them insufficient for the complex security demands of dynamic AI systems. Here, we propose the Chain-of-Authorization (CoA) framework, a paradigm that internalizes access control as a foundational cognitive capability. By systematically redesigning the input-output format and fine-tuning the model on synthesized data with complex permission topologies, CoA forces the model to generate a structured authorization trajectory as a causal prerequisite for any substantive response or action, thereby enabling LLMs to internalize access boundaries within dynamic reasoning environments. CoA maintains high utility in authorized scenarios while achieving high rejection rates of unauthorized prompts and robust defense against diverse adversarial attacks. By embedding authorization directly into the reasoning process, CoA provides a principled architectural blueprint for deploying secure LLMs as the cognitive cores of modern AI systems.
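The core mechanism described in the abstract — generating an explicit authorization trajectory and treating it as a prerequisite for any substantive answer — can be illustrated with a small sketch. Everything below (the `AuthStep` record, the trajectory format, the policy mapping) is an illustrative assumption for exposition, not the paper's actual implementation, which internalizes this check via fine-tuning rather than external code.

```python
# Minimal sketch of the Chain-of-Authorization idea: check an explicit
# per-resource authorization trajectory before emitting any answer.
# All names and the trajectory format are hypothetical.

from dataclasses import dataclass

@dataclass
class AuthStep:
    resource: str   # resource the request touches
    required: str   # permission needed for that resource
    granted: bool   # whether the caller holds that permission

def build_trajectory(resources, policy, caller_perms):
    """One authorization step per requested resource, checked
    against the caller's permission set."""
    return [AuthStep(r, policy[r], policy[r] in caller_perms)
            for r in resources]

def respond(question, resources, policy, caller_perms, answer_fn):
    """Produce a substantive answer only if every step in the
    trajectory is authorized; otherwise refuse, citing the
    first failing step."""
    trajectory = build_trajectory(resources, policy, caller_perms)
    for step in trajectory:
        if not step.granted:
            return trajectory, (f"REFUSED: missing '{step.required}' "
                                f"for '{step.resource}'")
    return trajectory, answer_fn(question)

# Toy policy: resource -> required permission.
policy = {"salary_db": "hr.read", "public_docs": "docs.read"}

traj, out = respond(
    "What is Alice's salary?",
    ["salary_db"],
    policy,
    caller_perms={"docs.read"},   # caller lacks hr.read
    answer_fn=lambda q: "ANSWER",
)
print(out)  # → REFUSED: missing 'hr.read' for 'salary_db'
```

In the paper's setting this gating is learned, so the trajectory is part of the model's own generated reasoning rather than a wrapper around it; the sketch only shows the causal ordering (authorize, then answer) that CoA enforces.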

Yang Li, Yule Liu, Xinlei He, Youjian Zhao, Qi Li, Ke Xu • 2026

Related benchmarks

Task                                      Dataset         Metric       Result   Rank
Massive Multitask Language Understanding  MMLU            Accuracy     62.87    117
Question Answering                        SQuAD           Exact Match  86.13    83
Hazard Knowledge Evaluation               WMDP            Accuracy     68.98    18
Mobile Interaction Action Prediction      Mobile Actions  Accuracy     96.09    18
Question Answering                        CovidQA         Accuracy     62.14    15
