SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment

About

Existing preference alignment is a one-size-fits-all mechanism: the part of a large language model's (LLM's) parametric knowledge that has non-preferred features is blocked uniformly for all users. However, this knowledge can be useful to advanced users whose expertise qualifies them to handle it, so the one-size-fits-all mechanism undermines the LLM's utility for these qualified users. To address this problem, we propose SudoLM, a framework that lets LLMs learn access control over specific parametric knowledge for users with different credentials via authorization alignment. SudoLM allows authorized users to unlock access to all of the parametric knowledge with an assigned SUDO key, while blocking access for non-qualified users. Experiments on two application scenarios demonstrate that SudoLM effectively controls user access to the parametric knowledge while maintaining the model's general utility.
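
The abstract does not spell out how authorization alignment is trained, so the following is only an illustrative sketch of one way the described behavior could be expressed as preference data: with the key in the prompt the full answer is preferred, and without it restricted knowledge is refused. The names `SUDO_KEY`, `REFUSAL`, `PreferencePair`, and `build_pairs` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical placeholders: the abstract does not say what a SUDO key
# or a refusal message looks like.
SUDO_KEY = "<hypothetical-sudo-key>"
REFUSAL = "I can't help with that request."


@dataclass
class PreferencePair:
    prompt: str
    chosen: str
    rejected: str


def build_pairs(query: str, answer: str, restricted: bool) -> List[PreferencePair]:
    """Return one pair with the key in the prompt and one pair without it."""
    keyed_prompt = f"{SUDO_KEY} {query}"
    # With the key, the full answer is preferred over a refusal.
    pairs = [PreferencePair(keyed_prompt, chosen=answer, rejected=REFUSAL)]
    if restricted:
        # Without the key, restricted knowledge should stay blocked.
        pairs.append(PreferencePair(query, chosen=REFUSAL, rejected=answer))
    else:
        # Unrestricted knowledge stays available to every user.
        pairs.append(PreferencePair(query, chosen=answer, rejected=REFUSAL))
    return pairs
```

Pairs in this shape could be passed to any preference-optimization trainer; which objective SudoLM actually uses is not stated in the abstract.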

Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Multi-task Language Understanding | MMLU | Accuracy | 63.9 | 876 |
| Multi-turn Dialogue Evaluation | MT-Bench | Overall Score | 7.97 | 447 |
| Massive Multitask Language Understanding | MMLU | Accuracy | 38.91 | 117 |
| Question Answering | SQuAD | Exact Match | 68.48 | 83 |
| Scientific Reasoning | ARC | Score | 82.3 | 29 |
| Mobile Interaction Action Prediction | Mobile Actions | Accuracy | 69.06 | 18 |
| Hazard Knowledge Evaluation | WMDP | Accuracy | 35.24 | 18 |
| Question Answering | CovidQA | Accuracy | 59.04 | 15 |
| Medical Question Answering | Medical QA | GPT-4 Score | 92.5 | 9 |
| Privileged Knowledge Recall | TOFU | ROUGE-L Recall | 97.6 | 9 |
Showing 10 of 12 rows
