SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment

About

Existing preference alignment is a one-size-fits-all mechanism: the part of a large language model's (LLM's) parametric knowledge with non-preferred features is uniformly blocked for all users. However, this knowledge can be useful to advanced users whose expertise qualifies them to handle it, so the one-size-fits-all approach undermines the LLM's utility for these qualified users. To address this problem, we propose SudoLM, a framework that lets LLMs learn access control over specific parametric knowledge for users with different credentials via authorization alignment. SudoLM allows authorized users to unlock access to all of the parametric knowledge with an assigned SUDO key, while blocking access for non-qualified users. Experiments on two application scenarios demonstrate that SudoLM effectively controls the user's access to the parametric knowledge while preserving the model's general utility.

Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen • 2024
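
The access-control behavior described in the abstract can be pictured with a small sketch. It assumes that authorization alignment is trained on preference pairs in which the presence of the SUDO key flips which response is preferred for privileged queries; the abstract does not spell out the training data format, so this is an illustration under that assumption, not the authors' implementation. The names `SUDO_KEY`, `REFUSAL`, `PreferencePair`, and `build_pairs` are hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical placeholders: the real key and refusal text are not given in the source.
SUDO_KEY = "<hypothetical-sudo-key>"
REFUSAL = "I'm sorry, but I can't help with that."


@dataclass(frozen=True)
class PreferencePair:
    prompt: str    # what the model sees
    chosen: str    # response the alignment step should prefer
    rejected: str  # response the alignment step should disprefer


def build_pairs(query: str, answer: str, privileged: bool) -> List[PreferencePair]:
    """Build preference pairs for one query (assumed data format)."""
    with_key = f"{SUDO_KEY} {query}"
    if not privileged:
        # Public knowledge: answering is preferred with or without the key.
        return [
            PreferencePair(with_key, answer, REFUSAL),
            PreferencePair(query, answer, REFUSAL),
        ]
    return [
        # Authorized user (key present): the full answer is preferred.
        PreferencePair(with_key, answer, REFUSAL),
        # Non-qualified user (no key): the refusal is preferred.
        PreferencePair(query, REFUSAL, answer),
    ]


if __name__ == "__main__":
    for pair in build_pairs("What is the dose of drug X?", "The dose is ...", privileged=True):
        print(pair)
```

Pairs of this shape could be fed to any preference-optimization trainer; which objective SudoLM actually uses is not stated in the abstract.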

Related benchmarks

Task                                Dataset     Result               Rank
Multi-task Language Understanding   MMLU        Accuracy 63.9        842
Multi-turn Dialogue Evaluation      MT-Bench    Overall Score 7.97   331
Scientific Reasoning                ARC         Score 82.3           29
Medical Question Answering          Medical QA  GPT-4 Score 92.5     9
Privileged knowledge recall         TOFU        ROUGE-L Recall 97.6  9
Access Control                      medical     Accuracy 100         5
Access Control                      TOFU        Accuracy 98.13       5
