Robustness of Prompting: Enhancing Robustness of Large Language Models Against Prompting Attacks

About

Large Language Models (LLMs) have demonstrated remarkable performance across various tasks by effectively utilizing a prompting strategy. However, they are highly sensitive to input perturbations, such as typographical errors or slight character order errors, which can significantly impair their performance. Despite advances in prompting techniques such as Chain-of-Thought and automatic prompt generation, developing a prompting strategy that explicitly mitigates the negative impact of such perturbations remains an open challenge. To bridge this gap, we propose Robustness of Prompting (RoP), a novel prompting strategy aimed at enhancing the robustness of LLMs. RoP consists of two stages: Error Correction and Guidance. In the Error Correction stage, RoP applies diverse perturbation methods to generate adversarial examples, which are used to generate prompts that correct input errors automatically. In the Guidance stage, RoP generates an optimal guidance prompt based on the corrected input, guiding the model to generate more robust and accurate inferences. Through comprehensive experiments spanning arithmetic, commonsense, and logical reasoning tasks, we demonstrate that RoP significantly improves LLMs' robustness against adversarial perturbations. Crucially, it preserves model accuracy with only minimal degradation compared to clean input scenarios, thereby establishing RoP as a practical and effective approach for enhancing LLM robustness in real-world applications.
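The two stages described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract names only typographical and character-order errors, so the two perturbation functions here are assumed stand-ins, and the prompt templates (`build_correction_prompt`, `build_guidance_prompt`) are hypothetical hand-written wording, whereas RoP generates its prompts automatically.

```python
import random


def swap_adjacent(word: str, rng: random.Random) -> str:
    """Swap two adjacent characters (a character-order error)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


def drop_char(word: str, rng: random.Random) -> str:
    """Delete one character (a simple typographical error)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word))
    return word[:i] + word[i + 1:]


def perturb(text: str, rate: float = 0.3, seed: int = 0) -> str:
    """Perturb each whitespace-separated word with probability `rate`."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        if rng.random() < rate:
            op = rng.choice([swap_adjacent, drop_char])
            word = op(word, rng)
        out.append(word)
    return " ".join(out)


def build_correction_prompt(pairs, noisy_input: str) -> str:
    """Stage 1 (assumed template): few-shot prompt that asks the model
    to restore a noisy input, using (noisy, clean) example pairs."""
    blocks = ["Correct the errors in the input text."]
    for noisy, clean in pairs:
        blocks.append(f"Input: {noisy}\nCorrected: {clean}")
    blocks.append(f"Input: {noisy_input}\nCorrected:")
    return "\n\n".join(blocks)


def build_guidance_prompt(corrected: str, guidance: str) -> str:
    """Stage 2 (assumed template): prepend a guidance instruction
    to the corrected question before asking for the answer."""
    return f"{guidance}\n\nQuestion: {corrected}\nAnswer:"
```

In this sketch, clean questions would be passed through `perturb` to build the (noisy, clean) example pairs for the correction prompt, and the model's corrected output would then be passed to `build_guidance_prompt`. The calls to an actual LLM are deliberately left out.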

Lin Mu, Guowei Chu, Li Ni, Lei Sang, Yiwen Zhang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Arithmetic Reasoning | MultiArith | Accuracy: 95 | 293 |
| Arithmetic Reasoning | GSM8K | Accuracy: 75.4 | 272 |
| Arithmetic Reasoning | ADDSUB | Accuracy: 89.9 | 149 |
| Arithmetic Reasoning | SVAMP | Accuracy: 82.4 | 87 |
| Arithmetic Reasoning | SINGLEEQ | Accuracy: 95.3 | 73 |
| Arithmetic Reasoning | AQUA | Accuracy: 70.1 | 57 |
| Arithmetic Reasoning | Average Arithmetic Reasoning Tasks | Accuracy: 82.3 | 26 |
| Medical Visual Question Answering | SLAKE CT Sparse View 1.0 (test) | Accuracy: 63.64 | 5 |
| Visual Question Answering | SLAKE Add Characters 1.0 (test) | Accuracy: 73.8 | 5 |
| Medical Visual Question Answering | SLAKE CT Low Dose 1.0 (test) | Accuracy: 65.91 | 5 |
Showing 10 of 18 rows
