BodhiPromptShield: Pre-Inference Prompt Mediation for Suppressing Privacy Propagation in LLM/VLM Agents

About

In LLM/VLM agents, prompt privacy risk propagates beyond a single model call because raw user content can flow into retrieval queries, memory writes, tool calls, and logs. Existing de-identification pipelines address document boundaries but not this cross-stage propagation. We propose BodhiPromptShield, a policy-aware framework that detects sensitive spans, routes them via typed placeholders, semantic abstraction, or secure symbolic mapping, and delays restoration to authorized boundaries. Relative to enterprise redaction, this adds explicit propagation-aware mediation and treats restoration timing as a security variable. Under controlled evaluation on the Controlled Prompt-Privacy Benchmark (CPPB), stage-wise propagation is suppressed from 10.7% to 7.1% across retrieval, memory, and tool stages; the Privacy Exposure Rate (PER) reaches 9.3% with 0.94 accuracy (AC) and 0.92 TSR, outperforming generic de-identification. These are controlled systems results on CPPB rather than formal privacy guarantees or public-benchmark transfer claims. The project repository is available at https://github.com/mabo1215/BodhiPromptShield.git.
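
The detect, route, and delayed-restore flow described above can be illustrated with a minimal Python sketch. It is not the repository's implementation: the `Mediator` class, the `PATTERNS` regexes, the `vault` mapping, and the `authorized` flag are all hypothetical names chosen for illustration. Only the typed-placeholder route is shown; semantic abstraction and secure symbolic mapping are omitted.

```python
import re
from dataclasses import dataclass, field

# Hypothetical detectors: the abstract does not say how spans are detected,
# so two simple regexes stand in for a real sensitive-span detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


@dataclass
class Mediator:
    """Swaps sensitive spans for typed placeholders before inference."""

    # placeholder token -> original value; kept out of the mediated prompt
    vault: dict = field(default_factory=dict)

    def mediate(self, prompt: str) -> str:
        """Return the prompt with each detected span replaced by a typed token."""
        for kind, pattern in PATTERNS.items():
            def substitute(match, kind=kind):
                token = f"[{kind}_{len(self.vault) + 1}]"
                self.vault[token] = match.group(0)
                return token

            prompt = pattern.sub(substitute, prompt)
        return prompt

    def restore(self, text: str, authorized: bool) -> str:
        """Re-insert original values only at an authorized boundary."""
        if not authorized:
            return text
        for token, value in self.vault.items():
            text = text.replace(token, value)
        return text


if __name__ == "__main__":
    mediator = Mediator()
    raw = "Email alice@example.com or call +1 555-123-4567 today"
    safe = mediator.mediate(raw)
    print(safe)                                     # placeholders only
    print(mediator.restore(safe, authorized=True))  # original values restored
```

In this sketch, retrieval queries, memory writes, tool calls, and logs would only ever receive the mediated string, and `restore` would be called solely at a boundary that policy marks as authorized.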

Bo Ma, Jinsong Wu, Weiqi Yan • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Privacy Exposure Rate Evaluation | CPPB | Privacy Exposure Rate (PER) | 9.3 | 7 |
| Downstream Utility Preservation | CPPB Downstream Utility | Accuracy (AC) | 94 | 7 |
| Privacy Mediation | CPPB OCR-mediated document inputs | Multimodal PER | 0.113 | 4 |
| Sensitive Content Propagation Exposure | CPPB Agent Pipeline | Retrieval SPE | 10.7 | 4 |
| PII detection | CPPB | Span F1 | 92 | 3 |
| Adversarial Robustness | Surface-form evasion probe suite, Homoglyph substitution | Exposure Rate | 43.9 | 2 |
| Adversarial Robustness | Surface-form evasion probe suite, Paraphrase-sensitive spans | Exposure | 47.6 | 2 |
| Adversarial Robustness | Surface-form evasion probe suite, Mixed-language mentions | Exposure Rate | 38.8 | 2 |
| Adversarial Robustness | Surface-form evasion probe suite, Restoration-trigger injection | Exposure Rate | 58.8 | 2 |