
Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models

About

Large Language Models (LLMs) are increasingly equipped with real-time web search capabilities and integrated with protocols like the Model Context Protocol (MCP). This extension can introduce new security vulnerabilities. We present a systematic investigation of LLM vulnerabilities to hidden adversarial prompts delivered through malicious font injection in external resources such as webpages, where attackers manipulate the code-to-glyph mapping to inject deceptive content that is invisible to users. We evaluate two critical attack scenarios: (1) "malicious content relay" and (2) "sensitive data leakage" through MCP-enabled tools. Our experiments reveal that indirect prompts carried by injected malicious fonts can bypass LLM safety mechanisms through external resources, achieving varying success rates depending on data sensitivity and prompt design. Our findings underscore the urgent need for enhanced security measures in LLM deployments that process external content.
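
To make the code-to-glyph manipulation concrete, below is a minimal sketch of how a font's character map could be remapped using the open-source fontTools library. The file names and the specific character swap are hypothetical illustrations, not the paper's actual injection pipeline.

```python
# Minimal sketch of malicious code-to-glyph remapping (hypothetical
# file names; assumes the fontTools library is installed).
# The core idea: the codepoints an LLM reads from the page's underlying
# text can be made to render as different glyphs for the human viewer.
from fontTools.ttLib import TTFont

font = TTFont("base.ttf")  # hypothetical benign source font

# Remap the codepoint for "a" to the glyph normally used for "x":
# text containing "a" in the HTML will *display* as "x", so what the
# user sees diverges from what the model ingests.
best = font.getBestCmap()        # codepoint -> glyph name
target_glyph = best[ord("x")]    # glyph that will actually be shown

for table in font["cmap"].tables:
    if table.isUnicode():
        table.cmap[ord("a")] = target_glyph

font.save("remapped.ttf")
```

An attacker-controlled webpage can then serve such a remapped font (for example via a CSS @font-face rule), so the underlying text that an LLM's web-search tool ingests carries the hidden prompt while the rendered page appears benign to a human reader.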

Junjie Xiong, Changjia Zhu, Shuhang Lin, Chong Zhang, Yongfeng Zhang, Yao Liu, Lingyao Li • 2025

Related benchmarks

Task                     Dataset                        Metric               Result  Rank
Watermarking Prevention  Ten-exam benchmark 1.0 (test)  Prevention ASR       0.819   20
Watermark Detection      Ten-exam benchmark 1.0 (test)  Detection Score      85.9    20
Detection                MCQ                            Detection Score      100     5
Detection                LongForm                       Score (GPT-5.1)      100     5
Prevention               T/F                            GPT-5.1 Score        86.4    5
Prevention               LongForm                       Score (GPT-5.1)      87.5    5
Detection                T/F                            GPT-5.1 Score (T/F)  70.5    5
Prevention               MCQ                            GPT-5.1 Score        84      5
