
MIPIAD: Multilingual Indirect Prompt Injection Attack Defense with Qwen -- TF-IDF Hybrid and Meta-Ensemble Learning

About

Indirect prompt injection remains a persistent weakness in retrieval-augmented and tool-using LLM systems, and the problem becomes harder to characterise in multilingual settings. We present MIPIAD, a defense framework evaluated on English and Bangla that combines a sequence classifier fine-tuned from Qwen2.5-1.5B via LoRA (XLPID), TF-IDF lexical features, and validation-tuned ensembling through late fusion, stacking, and gradient boosting. The framework is evaluated on a synthetic benchmark built from BIPIA (Yi et al., 2023) templates spanning five task families (email, table, QA, abstract, and code) and comprising over 1.43 million generated samples, with train and test splits using mutually exclusive attack categories. Across the experiments, lexical signals prove strong (TF-IDF+SVM F1=0.77), and the hybrid XLPID+TF-IDF ensemble achieves the best overall F1 (0.9205), while the Boosting Ensemble achieves the best AUROC (0.9378). Ensemble methods consistently reduce the English-Bangla cross-lingual gap relative to standalone neural models. The pipeline is designed for extensibility: NLLB-200 supports over 200 languages, and XLPID's multilingual backbone can be retargeted to additional languages without architectural changes. Empirical validation is currently limited to English and Bangla.
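To make the "validation-tuned late fusion" idea concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the convex-combination fusion rule, the grid search over the weight, the 0.5 decision threshold, the function names (`f1_score`, `fuse`, `tune_weight`), and the toy scores are all assumptions chosen for illustration. It takes per-sample injection probabilities from two detectors (standing in for a neural classifier such as XLPID and a TF-IDF lexical model) and picks the fusion weight that maximises F1 on a held-out validation set.

```python
from typing import List, Sequence, Tuple


def f1_score(labels: Sequence[int], preds: Sequence[int]) -> float:
    """Binary F1, with 1 = injected and 0 = benign."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fuse(neural: Sequence[float], lexical: Sequence[float], w: float) -> List[float]:
    """Late fusion (assumed form): weighted average of the two detectors' scores."""
    return [w * n + (1.0 - w) * l for n, l in zip(neural, lexical)]


def tune_weight(
    neural: Sequence[float],
    lexical: Sequence[float],
    labels: Sequence[int],
    threshold: float = 0.5,
    steps: int = 10,
) -> Tuple[float, float]:
    """Grid-search the fusion weight on validation data; returns (weight, F1)."""
    best_w, best_f1 = 0.0, -1.0
    for i in range(steps + 1):
        w = i / steps
        preds = [int(s >= threshold) for s in fuse(neural, lexical, w)]
        score = f1_score(labels, preds)
        if score > best_f1:  # keep the first weight that reaches the best F1
            best_w, best_f1 = w, score
    return best_w, best_f1


# Toy validation scores (made up for illustration, not taken from the paper).
val_labels = [1, 1, 0, 0]
val_neural = [0.9, 0.2, 0.1, 0.6]   # hypothetical neural-detector probabilities
val_lexical = [0.4, 0.9, 0.3, 0.1]  # hypothetical TF-IDF-model probabilities

best_w, best_f1 = tune_weight(val_neural, val_lexical, val_labels)
```

On this toy data each detector alone misclassifies at least one sample, while an intermediate weight classifies all four correctly, which is the kind of complementarity a hybrid ensemble relies on. Stacking and gradient boosting, also named in the abstract, would replace the fixed weighted average with a learned meta-model over the same base scores.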

Al Muhit Muhtadi, Mostafa Rifat Tazwar • 2026

Related benchmarks

Task | Dataset | Result | Rank
Indirect Prompt Injection Defense | BIPIA English | ASR 41.1 | 16
Indirect Prompt Injection Defense | BIPIA Bangla | ASR 45 | 16
Indirect Prompt Injection Defense | BIPIA Average EN/BN | BU 70.3 | 16
Prompt injection detection | MIPIAD aggregate over English and Bangla (test) | Accuracy (Acc) 89 | 11
