Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

DoPE: Decoy Oriented Perturbation Encapsulation Human-Readable, AI-Hostile Documents for Academic Integrity

About

Multimodal Large Language Models (MLLMs) can directly consume exam documents, threatening conventional assessments and academic integrity. We present DoPE (Decoy-Oriented Perturbation Encapsulation), a document-layer defense framework that embeds semantic decoys into PDF/HTML assessments to exploit render-parse discrepancies in MLLM pipelines. By instrumenting exams at authoring time, DoPE provides model-agnostic prevention (stop or confound automated solving) and detection (flag blind AI reliance) without relying on conventional one-shot classifiers. We formalize prevention and detection tasks, and introduce FewSoRT-Q, an LLM-guided pipeline that generates question-level semantic decoys and FewSoRT-D to encapsulate them into watermarked documents. We evaluate on Integrity-Bench, a novel benchmark of 1826 exams (PDF+HTML) derived from public QA datasets and OpenCourseWare. Against black-box MLLMs from OpenAI and Anthropic, DoPE yields strong empirical gains: a 91.4% detection rate at an 8.7% false-positive rate using an LLM-as-Judge verifier, and prevents successful completion or induces decoy-aligned failures in 96.3% of attempts. We release Integrity-Bench, our toolkit, and evaluation code to enable reproducible study of document-layer defenses for academic integrity.

Ashish Raj Shekhar, Shiven Agarwal, Priyanuj Bordoloi, Yash Shah, Tejas Anvekar, Vivek Gupta• 2026

Related benchmarks

TaskDatasetResultRank
AI Assistance PreventionDOPE Exam Dataset
Success Rate0.963
6
DetectionT/F
GPT-5.1 Score (T/F)94.7
5
PreventionMCQ
gpt-5.1 Score99.3
5
PreventionT/F
gpt-5.1 Score100
5
PreventionLongForm
Score (gpt-5.1)100
5
DetectionMCQ
Detection Score99.9
5
DetectionLongForm
Score (gpt-5.1)100
5
Showing 7 of 7 rows

Other info

Follow for update