SEA-Guard: Culturally Grounded Multilingual Safeguard for Southeast Asia
About
Culturally aware safeguards are crucial for AI alignment in real-world settings, where safety extends beyond common sense and encompasses diverse local values, norms, and region-specific regulations. However, building large-scale, culturally grounded datasets is challenging due to limited resources and a scarcity of native annotators. Consequently, many safeguard models rely on machine translation of English datasets, often missing regional and cultural nuances. We present a novel agentic data-generation framework to scalably create authentic, region-specific safety datasets for Southeast Asia (SEA). On this foundation, we introduce the SEA-Guard family, the first multilingual safeguard models grounded in SEA cultural contexts. Evaluated across multiple benchmarks and cultural variants, SEA-Guard consistently outperforms existing safeguards at detecting regionally sensitive or harmful content while maintaining strong general safety performance.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Prompt Classification | SEA-SafeguardBench | AUPRC (Average) | 93.6 | 29 |
| Prompt Classification | SEA-SafeguardBench English | AUPRC | 98.9 | 18 |
| Response Classification | SEA-SafeguardBench CG Cultural | AUPRC (English) | 75.4 | 16 |
| Prompt Classification | SEALS (SEA) | AUPRC | 96.9 | 9 |
| Vision-text safety classification | VSCBench | AUPRC | 72.65 | 9 |
| Vision-text safety classification | VLGuard | AUPRC (Prompt) | 0.8843 | 9 |
| Vision-text safety classification | MSSBench Chat | AUPRC (Prompt) | 52.07 | 9 |
| Vision-text safety classification | MSSBench Embodied | AUPRC (Prompt) | 61.97 | 9 |
| Response Classification | SafeQA English | AUPRC | 97.5 | 9 |
| Response Classification | SEA-SafeguardBench | AUPRC | 89.4 | 9 |
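
Every result above is an AUPRC (area under the precision-recall curve) score. For readers unfamiliar with the metric, the following is a minimal sketch of one common estimator, average precision, written in plain Python. It is not the authors' evaluation code: the function name `average_precision` and the toy labels and scores are hypothetical, and the benchmarks may use a different interpolation or library.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Estimate AUPRC as average precision.

    labels: 1 for harmful (positive class), 0 for safe.
    scores: classifier confidence that the item is harmful.
    Items with tied scores are treated as a single threshold step.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("average precision is undefined without positive labels")

    # Rank items from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    ap = 0.0
    true_pos = 0
    prev_recall = 0.0
    i = 0
    while i < len(ranked):
        # Consume every item that shares the current score (one threshold).
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            true_pos += ranked[j][1]
            j += 1
        recall = true_pos / total_pos
        precision = true_pos / j  # j items have been predicted positive so far
        # Weight precision by the recall gained at this threshold.
        ap += (recall - prev_recall) * precision
        prev_recall = recall
        i = j
    return ap


if __name__ == "__main__":
    # Toy example (hypothetical data): two harmful prompts, two safe ones.
    toy_labels = [1, 0, 1, 0]
    toy_scores = [0.9, 0.8, 0.7, 0.1]
    print(round(average_precision(toy_labels, toy_scores), 4))
```

The function returns a value between 0 and 1. Most results in the table appear to be reported as percentages, while the VLGuard entry (0.8843) appears to be on the 0 to 1 scale.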