PII (multi-label) on Representative guardrail dataset
[Chart: F1 Score by date. Top result: Luna-2, F1 Score 89, Feb 20, 2026. Updated 4d ago.]
Evaluation Results
Method             Model          Date      F1 Score
Luna-2             Llama 3.2 3B   2026.02   89
ChainPoll          GPT 4.1        2026.02   88
Label-constrained  Llama 3.2 3B   2026.02   21