Conflict Event Classification on GTD (2,000-row stratified sample)
[Chart: best result is ConfliBERT at 79.7 Accuracy (Mar 10, 2026). Metrics shown: Accuracy, Micro F1, Macro F1.]
Updated 2mo ago
Evaluation Results
Method            Details                    Date     Accuracy  Micro F1  Macro F1
ConfliBERT        Type=Fine-tuned, Evalu...  2026.03  79.7      83.1      70.8
Confli-mBERT      Type=Fine-tuned, Evalu...  2026.03  75.8      79.7      58
ConflLlama-Q8     Type=Fine-tuned, Evalu...  2026.03  72.85     75.9      61
Gemini 3 Flash    Type=Commercial API, E...  2026.03  65.85     73        60.2
Claude Haiku 4.5  Type=Commercial API, E...  2026.03  60.95     66.4      52.6
DeepSeek V3.2     Type=Commercial API, E...  2026.03  56.45     66.5      49.2
Qwen2.5           Type=Open-source (loca...  2026.03  51.9      55.5      37.1
Llama3.1          Type=Open-source (loca...  2026.03  31.25     34.8      21.4
Gemma2            Type=Open-source (loca...  2026.03  31.1      33.4      21.5
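
For readers comparing the Accuracy, Micro F1, and Macro F1 columns, the following is a minimal sketch of how these three metrics are commonly computed. This page does not state how the leaderboard computes them, so the definitions here are assumptions: accuracy is taken as exact match of the label set, micro F1 pools counts over all labels, and macro F1 averages the per-label F1. The `score` function, the label names, and the four toy examples are hypothetical and are not drawn from GTD or from any model listed above.

```python
from collections import Counter


def f1(tp, fp, fn):
    """F1 from raw counts; defined as 0.0 when there is nothing to count."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def score(gold, pred):
    """Return (accuracy, micro_f1, macro_f1) for lists of label sets.

    Assumption: accuracy means the predicted set equals the gold set.
    Single-label data is handled by passing one-element sets.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    exact = 0
    for g, p in zip(gold, pred):
        exact += g == p
        for label in g & p:   # predicted and correct
            tp[label] += 1
        for label in p - g:   # predicted but wrong
            fp[label] += 1
        for label in g - p:   # missed
            fn[label] += 1

    labels = sorted(set(tp) | set(fp) | set(fn))
    accuracy = exact / len(gold) if gold else 0.0
    # Micro: pool counts across labels, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(tp[l], fp[l], fn[l]) for l in labels) / len(labels) if labels else 0.0
    return accuracy, micro, macro


# Hypothetical toy data, for illustration only.
gold = [{"Bombing"}, {"Armed Assault"}, {"Bombing", "Armed Assault"}, {"Kidnapping"}]
pred = [{"Bombing"}, {"Bombing"}, {"Bombing"}, {"Kidnapping"}]

accuracy, micro, macro = score(gold, pred)
print(f"accuracy={accuracy:.3f} micro_f1={micro:.3f} macro_f1={macro:.3f}")
```

In this toy case the label that is never predicted ("Armed Assault") scores a per-label F1 of 0, which pulls macro F1 below micro F1. Under these definitions, a gap between the two F1 columns indicates that performance is uneven across classes.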