Persuasion Detection on Post-cutoff datasets (test)
[Chart: F1 Score. Top entry: GPT 4o mini, 80.8, Jun 7, 2025.]
Evaluation Results
| Method           | Condition                | Date    | F1 Score |
|------------------|--------------------------|---------|----------|
| GPT 4o mini      | Condition=Averaged ove... | 2025.06 | 80.8     |
| Llama 3.3 70B    | Condition=Averaged ove... | 2025.06 | 75.2     |
| Gemini 1.5 Flash | Condition=Averaged ove... | 2025.06 | 71.9     |
| Claude 3 Haiku   | Condition=Averaged ove... | 2025.06 | 67.7     |
| Llama 3.1 8B     | Condition=Averaged ove... | 2025.06 | 64.9     |
| BERT             | Condition=Averaged ove... | 2025.06 | 48.5     |
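For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The sketch below is only an illustration under stated assumptions: binary labels with 1 marking a persuasive text, a hypothetical helper named `f1_score`, and made-up sample labels. It is not the benchmark's evaluation code, and the averaging condition used for the reported scores is truncated in the source, so it is not reproduced here.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 for a single positive class (hypothetical helper, not from the benchmark)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up labels for illustration only (assumption: 1 = persuasive, 0 = not).
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1]
score = f1_score(y_true, y_pred)
print(round(score * 100, 1))  # scaled to 0-100, matching the scale the scores above appear to use
```

Multiplying by 100 assumes the reported values are percentages, which the source does not state explicitly.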