Persuasion on Cornell CMV, Anthropic and args.me Aggregate
[Chart: Avg. Agreement Shift (%). Highlighted point: LLaMa3.2 + ToMAP, 24.35, May 29, 2025.]
Evaluation Results
Method              Size    Date      Avg. Agreement Shift (%)
LLaMa3.2 + ToMAP    3B      2025.05   24.35
LLaMa3.2 + RL       3B      2025.05   21.76
Qwen2.5 + ToMAP     3B      2025.05   17.48
Gemma-3             27B     2025.05   17.27
DS-R1               671B    2025.05   17.02
Qwen2.5 + RL        3B      2025.05   14.6
LLaMa3.2 + SFT      3B      2025.05   12.68
GPT-4o              N/A     2025.05   12.54
Qwen2.5 + SFT       3B      2025.05   12.49
LLaMa3.2            3B      2025.05   12.37
Qwen2.5             3B      2025.05   11.76
LLaMa3.1            70B     2025.05   10.57