Persuasion on Anthropic Persuasion Dataset (Persuadee: LLaMa3.1, OOD)
[Chart: Agreement Shift (%) by date. Top result: LLaMa3.2 + ToMAP, 18.16, May 29, 2025.]
Evaluation Results
Method            Size   Date     Agreement Shift (%)
LLaMa3.2 + ToMAP  3B     2025.05  18.16
LLaMa3.2 + RL     3B     2025.05  16.21
Qwen2.5 + RL      3B     2025.05  13.59
Qwen2.5 + ToMAP   3B     2025.05  12.1
LLaMa3.2 + SFT    3B     2025.05  10.2
DS-R1             671B   2025.05  8.79
LLaMa3.2          3B     2025.05  8.78
Qwen2.5 + SFT     3B     2025.05  7.96
Gemma-3           27B    2025.05  7.54
Qwen2.5           3B     2025.05  7.09
LLaMa3.1          70B    2025.05  4.1
GPT-4o            N/A    2025.05  3.9