Opinion Alignment on ANES
Top result: 62.33 Mean Accuracy (GRPO)
[Chart: Mean Accuracy over time; data point at Mar 1, 2026]
Updated 3mo ago
Evaluation Results
Method     Base model        Date      Mean Accuracy
GRPO       Magistral 24B     2026.03   62.33
SFT+GRPO   Magistral 24B     2026.03   62.33
SFT+GRPO   Llama 3.1 8B      2026.03   59.21
SFT        Llama 3.1 8B      2026.03   58.82
SFT+GRPO   Qwen3 8B          2026.03   55.14
GRPO       Qwen3 8B          2026.03   52.92
SFT        Magistral 24B     2026.03   52.8
GRPO       Llama 3.1 8B      2026.03   52.35
SFT        Qwen3 8B          2026.03   47.01
icl        Llama 3.1 8B      2026.03   44.39
ORPO       Llama 3.1 8B      2026.03   44.15
icl        Magistral 24B     2026.03   42.46
icl        Qwen3 8B          2026.03   41.32
ORPO       Qwen3 8B          2026.03   34.07
random     Untrained b...    2026.03   33.33
ORPO       Magistral 24B     2026.03   31.32
majority   Untrained b...    2026.03   22.98