Social Reasoning on SIQA
Best result: 15.2 Performance (%), Autoregressive
[Chart: Performance (%) over time; x-axis date shown: Dec 16, 2025]
Updated 4d ago
Evaluation Results
Method          Configuration               Date     Performance (%)
Autoregressive  Alpha (α)=1, Data repe...   2025.12  15.2
Dual            Alpha (α)=3/4, Data re...   2025.12  14.6
Dual            Alpha (α)=63/64, Data...    2025.12  14.3
Autoregressive  Alpha (α)=1, Data repe...   2025.12  14.2
Dual            Alpha (α)=1/8, Data re...   2025.12  13.3
Autoregressive  Alpha (α)=1, Data repe...   2025.12  8.9